
Journal articles on the topic 'Dense Vision Tasks'


Consult the top 50 journal articles for your research on the topic 'Dense Vision Tasks.'


1

Yao, Chao, Shuo Jin, Meiqin Liu, and Xiaojuan Ban. "Dense Residual Transformer for Image Denoising." Electronics 11, no. 3 (2022): 418. http://dx.doi.org/10.3390/electronics11030418.

Abstract:
Image denoising is an important low-level computer vision task, which aims to reconstruct a noise-free and high-quality image from a noisy image. With the development of deep learning, convolutional neural network (CNN) has been gradually applied and achieved great success in image denoising, image compression, image enhancement, etc. Recently, Transformer has been a hot technique, which is widely used to tackle computer vision tasks. However, few Transformer-based methods have been proposed for low-level vision tasks. In this paper, we proposed an image denoising network structure based on Tr
2

Zhang, Qian, Yeqi Liu, Chuanyang Gong, Yingyi Chen, and Huihui Yu. "Applications of Deep Learning for Dense Scenes Analysis in Agriculture: A Review." Sensors 20, no. 5 (2020): 1520. http://dx.doi.org/10.3390/s20051520.

Abstract:
Deep Learning (DL) is the state-of-the-art machine learning technology, which shows superior performance in computer vision, bioinformatics, natural language processing, and other areas. Especially as a modern image processing technology, DL has been successfully applied in various tasks, such as object detection, semantic segmentation, and scene analysis. However, with the increase of dense scenes in reality, due to severe occlusions, and small size of objects, the analysis of dense scenes becomes particularly challenging. To overcome these problems, DL recently has been increasingly applied
3

Gan, Zhe, Yen-Chun Chen, Linjie Li, et al. "Playing Lottery Tickets with Vision and Language." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (2022): 652–60. http://dx.doi.org/10.1609/aaai.v36i1.19945.

Abstract:
Large-scale pre-training has recently revolutionized vision-and-language (VL) research. Models such as LXMERT and UNITER have significantly lifted the state of the art over a wide range of VL tasks. However, the large number of parameters in such models hinders their application in practice. In parallel, work on the lottery ticket hypothesis (LTH) has shown that deep neural networks contain small matching subnetworks that can achieve on par or even better performance than the dense networks when trained in isolation. In this work, we perform the first empirical study to assess whether such tra
4

Dinh, My-Tham, Deok-Jai Choi, and Guee-Sang Lee. "DenseTextPVT: Pyramid Vision Transformer with Deep Multi-Scale Feature Refinement Network for Dense Text Detection." Sensors 23, no. 13 (2023): 5889. http://dx.doi.org/10.3390/s23135889.

Abstract:
Detecting dense text in scene images is a challenging task due to the high variability, complexity, and overlapping of text areas. To adequately distinguish text instances with high density in scenes, we propose an efficient approach called DenseTextPVT. We first generated high-resolution features at different levels to enable accurate dense text detection, which is essential for dense prediction tasks. Additionally, to enhance the feature representation, we designed the Deep Multi-scale Feature Refinement Network (DMFRN), which effectively detects texts of varying sizes, shapes, and fonts, in
5

Pan, Zizheng, Bohan Zhuang, Haoyu He, Jing Liu, and Jianfei Cai. "Less Is More: Pay Less Attention in Vision Transformers." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 2 (2022): 2035–43. http://dx.doi.org/10.1609/aaai.v36i2.20099.

Abstract:
Transformers have become one of the dominant architectures in deep learning, particularly as a powerful alternative to convolutional neural networks (CNNs) in computer vision. However, Transformer training and inference in previous works can be prohibitively expensive due to the quadratic complexity of self-attention over a long sequence of representations, especially for high-resolution dense prediction tasks. To this end, we present a novel Less attention vIsion Transformer (LIT), building upon the fact that the early self-attention layers in Transformers still focus on local patterns and br
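The quadratic self-attention cost that this abstract refers to can be illustrated with a quick back-of-the-envelope calculation. The resolutions and patch size below are illustrative assumptions, not figures from the LIT paper:

```python
# Illustrative cost of global self-attention for dense prediction.
# The 224x224 / 800x800 inputs and patch size 4 are assumed for
# illustration only.

def attention_cost(height, width, patch=4):
    """Token count and pairwise-interaction count for one attention layer."""
    tokens = (height // patch) * (width // patch)
    return tokens, tokens ** 2  # quadratic in the number of tokens

t_small, c_small = attention_cost(224, 224)  # classification-sized input
t_large, c_large = attention_cost(800, 800)  # dense-prediction-sized input
print(t_small, c_small)  # 3136 9834496
print(t_large, c_large)  # 40000 1600000000
```

Going from a 224 × 224 input to an 800 × 800 one multiplies the token count by ~13 but the attention cost by ~160, which is why dense prediction is the hard case for plain Transformers.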
6

Casco, Clara, Gianluca Campana, Alba Grieco, Silvana Musetti, and Salvatore Perrone. "Hyper-vision in a patient with central and paracentral vision loss reflects cortical reorganization." Visual Neuroscience 20, no. 5 (2003): 501–10. http://dx.doi.org/10.1017/s0952523803205046.

Abstract:
SM, a 21-year-old female, presents an extensive central scotoma (30 deg) with dense absolute scotoma (visual acuity = 10/100) in the macular area (10 deg) due to Stargardt's disease. We provide behavioral evidence of cortical plastic reorganization since the patient could perform several visual tasks with her poor-vision eyes better than controls, although high spatial frequency sensitivity and visual acuity are severely impaired. Between 2.5-deg and 12-deg eccentricities, SM presented (1) normal acuity for crowded letters, provided stimulus size is above acuity thresholds for single letters;
7

Zhang, Xu, DeZhi Han, and Chin-Chen Chang. "RDMMFET: Representation of Dense Multimodality Fusion Encoder Based on Transformer." Mobile Information Systems 2021 (October 18, 2021): 1–9. http://dx.doi.org/10.1155/2021/2662064.

Abstract:
Visual question answering (VQA) is the natural language question-answering of visual images. The model of VQA needs to make corresponding answers according to specific questions based on understanding images, the most important of which is to understand the relationship between images and language. Therefore, this paper proposes a new model, Representation of Dense Multimodality Fusion Encoder Based on Transformer, for short, RDMMFET, which can learn the related knowledge between vision and language. The RDMMFET model consists of three parts: dense language encoder, image encoder, and multimod
8

Li, Bin, Haifeng Ye, Sihan Fu, Xiaojin Gong, and Zhiyu Xiang. "UnVELO: Unsupervised Vision-Enhanced LiDAR Odometry with Online Correction." Sensors 23, no. 8 (2023): 3967. http://dx.doi.org/10.3390/s23083967.

Abstract:
Due to the complementary characteristics of visual and LiDAR information, these two modalities have been fused to facilitate many vision tasks. However, current studies of learning-based odometries mainly focus on either the visual or LiDAR modality, leaving visual–LiDAR odometries (VLOs) under-explored. This work proposes a new method to implement an unsupervised VLO, which adopts a LiDAR-dominant scheme to fuse the two modalities. We, therefore, refer to it as unsupervised vision-enhanced LiDAR odometry (UnVELO). It converts 3D LiDAR points into a dense vertex map via spherical projection an
9

Liang, Junling, Heng Li, Fei Xu, et al. "A Fast Deployable Instance Elimination Segmentation Algorithm Based on Watershed Transform for Dense Cereal Grain Images." Agriculture 12, no. 9 (2022): 1486. http://dx.doi.org/10.3390/agriculture12091486.

Abstract:
Cereal grains are a vital part of the human diet. The appearance quality and size distribution of cereal grains play major roles as deciders or indicators of market acceptability, storage stability, and breeding. Computer vision is popular in completing quality assessment and size analysis tasks, in which an accurate instance segmentation is a key step to guaranteeing the smooth completion of tasks. This study proposes a fast deployable instance segmentation method based on a generative marker-based watershed segmentation algorithm, which combines two strategies (one strategy for optimizing ke
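The generative marker-based watershed pipeline this abstract describes follows a standard recipe: distance transform, marker generation, then flooding. A minimal SciPy sketch of that recipe, not the paper's optimized implementation (the 0.6 threshold for marker generation is an assumed heuristic):

```python
import numpy as np
from scipy import ndimage as ndi

def split_touching_blobs(binary):
    """Marker-based watershed sketch: separate touching foreground blobs.

    binary: 2D bool array (True = foreground).
    Returns (label image with 0 = background, number of blobs found).
    """
    # 1) Distance to background: peaks lie near blob centres.
    dist = ndi.distance_transform_edt(binary)
    # 2) Generate markers by thresholding the distance map.
    markers, n = ndi.label(dist > 0.6 * dist.max())
    markers[~binary] = n + 1          # 3) add a background seed
    # 4) Flood the inverted distance map from the markers.
    cost = np.uint8(255 * (1.0 - dist / (dist.max() + 1e-9)))
    labels = ndi.watershed_ift(cost, markers.astype(np.int16))
    labels[labels == n + 1] = 0       # relabel background as 0
    return labels, n

# Two square "grains" joined by a thin bridge get split into two instances.
img = np.zeros((24, 50), bool)
img[4:17, 4:17] = True    # left blob
img[4:17, 33:46] = True   # right blob
img[9:12, 17:33] = True   # thin bridge joining them
labels, n = split_touching_blobs(img)  # n == 2
```

Thresholding the distance map yields one marker per blob because the bridge region is shallow; the flood then assigns each pixel to its nearest marker in the cost landscape.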
10

Wang, Yaming, Minjie Wang, Wenqing Huang, Xiaoping Ye, and Mingfeng Jiang. "Deep Spatial-Temporal Neural Network for Dense Non-Rigid Structure from Motion." Mathematics 10, no. 20 (2022): 3794. http://dx.doi.org/10.3390/math10203794.

Abstract:
Dense non-rigid structure from motion (NRSfM) has long been a challenge in computer vision because of the vast number of feature points. As neural networks develop rapidly, a novel solution is emerging. However, existing methods ignore the significance of spatial–temporal data and the strong capacity of neural networks for learning. This study proposes a deep spatial–temporal NRSfM framework (DST-NRSfM) and introduces a weighted spatial constraint to further optimize the 3D reconstruction results. Layer normalization layers are applied in dense NRSfM tasks to stop gradient disappearance and ha
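The layer normalization the authors apply in dense NRSfM is the standard operation, which can be written in a few lines; this sketch shows only that operation, not the paper's architecture:

```python
import numpy as np

def layer_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Layer normalization over the last (feature) axis.

    Normalizes each sample to zero mean / unit variance, then applies a
    learnable scale (gamma) and shift (beta); here they default to identity.
    """
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta
```

Because every layer's activations are rescaled to a comparable range, gradients flowing back through deep stacks neither vanish nor explode as easily, which is the property the abstract invokes.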
11

Wei, Guoqiang, Zhizheng Zhang, Cuiling Lan, Yan Lu, and Zhibo Chen. "Active Token Mixer." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 3 (2023): 2759–67. http://dx.doi.org/10.1609/aaai.v37i3.25376.

Abstract:
The three existing dominant network families, i.e., CNNs, Transformers and MLPs, differ from each other mainly in the ways of fusing spatial contextual information, leaving designing more effective token-mixing mechanisms at the core of backbone architecture development. In this work, we propose an innovative token-mixer, dubbed Active Token Mixer (ATM), to actively incorporate contextual information from other tokens in the global scope into the given query token. This fundamental operator actively predicts where to capture useful contexts and learns how to fuse the captured contexts with the
12

Tippetts, Beau, Dah Jye Lee, Kirt Lillywhite, and James K. Archibald. "Hardware-Efficient Design of Real-Time Profile Shape Matching Stereo Vision Algorithm on FPGA." International Journal of Reconfigurable Computing 2014 (2014): 1–12. http://dx.doi.org/10.1155/2014/945926.

Abstract:
A variety of platforms, such as micro-unmanned vehicles, are limited in the amount of computational hardware they can support due to weight and power constraints. An efficient stereo vision algorithm implemented on an FPGA would be able to minimize payload and power consumption in micro-unmanned vehicles, while providing 3D information and still leaving computational resources available for other processing tasks. This work presents a hardware design of the efficient profile shape matching stereo vision algorithm. Hardware resource usage is presented for the targeted micro-UV platform, Helio-co
13

Xing, Shuli, Marely Lee, and Keun-kwang Lee. "Citrus Pests and Diseases Recognition Model Using Weakly Dense Connected Convolution Network." Sensors 19, no. 14 (2019): 3195. http://dx.doi.org/10.3390/s19143195.

Abstract:
Pests and diseases can cause severe damage to citrus fruits. Farmers used to rely on experienced experts to recognize them, which is a time consuming and costly process. With the popularity of image sensors and the development of computer vision technology, using convolutional neural network (CNN) models to identify pests and diseases has become a recent trend in the field of agriculture. However, many researchers refer to pre-trained models of ImageNet to execute different recognition tasks without considering their own dataset scale, resulting in a waste of computational resources. In this p
14

Hackel, T., N. Savinov, L. Ladicky, J. D. Wegner, K. Schindler, and M. Pollefeys. "SEMANTIC3D.NET: A NEW LARGE-SCALE POINT CLOUD CLASSIFICATION BENCHMARK." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences IV-1/W1 (May 30, 2017): 91–98. http://dx.doi.org/10.5194/isprs-annals-iv-1-w1-91-2017.

Abstract:
This paper presents a new 3D point cloud classification benchmark data set with over four billion manually labelled points, meant as input for data-hungry (deep) learning methods. We also discuss first submissions to the benchmark that use deep convolutional neural networks (CNNs) as a work horse, which already show remarkable performance improvements over state-of-the-art. CNNs have become the de-facto standard for many tasks in computer vision and machine learning like semantic segmentation or object detection in images, but have no yet led to a true breakthrough for 3D point cloud labelling
15

Cai, Pingping, Zhenyao Wu, Xinyi Wu, and Song Wang. "Parametric Surface Constrained Upsampler Network for Point Cloud." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 1 (2023): 250–58. http://dx.doi.org/10.1609/aaai.v37i1.25097.

Abstract:
Designing a point cloud upsampler, which aims to generate a clean and dense point cloud given a sparse point representation, is a fundamental and challenging problem in computer vision. A line of attempts achieves this goal by establishing a point-to-point mapping function via deep neural networks. However, these approaches are prone to produce outlier points due to the lack of explicit surface-level constraints. To solve this problem, we introduce a novel surface regularizer into the upsampler network by forcing the neural network to learn the underlying parametric surface represented by bicu
16

S R, Sreela, and Sumam Mary Idicula. "Dense Model for Automatic Image Description Generation with Game Theoretic Optimization." Information 10, no. 11 (2019): 354. http://dx.doi.org/10.3390/info10110354.

Abstract:
Due to the rapid growth of deep learning technologies, automatic image description generation is an interesting problem in computer vision and natural language generation. It helps to improve access to photo collections on social media and gives guidance for visually impaired people. Currently, deep neural networks play a vital role in computer vision and natural language processing tasks. The main objective of the work is to generate the grammatically correct description of the image using the semantics of the trained captions. An encoder-decoder framework using the deep neural system is used
17

Cho, Hoonhee, and Kuk-Jin Yoon. "Event-Image Fusion Stereo Using Cross-Modality Feature Propagation." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (2022): 454–62. http://dx.doi.org/10.1609/aaai.v36i1.19923.

Abstract:
Event cameras asynchronously output the polarity values of pixel-level log intensity alterations. They are robust against motion blur and can be adopted in challenging light conditions. Owing to these advantages, event cameras have been employed in various vision tasks such as depth estimation, visual odometry, and object detection. In particular, event cameras are effective in stereo depth estimation to find correspondence points between two cameras under challenging illumination conditions and/or fast motion. However, because event cameras provide spatially sparse event stream data, it is dif
18

Zhou, Wei, Ziheng Qian, Xinyuan Ni, Yujun Tang, Hanming Guo, and Songlin Zhuang. "Dense Convolutional Neural Network for Identification of Raman Spectra." Sensors 23, no. 17 (2023): 7433. http://dx.doi.org/10.3390/s23177433.

Abstract:
The rapid development of cloud computing and deep learning makes the intelligent modes of applications widespread in various fields. The identification of Raman spectra can be realized in the cloud, due to its powerful computing, abundant spectral databases and advanced algorithms. Thus, it can reduce the dependence on the performance of the terminal instruments. However, the complexity of the detection environment can cause great interferences, which might significantly decrease the identification accuracies of algorithms. In this paper, a deep learning algorithm based on the Dense network ha
19

Karadeniz, Ahmet Serdar, Mehmet Fatih Karadeniz, Gerhard Wilhelm Weber, and Ismail Husein. "IMPROVING CNN FEATURES FOR FACIAL EXPRESSION RECOGNITION." ZERO: Jurnal Sains, Matematika dan Terapan 3, no. 1 (2019): 1. http://dx.doi.org/10.30829/zero.v3i1.5881.

Abstract:
Facial expression recognition is one of the challenging tasks in computer vision. In this paper, we analyzed and improved the performances of both handcrafted features and deep features extracted by Convolutional Neural Network (CNN). Eigenfaces, HOG, Dense-SIFT were used as handcrafted features. Additionally, we developed features based on the distances between facial landmarks and SIFT descriptors around the centroids of the facial landmarks, lead
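The landmark-distance features this abstract describes are simple to construct: the pairwise Euclidean distances between detected facial landmarks form a fixed-length feature vector. A minimal sketch (the landmark detector itself is assumed to exist upstream):

```python
import numpy as np

def landmark_distance_features(landmarks):
    """Pairwise distances between 2D facial landmarks as a feature vector.

    landmarks: (N, 2) array of (x, y) points.
    Returns a vector of the N*(N-1)/2 upper-triangle distances.
    """
    n = len(landmarks)
    i, j = np.triu_indices(n, k=1)          # all unordered point pairs
    return np.linalg.norm(landmarks[i] - landmarks[j], axis=1)
```

For the usual 68-point landmark set this yields a 2278-dimensional descriptor; in practice the distances would be normalized (e.g. by inter-ocular distance) before classification, a step omitted here.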
20

Ley, Pia, Davide Bottari, Bhamy Hariprasad Shenoy, Ramesh Kekunnaya, and Brigitte Roeder. "Restricted recovery of external remapping of tactile stimuli after restoring vision in a congenitally blind man." Seeing and Perceiving 25 (2012): 190. http://dx.doi.org/10.1163/187847612x648198.

Abstract:
People with surgically removed congenital dense bilateral cataracts offer a natural model of visual deprivation and reafferentation in humans to investigate sensitive periods of multisensory development, for example regarding the recruitment of external or anatomical frames of reference for spatial representation. Here we present a single case (HS; male; 33 years; right-handed), born with congenital dense bilateral cataracts. His lenses were removed at the age of two years, but he received optical aids only at age six. At time of testing, his visual acuity was 30% in the best eye. We performed
21

Park, Soya, Jonathan Bragg, Michael Chang, Kevin Larson, and Danielle Bragg. "Exploring Team-Sourced Hyperlinks to Address Navigation Challenges for Low-Vision Readers of Scientific Papers." Proceedings of the ACM on Human-Computer Interaction 6, CSCW2 (2022): 1–23. http://dx.doi.org/10.1145/3555629.

Abstract:
Reading academic papers is a fundamental part of higher education and research, but navigating these information-dense texts can be challenging. In particular, low-vision readers using magnification encounter additional barriers to quickly skimming and visually locating information. In this work, we explored the design of interfaces to enable readers to: 1) navigate papers more easily, and 2) input the required navigation hooks that AI cannot currently automate. To explore this design space, we ran two exploratory studies. The first focused on current practices of low-vision paper readers, the
22

Wei, Shuangfeng, Shangxing Wang, Hao Li, Guangzu Liu, Tong Yang, and Changchang Liu. "A Semantic Information-Based Optimized vSLAM in Indoor Dynamic Environments." Applied Sciences 13, no. 15 (2023): 8790. http://dx.doi.org/10.3390/app13158790.

Abstract:
In unknown environments, mobile robots can use visual-based Simultaneous Localization and Mapping (vSLAM) to complete positioning tasks while building sparse feature maps and dense maps. However, the traditional vSLAM works in the hypothetical environment of static scenes and rarely considers the dynamic objects existing in the actual scenes. In addition, it is difficult for the robot to perform high-level semantic tasks due to its inability to obtain semantic information from sparse feature maps and dense maps. In order to improve the ability of environment perception and accuracy of mapping
23

Zhao, Yinuo, Kun Wu, Zhiyuan Xu, et al. "CADRE: A Cascade Deep Reinforcement Learning Framework for Vision-Based Autonomous Urban Driving." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (2022): 3481–89. http://dx.doi.org/10.1609/aaai.v36i3.20259.

Abstract:
Vision-based autonomous urban driving in dense traffic is quite challenging due to the complicated urban environment and the dynamics of the driving behaviors. Widely-applied methods either heavily rely on hand-crafted rules or learn from limited human experience, which makes them hard to generalize to rare but critical scenarios. In this paper, we present a novel CAscade Deep REinforcement learning framework, CADRE, to achieve model-free vision-based autonomous urban driving. In CADRE, to derive representative latent features from raw observations, we first offline train a Co-attention Percep
24

Huang, X., R. Qin, and M. Chen. "DISPARITY REFINEMENT OF BUILDING EDGES USING ROBUSTLY MATCHED STRAIGHT LINES FOR STEREO MATCHING." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences IV-1 (September 26, 2018): 77–84. http://dx.doi.org/10.5194/isprs-annals-iv-1-77-2018.

Abstract:
Stereo dense matching has already been one of the dominant tools in 3D reconstruction of urban regions, due to its low cost and high flexibility in generating 3D points. However, the image-derived 3D points are often inaccurate around building edges, which limit its use in several vision tasks (e.g. building modelling). To generate 3D point clouds or digital surface models (DSM) with sharp boundaries, this paper integrates robustly matched lines for improving dense matching, and proposes a non-local disparity refinement of building edges through
25

Deschaud, Jean-Emmanuel, David Duque, Jean Pierre Richa, Santiago Velasco-Forero, Beatriz Marcotegui, and François Goulette. "Paris-CARLA-3D: A Real and Synthetic Outdoor Point Cloud Dataset for Challenging Tasks in 3D Mapping." Remote Sensing 13, no. 22 (2021): 4713. http://dx.doi.org/10.3390/rs13224713.

Abstract:
Paris-CARLA-3D is a dataset of several dense colored point clouds of outdoor environments built by a mobile LiDAR and camera system. The data are composed of two sets with synthetic data from the open source CARLA simulator (700 million points) and real data acquired in the city of Paris (60 million points), hence the name Paris-CARLA-3D. One of the advantages of this dataset is to have simulated the same LiDAR and camera platform in the open source CARLA simulator as the one used to produce the real data. In addition, manual annotation of the classes using the semantic tags of CARLA was perfo
26

Lu, Nai Guang, Ming Li Dong, P. Sun, and J. W. Guo. "A Point Matching Method for Stereovision Measurement." Key Engineering Materials 381-382 (June 2008): 305–8. http://dx.doi.org/10.4028/www.scientific.net/kem.381-382.305.

Abstract:
Many vision tasks such as 3D measurement, scene reconstruction, object recognition, etc., rely on feature correspondence among images. This paper presents a point matching method for 3D surface measurement. The procedure of the method is as follows: (1) rectification for stereo image pairs; (2) computation of epipolar lines; (3) sequential matching in vertical direction; (4) sequential matching in horizontal direction. The fourth step is performed to deal with the ambiguity in dense areas where points have closer vertical coordinates. In the fourth step a threshold limit of vertical coordinate
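Steps (3) and (4) of this abstract can be sketched in a few lines once steps (1)–(2) are assumed done, i.e. the images are rectified so that corresponding points share (nearly) the same vertical coordinate. This is a loose sketch of the sequential-matching idea only; the pairing rule and the `y_tol` tolerance are simplifying assumptions, not the paper's exact method:

```python
def match_points(left_pts, right_pts, y_tol=1.0):
    """Sequentially match rectified feature points by row, then column.

    left_pts, right_pts: lists of (x, y) tuples from rectified images.
    Points are ordered by vertical then horizontal coordinate, so points
    with close vertical coordinates (the ambiguous dense case) are
    disambiguated by their horizontal order.
    Returns a list of (left_index, right_index) pairs.
    """
    li = sorted(range(len(left_pts)), key=lambda k: (left_pts[k][1], left_pts[k][0]))
    ri = sorted(range(len(right_pts)), key=lambda k: (right_pts[k][1], right_pts[k][0]))
    matches = []
    for a, b in zip(li, ri):
        if abs(left_pts[a][1] - right_pts[b][1]) <= y_tol:
            matches.append((a, b))
    return matches

# Right-image points are the left-image points shifted by disparity.
left = [(10, 5), (20, 5), (15, 12)]
right = [(8, 5), (18, 5), (13, 12)]
pairs = match_points(left, right)  # [(0, 0), (1, 1), (2, 2)]
```

Sorting by (y, x) is what makes the horizontal pass resolve ambiguities among points on the same epipolar line.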
27

Li, Yongbo, Yuanyuan Ma, Wendi Cai, Zhongzhao Xie, and Tao Zhao. "Complementary Convolution Residual Networks for Semantic Segmentation in Street Scenes with Deep Gaussian CRF." Journal of Advanced Computational Intelligence and Intelligent Informatics 25, no. 1 (2021): 3–12. http://dx.doi.org/10.20965/jaciii.2021.p0003.

Abstract:
To understand surrounding scenes accurately, the semantic segmentation of images is vital in autonomous driving tasks, such as navigation, and route planning. Currently, convolutional neural networks (CNN) are widely employed in semantic segmentation to perform precise prediction in the dense pixel level. A recent trend in network design is the stacking of small convolution kernels. In this work, small convolution kernels (3 × 3) are decomposed into complementary convolution kernels (1 × 3 + 3 × 1, 3 × 1 + 1 × 3), the complementary small convolution kernels perform better in the classification
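The saving behind the kernel decomposition mentioned here is easy to verify numerically: chaining a 1 × 3 and a 3 × 1 convolution realizes a rank-1 3 × 3 kernel with 6 parameters instead of 9. A NumPy sketch of that equivalence (the paper's complementary two-branch design is more elaborate than this single chain):

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2D cross-correlation (no padding, stride 1)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

row = np.array([[1.0, 2.0, 1.0]])        # 1x3 kernel: 3 parameters
col = np.array([[1.0], [0.0], [-1.0]])   # 3x1 kernel: 3 parameters

img = np.arange(36.0).reshape(6, 6)
# Chaining 1x3 then 3x1 equals a single 3x3 conv with the outer-product
# (rank-1) kernel, at 6 parameters instead of 9.
chained = conv2d(conv2d(img, row), col)
full = conv2d(img, col @ row)
assert np.allclose(chained, full)
```

The equivalence holds only for rank-1 kernels, which is why the paper pairs complementary orderings (1 × 3 + 3 × 1 and 3 × 1 + 1 × 3) rather than relying on a single separable chain.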
28

Xia, Y., P. d’Angelo, J. Tian, and P. Reinartz. "DENSE MATCHING COMPARISON BETWEEN CLASSICAL AND DEEP LEARNING BASED ALGORITHMS FOR REMOTE SENSING DATA." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B2-2020 (August 12, 2020): 521–25. http://dx.doi.org/10.5194/isprs-archives-xliii-b2-2020-521-2020.

Abstract:
Deep learning and convolutional neural networks (CNN) have obtained a great success in image processing, by means of its powerful feature extraction ability to learn specific tasks. Many deep learning based algorithms have been developed for dense image matching, which is a hot topic in the community of computer vision. These methods are tested for close-range or street-view stereo data, however, not well studied with remote sensing datasets, including aerial and satellite data. As more high-quality datasets are collected by recent airborne and spaceborne sensors, it is necessary to
29

Liu, Qi, Shibiao Xu, Jun Xiao, and Ying Wang. "Sharp Feature-Preserving 3D Mesh Reconstruction from Point Clouds Based on Primitive Detection." Remote Sensing 15, no. 12 (2023): 3155. http://dx.doi.org/10.3390/rs15123155.

Abstract:
High-fidelity mesh reconstruction from point clouds has long been a fundamental research topic in computer vision and computer graphics. Traditional methods require dense triangle meshes to achieve high fidelity, but excessively dense triangles may lead to unnecessary storage and computational burdens, while also struggling to capture clear, sharp, and continuous edges. This paper argues that the key to high-fidelity reconstruction lies in preserving sharp features. Therefore, we introduce a novel sharp-feature-preserving reconstruction framework based on primitive detection. It includes an im
30

Bello, R. W., E. S. Ikeremo, F. N. Otobo, D. A. Olubummo, and O. C. Enuma. "Cattle Segmentation and Contour Detection Based on Solo for Precision Livestock Husbandry." Journal of Applied Sciences and Environmental Management 26, no. 10 (2022): 1713–20. http://dx.doi.org/10.4314/jasem.v26i10.15.

Abstract:
Segmenting objects such as herd of cattle in natural and cluttered images is among the herculean dense prediction tasks of computer vision application to agriculture. To achieve the segmentation goal, we based the segmentation on the model of single objects by locations (SOLO) which is capable of exploiting the contextual cues and segmenting individual cattle by their locations and sizes. For its simple approach to instance segmentation with the use of instance categories, SOLO outperforms Mask R-CNN which uses detect-then-segment approach to predict a mask for each instance of cattle. The mod
31

Hu, Shiyong, Jia Yan, and Dexiang Deng. "Contextual Information Aided Generative Adversarial Network for Low-Light Image Enhancement." Electronics 11, no. 1 (2021): 32. http://dx.doi.org/10.3390/electronics11010032.

Abstract:
Low-light image enhancement has been gradually becoming a hot research topic in recent years due to its wide usage as an important pre-processing step in computer vision tasks. Although numerous methods have achieved promising results, some of them still generate results with detail loss and local distortion. In this paper, we propose an improved generative adversarial network based on contextual information. Specifically, residual dense blocks are adopted in the generator to promote hierarchical feature interaction across multiple layers and enhance features at multiple depths in the network.
32

Wieczorek, Grzegorz, Sheikh Badar ud din Tahir, Israr Akhter, and Jaroslaw Kurek. "Vehicle Detection and Recognition Approach in Multi-Scale Traffic Monitoring System via Graph-Based Data Optimization." Sensors 23, no. 3 (2023): 1731. http://dx.doi.org/10.3390/s23031731.

Abstract:
Over the past few years, significant investments in smart traffic monitoring systems have been made. The most important step in machine learning is detecting and recognizing objects relative to vehicles. Due to variations in vision and different lighting conditions, the recognition and tracking of vehicles under varying extreme conditions has become one of the most challenging tasks. To deal with this, our proposed system presents an adaptive method for robustly recognizing several existing automobiles in dense traffic settings. Additionally, this research presents a broad framework for effect
33

Pérez, Javier, Mitch Bryson, Stefan B. Williams, and Pedro J. Sanz. "Recovering Depth from Still Images for Underwater Dehazing Using Deep Learning." Sensors 20, no. 16 (2020): 4580. http://dx.doi.org/10.3390/s20164580.

Abstract:
Estimating depth from a single image is a challenging problem, but it is also interesting due to the large amount of applications, such as underwater image dehazing. In this paper, a new perspective is provided; by taking advantage of the underwater haze that may provide a strong cue to the depth of the scene, a neural network can be used to estimate it. Using this approach the depthmap can be used in a dehazing method to enhance the image and recover original colors, offering a better input to image recognition algorithms and, thus, improving the robot performance during vision-based tasks su
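The haze-as-depth-cue idea this abstract exploits rests on the standard scattering model I = J·t + A·(1 − t) with transmission t = e^(−βd). The paper learns this relation with a neural network, but the underlying inversion can be sketched directly; the attenuation coefficient β is treated here as an assumed constant:

```python
import numpy as np

def depth_from_transmission(transmission, beta=1.0):
    """Relative depth from the haze image-formation model.

    With t = exp(-beta * d), depth follows as d = -ln(t) / beta: the
    weaker the transmission (the hazier the pixel), the farther the scene.
    beta is an assumed constant; in water it also varies per color channel.
    """
    t = np.clip(transmission, 1e-6, 1.0)  # guard against log(0)
    return -np.log(t) / beta
```

In practice the transmission map itself must first be estimated (e.g. from a dark-channel-style prior or, as in this paper, a trained network); this sketch covers only the model inversion.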
34

Zhang, Dongdong, Chunping Wang, and Qiang Fu. "CAFC-Net: A Critical and Align Feature Constructing Network for Oriented Ship Detection in Aerial Images." Computational Intelligence and Neuroscience 2022 (February 24, 2022): 1–11. http://dx.doi.org/10.1155/2022/3391391.

Abstract:
Ship detection is one of the fundamental tasks in computer vision. In recent years, the methods based on convolutional neural networks have made great progress. However, improvement of ship detection in aerial images is limited by large-scale variation, aspect ratio, and dense distribution. In this paper, a Critical and Align Feature Constructing Network (CAFC-Net) which is an end-to-end single-stage rotation detector is proposed to improve ship detection accuracy. The framework is formed by three modules: a Biased Attention Module (BAM), a Feature Alignment Module (FAM), and a Distinctive Det
35

Bhanushali, Darshan, Robert Relyea, Karan Manghi, et al. "LiDAR-Camera Fusion for 3D Object Detection." Electronic Imaging 2020, no. 16 (2020): 257–1. http://dx.doi.org/10.2352/issn.2470-1173.2020.16.avm-255.

Abstract:
The performance of autonomous agents in both commercial and consumer applications increases along with their situational awareness. Tasks such as obstacle avoidance, agent to agent interaction, and path planning are directly dependent upon their ability to convert sensor readings into scene understanding. Central to this is the ability to detect and recognize objects. Many object detection methodologies operate on a single modality such as vision or LiDAR. Camera-based object detection models benefit from an abundance of feature-rich information for classifying different types of objects. LiDA
36

Naik, Prof Shruti P., Vishal Lohbande, Shreyas Hambir, Rohit Korade, and Rahul Hatkar. "Fire Detection with Image Processing." International Journal for Research in Applied Science and Engineering Technology 11, no. 5 (2023): 2123–28. http://dx.doi.org/10.22214/ijraset.2023.52073.

Abstract:
Convolutional neural networks (CNNs) have yielded state-of-the-art performance in image classification and other computer vision tasks. Their application in fire detection systems will substantially improve detection accuracy, which will eventually minimize fire disasters and reduce the ecological and social ramifications. However, the major concern with CNN-based fire detection systems is their implementation in real-world surveillance networks, due to their high memory and computational requirements for inference. In this paper, we propose an original, energy-friendly, and computation
37

Fennimore, Steven A., David C. Slaughter, Mark C. Siemens, Ramon G. Leon, and Mazin N. Saber. "Technology for Automation of Weed Control in Specialty Crops." Weed Technology 30, no. 4 (2016): 823–37. http://dx.doi.org/10.1614/wt-d-16-00070.1.

Abstract:
Specialty crops, like flowers, herbs, and vegetables, generally do not have an adequate spectrum of herbicide chemistries to control weeds and have been dependent on hand weeding to achieve commercially acceptable weed control. However, labor shortages have led to higher costs for hand weeding. There is a need to develop labor-saving technologies for weed control in specialty crops if production costs are to be contained. Machine vision technology, together with data processors, have been developed to enable commercial machines to recognize crop row patterns and control automated devices that
38

Naik, Prof Shruti P., Shreyas Hambir, Vishal Lohbande, Rohit Korade, and Rahul Hatkar. "Fire Detection with Image Processing." International Journal for Research in Applied Science and Engineering Technology 11, no. 2 (2023): 321–24. http://dx.doi.org/10.22214/ijraset.2023.49014.

Abstract:
Convolutional neural networks (CNNs) have yielded state-of-the-art performance in image classification and other computer vision tasks. Their application in fire detection systems will substantially improve detection accuracy, which will eventually minimize fire disasters and reduce the ecological and social ramifications. However, the major concern with CNN-based fire detection systems is their implementation in real-world surveillance networks, due to their high memory and computational requirements for inference. In this paper, we propose an original, energy-friendly, and computatio
39

Song, Kechen, Yiming Zhang, Yanqi Bao, Ying Zhao, and Yunhui Yan. "Self-Enhanced Mixed Attention Network for Three-Modal Images Few-Shot Semantic Segmentation." Sensors 23, no. 14 (2023): 6612. http://dx.doi.org/10.3390/s23146612.

Abstract:
As an important computer vision technique, image segmentation has been widely used in various tasks. However, in some extreme cases, insufficient illumination can severely degrade the performance of a model. Consequently, more and more fully supervised methods use multi-modal images as their input. Large densely annotated datasets are difficult to obtain, but few-shot methods can still achieve satisfactory results with only a few pixel-annotated samples. Therefore, we propose the Visible-Depth-Thermal (three-modal) images few-shot semantic segmentation method. It utilizes the homogeneous i
40

Ververas, Evangelos, and Stefanos Zafeiriou. "SliderGAN: Synthesizing Expressive Face Images by Sliding 3D Blendshape Parameters." International Journal of Computer Vision 128, no. 10-11 (2020): 2629–50. http://dx.doi.org/10.1007/s11263-020-01338-7.

Abstract:
Image-to-image (i2i) translation is the dense regression problem of learning how to transform an input image into an output using aligned image pairs. Remarkable progress has been made in i2i translation with the advent of deep convolutional neural networks, in particular using the learning paradigm of generative adversarial networks (GANs). In the absence of paired images, i2i translation is tackled with one or multiple domain transformations (i.e., CycleGAN, StarGAN, etc.). In this paper, we study the problem of image-to-image translation under a set of continuous parameters that co
41

Huang, S., F. Nex, Y. Lin, and M. Y. Yang. "SEMANTIC SEGMENTATION OF BUILDING IN AIRBORNE IMAGES." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W13 (June 4, 2019): 35–42. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w13-35-2019.

Abstract:
Buildings are a key component in reconstructing LoD3 city models. Compared to a terrestrial view, airborne datasets have more occlusions at street level but can cover larger areas of a city. With the popularity of deep learning, many tasks in the field of computer vision can be solved in an easier and more efficient way. In this paper, we propose a method to apply deep neural networks to building façade segmentation. In particular, the FC-DenseNet and the DeepLabV3+ algorithms are used to segment the building from airborne images and g
42

Xiao, Feng, Haibin Wang, Yueqin Xu, and Ruiqing Zhang. "Fruit Detection and Recognition Based on Deep Learning for Automatic Harvesting: An Overview and Review." Agronomy 13, no. 6 (2023): 1625. http://dx.doi.org/10.3390/agronomy13061625.

Abstract:
Continuing progress in machine learning (ML) has led to significant advancements in agricultural tasks. Due to its strong ability to extract high-dimensional features from fruit images, deep learning (DL) is widely used in fruit detection and automatic harvesting. Convolutional neural networks (CNN) in particular have demonstrated the ability to attain accuracy and speed levels comparable to those of humans in some fruit detection and automatic harvesting fields. This paper presents a comprehensive overview and review of fruit detection and recognition based on DL for automatic harvesting from
43

Shchetinin, E. Yu. "ON AUTOMATIC DETECTION OF ANOMALIES IN ELECTROCARDIOGRAMMS WITH GENERATIVE MACHINE LEARNING." Vestnik komp'iuternykh i informatsionnykh tekhnologii, no. 216 (June 2022): 51–59. http://dx.doi.org/10.14489/vkit.2022.06.pp.051-059.

Abstract:
Anomaly detection is an important application of artificial intelligence across many areas of large-scale data analysis, such as computer system security, fraud detection in bank transfers, the reliability of computer vision systems, and others. Anomaly detection is also a key task in the analysis of biomedical information, since instability in systems that recognize dangerous diseases from biomedical signals and MRI or CT images can lead to erroneous diagnoses. One of the main problems in machine learning and data analysi
44

Lemenkova, Polina, and Olivier Debeir. "Recognizing the Wadi Fluvial Structure and Stream Network in the Qena Bend of the Nile River, Egypt, on Landsat 8-9 OLI Images." Information 14, no. 4 (2023): 249. http://dx.doi.org/10.3390/info14040249.

Abstract:
With methods for processing remote sensing data becoming widely available, the ability to quantify changes in spatial data and to evaluate the distribution of diverse landforms across target areas in datasets becomes increasingly important. One way to approach this problem is through satellite image processing. In this paper, we primarily focus on the methods of the unsupervised classification of the Landsat OLI/TIRS images covering the region of the Qena governorate in Upper Egypt. The Qena Bend of the Nile River presents a remarkable morphological feature in Upper Egypt, including a dense dr
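Unsupervised classification of the kind applied to these Landsat scenes boils down to clustering per-pixel spectral vectors. A minimal k-means sketch (the band values, cluster count, and iteration budget below are made up for illustration, not taken from the paper):

```python
import random

# Toy unsupervised classification: k-means over per-pixel spectral vectors.
# Each "pixel" is a tuple of reflectance-like values, one per band.

def kmeans(pixels, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(pixels, k)      # initialize from the data itself
    labels = [0] * len(pixels)
    for _ in range(iters):
        # Assign each pixel to its nearest spectral center (squared distance).
        for i, p in enumerate(pixels):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Recompute each center as the mean of its assigned pixels.
        for c in range(k):
            members = [p for i, p in enumerate(pixels) if labels[i] == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers
```

With two well-separated spectral groups, the two clusters recover the grouping regardless of which pixels seed the centers.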
45

Dua, Sakshi, Sethuraman Sambath Kumar, Yasser Albagory, et al. "Developing a Speech Recognition System for Recognizing Tonal Speech Signals Using a Convolutional Neural Network." Applied Sciences 12, no. 12 (2022): 6223. http://dx.doi.org/10.3390/app12126223.

Abstract:
Deep learning-based machine learning models have shown significant results in speech recognition and numerous vision-related tasks. The performance of the proposed speech-to-text model relies upon the hyperparameters used in this work. It is shown that convolutional neural networks (CNNs) can model raw and tonal speech signals, with performance on par with existing recognition systems. This study extends the role of the CNN-based approach to robust and uncommon (tonal) speech signals, using its own designed database for the target research. The main objective of t
46

Yan, Xu, Jiantao Gao, Jie Li, et al. "Sparse Single Sweep LiDAR Point Cloud Segmentation via Learning Contextual Shape Priors from Scene Completion." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 4 (2021): 3101–9. http://dx.doi.org/10.1609/aaai.v35i4.16419.

Abstract:
LiDAR point cloud analysis is a core task for 3D computer vision, especially for autonomous driving. However, due to the severe sparsity and noise interference in the single sweep LiDAR point cloud, the accurate semantic segmentation is non-trivial to achieve. In this paper, we propose a novel sparse LiDAR point cloud semantic segmentation framework assisted by learned contextual shape priors. In practice, an initial semantic segmentation (SS) of a single sweep point cloud can be achieved by any appealing network and then flows into the semantic scene completion (SSC) module as the input. By m
47

Maurer, M., M. Hofer, F. Fraundorfer, and H. Bischof. "AUTOMATED INSPECTION OF POWER LINE CORRIDORS TO MEASURE VEGETATION UNDERCUT USING UAV-BASED IMAGES." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences IV-2/W3 (August 18, 2017): 33–40. http://dx.doi.org/10.5194/isprs-annals-iv-2-w3-33-2017.

Abstract:
Power line corridor inspection is a time-consuming task that is mostly performed manually. As the development of UAVs has made huge progress in recent years, and photogrammetric computer vision systems have become well established, it is time to further automate inspection tasks. In this paper we present an automated processing pipeline to inspect vegetation undercuts of power line corridors. For this, the area of inspection is reconstructed, geo-referenced, semantically segmented, and inter-class distance measurements are calculated. The presented pipeline performs an automated selection of the proper
48

Chen, Xiao, Mujiahui Yuan, Chenye Fan, Xingwu Chen, Yaan Li, and Haiyan Wang. "Research on an Underwater Object Detection Network Based on Dual-Branch Feature Extraction." Electronics 12, no. 16 (2023): 3413. http://dx.doi.org/10.3390/electronics12163413.

Abstract:
Underwater object detection is challenging in computer vision research due to the complex underwater environment, poor image quality, and varying target scales, making it difficult for existing object detection networks to achieve high accuracy in underwater tasks. To address the issues of limited data and multi-scale targets in underwater detection, we propose a Dual-Branch Underwater Object Detection Network (DB-UODN) based on dual-branch feature extraction. In the feature extraction stage, we design a dual-branch structure by combining the You Only Look Once (YOLO) v7 backbone with the Enha
49

Tsourounis, Dimitrios, Dimitris Kastaniotis, Christos Theoharatos, Andreas Kazantzidis, and George Economou. "SIFT-CNN: When Convolutional Neural Networks Meet Dense SIFT Descriptors for Image and Sequence Classification." Journal of Imaging 8, no. 10 (2022): 256. http://dx.doi.org/10.3390/jimaging8100256.

Abstract:
Despite the success of hand-crafted features in computer vision for many years, nowadays these have been replaced by end-to-end learnable features extracted from deep convolutional neural networks (CNNs). Whilst CNNs can learn robust features directly from image pixels, they require large amounts of samples and extreme augmentations. On the contrary, hand-crafted features, like SIFT, exhibit several interesting properties, as they can provide local rotation invariance. In this work, a novel scheme combining the strengths of SIFT descriptors with CNNs, namely SIFT-CNN, is presented. G
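The core idea here, feeding a CNN dense hand-crafted descriptor maps instead of raw pixels, can be illustrated with a toy, hypothetical stand-in for dense SIFT: a per-pixel gradient-orientation histogram whose bins become the channels of the CNN's input image.

```python
import math

# Sketch only: a crude per-pixel orientation descriptor, NOT the paper's
# actual dense SIFT. Each pixel gets a `bins`-dimensional histogram in which
# the bin matching its gradient orientation holds the gradient magnitude.

def orientation_histogram_map(gray, bins=8):
    """gray: H x W list of floats; returns an H x W x bins descriptor map."""
    h, w = len(gray), len(gray[0])
    out = [[[0.0] * bins for _ in range(w)] for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]   # central differences
            gy = gray[y + 1][x] - gray[y - 1][x]
            mag = math.hypot(gx, gy)
            ang = math.atan2(gy, gx) % (2 * math.pi)
            out[y][x][int(ang / (2 * math.pi) * bins) % bins] = mag
    return out
```

A CNN would then treat this H x W x bins map the way it normally treats an H x W x 3 RGB image; for a vertical step edge, all the interior response lands in the bin for a purely horizontal gradient.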
50

Jiang, S., W. Yao, and M. Heurich. "DEAD WOOD DETECTION BASED ON SEMANTIC SEGMENTATION OF VHR AERIAL CIR IMAGERY USING OPTIMIZED FCN-DENSENET." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W16 (September 17, 2019): 127–33. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w16-127-2019.

Abstract:
The assessment of forests’ health conditions is an important task for biodiversity, forest management, global environment monitoring, and carbon dynamics. Several research works have been proposed to evaluate the condition of a forest based on remote sensing technology. Concerning existing technologies, employing traditional machine learning approaches to detect dead wood in aerial colour-infrared (CIR) imagery is one of the major trends, due to its spectral capability to explicitly capture vegetation health conditions. However, the com