
Journal articles on the topic 'Image representation methods'

Consult the top 50 journal articles for your research on the topic 'Image representation methods.'

1

HU, CHAO, LI LIU, BO SUN, and MAX Q. H. MENG. "COMPACT REPRESENTATION AND PANORAMIC REPRESENTATION FOR CAPSULE ENDOSCOPE IMAGES." International Journal of Information Acquisition 06, no. 04 (December 2009): 257–68. http://dx.doi.org/10.1142/s0219878909001989.

Abstract:
A capsule endoscope robot is a miniature medical instrument for inspection of the gastrointestinal tract. In this paper, we present compact image representation and preliminary panoramic representation methods for the capsule endoscope. First, the characteristics of capsule endoscopic images are investigated and different coordinate representations of the circular image are discussed. Second, effective compact representation methods, including special DPCM and wavelet compression techniques, are applied to the endoscopic images to achieve a high compression ratio and signal-to-noise ratio. Then, a preliminary approach to panoramic representation of endoscopic images is presented.
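The DPCM idea is easy to see in miniature. Below is a minimal sketch of one-dimensional predictive coding with a simple left-neighbour predictor; it illustrates the general technique, not the authors' special DPCM scheme:

```python
import numpy as np

def dpcm_encode(row: np.ndarray) -> np.ndarray:
    """Encode a 1-D pixel row as its first value plus left-neighbour residuals."""
    residuals = np.empty_like(row, dtype=np.int16)
    residuals[0] = row[0]
    residuals[1:] = row[1:].astype(np.int16) - row[:-1].astype(np.int16)
    return residuals  # small residuals entropy-code better than raw pixels

def dpcm_decode(residuals: np.ndarray) -> np.ndarray:
    """Invert the encoding by cumulative summation."""
    return np.cumsum(residuals).astype(np.uint8)

row = np.array([120, 122, 121, 125, 130], dtype=np.uint8)
assert np.array_equal(dpcm_decode(dpcm_encode(row)), row)
```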
2

Cortez, Diogo, Paulo Nunes, Manuel Menezes de Sequeira, and Fernando Pereira. "Image segmentation towards new image representation methods." Signal Processing: Image Communication 6, no. 6 (February 1995): 485–98. http://dx.doi.org/10.1016/0923-5965(94)00031-d.

3

Al-Obaide, Zahraa H., and Ayad A. Al-Ani. "COMPARISON STUDY BETWEEN IMAGE RETRIEVAL METHODS." Iraqi Journal of Information and Communication Technology 5, no. 1 (April 29, 2022): 16–30. http://dx.doi.org/10.31987/ijict.5.1.182.

Abstract:
Searching for a relevant image in an archive is a challenging research issue for the computer vision community. The majority of search engines retrieve images using traditional text-based approaches that rely on captions and metadata. Extensive research has been reported in the last two decades on content-based image retrieval (CBIR), analysis, and image classification. Content-based image retrieval is a process that provides a framework for image search, in which low-level visual features are commonly used to retrieve images from an image database. The essential requirement in any image retrieval process is to sort the images by close similarity in terms of visual appearance. Shape, color, and texture are examples of low-level image features. In image classification-based models and CBIR, high-level image visuals are expressed in the form of feature vectors made up of numerical values. Researchers have found a significant gap between human visual comprehension and image feature representation. In this paper, we present a comparison study and a comprehensive overview of recent developments in the field of CBIR and image representation.
4

Choi, Jaewoong, Daeha Kim, and Byung Cheol Song. "Style-Guided and Disentangled Representation for Robust Image-to-Image Translation." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (June 28, 2022): 463–71. http://dx.doi.org/10.1609/aaai.v36i1.19924.

Abstract:
Recently, various image-to-image translation (I2I) methods have improved mode diversity and visual quality through new network architectures or regularization terms. However, conventional I2I methods rely on a static decision boundary, and their encoded representations are entangled with each other, so they often face the ‘mode collapse’ phenomenon. To mitigate mode collapse, 1) we design a so-called style-guided discriminator that guides an input image to the target image style based on a flexible decision boundary strategy, and 2) we make the encoded representations include independent domain attributes. Based on these two ideas, this paper proposes Style-Guided and Disentangled Representation for Robust Image-to-Image Translation (SRIT). SRIT improved FID by 8%, 22.8%, and 10.1% on the CelebA-HQ, AFHQ, and Yosemite datasets, respectively. The translated images of SRIT successfully reflect the styles of the target domain, indicating better mode diversity than previous works.
5

Lu, Jiahao, Johan Öfverstedt, Joakim Lindblad, and Nataša Sladoje. "Is image-to-image translation the panacea for multimodal image registration? A comparative study." PLOS ONE 17, no. 11 (November 28, 2022): e0276196. http://dx.doi.org/10.1371/journal.pone.0276196.

Abstract:
Despite current advancement in the field of biomedical image processing, propelled by the deep learning revolution, multimodal image registration, due to its several challenges, is still often performed manually by specialists. The recent success of image-to-image (I2I) translation in computer vision applications and its growing use in biomedical areas provide a tempting possibility of transforming the multimodal registration problem into a potentially easier monomodal one. We conduct an empirical study of the applicability of modern I2I translation methods for the task of rigid registration of multimodal biomedical and medical 2D and 3D images. We compare the performance of four Generative Adversarial Network (GAN)-based I2I translation methods and one contrastive representation learning method, subsequently combined with two representative monomodal registration methods, to judge the effectiveness of modality translation for multimodal image registration. We evaluate these method combinations on four publicly available multimodal (2D and 3D) datasets and compare with the performance of registration achieved by several well-known approaches acting directly on multimodal image data. Our results suggest that, although I2I translation may be helpful when the modalities to register are clearly correlated, registration of modalities which express distinctly different properties of the sample is not well handled by the I2I translation approach. The evaluated representation learning method, which aims to find abstract image-like representations of the information shared between the modalities, manages better, and so does the Mutual Information maximisation approach, acting directly on the original multimodal images. We share our complete experimental setup as open-source (https://github.com/MIDA-group/MultiRegEval), including method implementations, evaluation code, and all datasets, for further reproducing and benchmarking.
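As background, the Mutual Information similarity that the study's strongest direct baseline maximises can be sketched from a joint histogram. This is an illustrative implementation, not the study's code (which is linked above); the bin count is an arbitrary choice, and a rigid-registration optimizer would maximise this value over transform parameters:

```python
import numpy as np

def mutual_information(a: np.ndarray, b: np.ndarray, bins: int = 32) -> float:
    """Mutual information between two same-shape images via a joint histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()                 # joint distribution
    px = pxy.sum(axis=1, keepdims=True)       # marginal of a
    py = pxy.sum(axis=0, keepdims=True)       # marginal of b
    nz = pxy > 0                              # skip empty bins to avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(7)
img = rng.random((128, 128))
print(mutual_information(img, img**2))                  # high: deterministic relation
print(mutual_information(img, rng.random((128, 128))))  # near zero: independent
```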
6

RIZO-RODRÍGUEZ, DAYRON, HEYDI MÉNDEZ-VAZQUEZ, and EDEL GARCÍA-REYES. "ILLUMINATION INVARIANT FACE RECOGNITION IN QUATERNION DOMAIN." International Journal of Pattern Recognition and Artificial Intelligence 27, no. 03 (May 2013): 1360004. http://dx.doi.org/10.1142/s0218001413600045.

Abstract:
The performance of face recognition systems tends to decrease when images are affected by illumination. Feature extraction is one of the main steps of a face recognition process, where it is possible to alleviate illumination effects on face images. In order to increase the accuracy of recognition tasks, different methods for obtaining illumination-invariant features have been developed. The aim of this work is to compare two different ways of representing face image descriptions in terms of their illumination-invariant properties for face recognition. The first representation is constructed following the structure of complex numbers, and the second is based on quaternion numbers. Both representations are constructed using four different face description approaches, transformed into the frequency domain, and expressed in polar coordinates. The most illumination-invariant component of each frequency-domain representation is determined and used as the representative information of the face image. Verification and identification experiments are then performed to compare the discriminative power of the selected components. The representative component of the quaternion representation outperformed that of the complex one.
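The complex-number construction can be roughly sketched as follows, under assumptions of my own: two face-description maps are packed as real and imaginary parts, transformed to the frequency domain, and expressed in polar form. The paper's actual descriptors and component-selection procedure are more involved:

```python
import numpy as np

def complex_frequency_representation(desc_a: np.ndarray, desc_b: np.ndarray):
    """Pack two face-description maps into one complex image and express its
    2-D spectrum in polar coordinates (magnitude, phase)."""
    z = desc_a.astype(float) + 1j * desc_b.astype(float)
    spectrum = np.fft.fft2(z)
    return np.abs(spectrum), np.angle(spectrum)  # magnitude and phase components

rng = np.random.default_rng(6)
mag, phase = complex_frequency_representation(rng.random((64, 64)),
                                              rng.random((64, 64)))
```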
7

Lu, Xuchao, Li Song, Rong Xie, Xiaokang Yang, and Wenjun Zhang. "Deep Binary Representation for Efficient Image Retrieval." Advances in Multimedia 2017 (2017): 1–10. http://dx.doi.org/10.1155/2017/8961091.

Abstract:
With the fast-growing number of images uploaded every day, efficient content-based image retrieval becomes important. Hashing methods, which represent images as binary codes and use the Hamming distance to judge similarity, are widely accepted for their advantages in storage and search speed. A good binary representation method for images is the determining factor in image retrieval. In this paper, we propose a new deep hashing method for efficient image retrieval. We propose an algorithm to calculate the target hash code, which indicates the relationship between images of different contents. The target hash code is then fed to the deep network for training. Two variants of the deep network, DBR and DBR-v3, are proposed for different sizes and scales of image databases. After training, our deep network can produce hash codes with large Hamming distances for images of different contents. Experiments on standard image retrieval benchmarks show that our method outperforms other state-of-the-art methods, including unsupervised, supervised, and deep hashing methods.
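The retrieval step such hash codes feed into is straightforward. A minimal sketch follows, with random 48-bit codes standing in for the output of the DBR network:

```python
import numpy as np

def hamming_retrieve(query_code: np.ndarray, db_codes: np.ndarray, k: int = 5):
    """Rank database binary codes (0/1 vectors) by Hamming distance to the query."""
    distances = np.count_nonzero(db_codes != query_code, axis=1)
    order = np.argsort(distances)[:k]
    return order, distances[order]

rng = np.random.default_rng(0)
db = rng.integers(0, 2, size=(1000, 48))     # 48-bit codes for 1000 images
query = db[7] ^ (rng.random(48) < 0.05)      # a slightly corrupted copy of code 7
indices, dists = hamming_retrieve(query, db) # image 7 should rank first
```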
8

Bouarara, Hadj Ahmed, and Yasmin Bouarara. "Swarm Intelligence Methods for Unsupervised Images Classification." International Journal of Organizational and Collective Intelligence 6, no. 2 (April 2016): 50–74. http://dx.doi.org/10.4018/ijoci.2016040104.

Abstract:
Google estimates that there are more than 1,000 billion images on the internet, and the classification of this type of data represents a big problem for the scientific community. Several techniques belonging to the world of image mining have been proposed. The substance of our work is the application of swarm intelligence methods to the unsupervised image classification (UIC) problem, following four steps: image digitalization, by developing a new representation approach in order to transform each image into a set of terms (sets of pixels); and image clustering using three methods: first, a distances combination by social worker bees (DC-SWBs) based on the principle of filtering, where each image must successfully pass three filters; second, the artificial social spiders (ASS) method, based on the silky structure and the principle of weaving; and third, the artificial immune system (AIS) method. For their experiments, the authors use the MuHAVi benchmark, changing the configuration (image representation, distance measures, and threshold) for each test.
9

Cohen, Ido, Eli David, and Nathan Netanyahu. "Supervised and Unsupervised End-to-End Deep Learning for Gene Ontology Classification of Neural In Situ Hybridization Images." Entropy 21, no. 3 (February 26, 2019): 221. http://dx.doi.org/10.3390/e21030221.

Abstract:
In recent years, large datasets of high-resolution mammalian neural images have become available, which has prompted active research on the analysis of gene expression data. Traditional image processing methods are typically applied for learning functional representations of genes, based on their expressions in these brain images. In this paper, we describe a novel end-to-end deep learning-based method for generating compact representations of in situ hybridization (ISH) images, which are invariant to translation. In contrast to traditional image processing methods, our method relies on deep convolutional denoising autoencoders (CDAE) for processing raw pixel inputs and generating the desired compact image representations. We provide an in-depth description of our deep learning-based approach and present extensive experimental results, demonstrating that representations extracted by CDAE can help learn features of functional gene ontology categories for their classification in a highly accurate manner. Our method improves the previous state-of-the-art classification rate (Liscovitch et al.) from an average AUC of 0.92 to 0.997, i.e., it achieves a 96% reduction in error rate. Furthermore, the representation vectors generated by our method are more compact than those of previous state-of-the-art methods, allowing for a more efficient high-level representation of images. These results are obtained with significantly downsampled images compared to the original high-resolution ones, further underscoring the robustness of our proposed method.
10

Li, Fengpeng, Jiabao Li, Wei Han, Ruyi Feng, and Lizhe Wang. "Unsupervised Representation High-Resolution Remote Sensing Image Scene Classification via Contrastive Learning Convolutional Neural Network." Photogrammetric Engineering & Remote Sensing 87, no. 8 (August 1, 2021): 577–91. http://dx.doi.org/10.14358/pers.87.8.577.

Abstract:
Inspired by the outstanding achievements of deep learning, supervised deep learning representation methods for high-spatial-resolution remote sensing image scene classification have obtained state-of-the-art performance. However, supervised deep learning representation methods need a considerable amount of labeled data to capture class-specific features, limiting the application of deep learning-based methods when only a few labeled training samples are available. An unsupervised deep learning representation method for high-resolution remote sensing image scene classification is proposed in this work to address this issue. The proposed method, based on contrastive learning, narrows the distance between positive view pairs (color channels belonging to the same image) and widens the gaps between negative view pairs (color channels from different images) to obtain class-specific representations of the input data without any supervised information. The classifier uses features extracted by the convolutional neural network (CNN)-based feature extractor, together with the label information of the training data, to set the space of each category and then, using linear regression, makes predictions in the testing procedure. Compared with existing unsupervised deep learning representation methods for high-resolution remote sensing image scene classification, the contrastive learning CNN achieves state-of-the-art performance on three benchmark data sets of different scales: the small-scale RSSCN7 data set, the midscale aerial image data set, and the large-scale NWPU-RESISC45 data set.
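Contrastive objectives of this kind are commonly instances of the InfoNCE loss. The sketch below is a generic version under my own assumptions (cosine similarities, temperature 0.1), not the paper's exact objective; rows of the two matrices are embeddings of the two colour-channel views of the same image:

```python
import numpy as np

def info_nce(anchors: np.ndarray, positives: np.ndarray, tau: float = 0.1) -> float:
    """InfoNCE-style contrastive loss: row i of the two matrices are embeddings of
    two views of the same image; all other rows serve as negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / tau                           # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))        # matching pairs on the diagonal

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8))
print(info_nce(x, x + 0.05 * rng.standard_normal((16, 8))))  # small: views agree
```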
11

Çakır, Serdar. "Mel-cepstral feature extraction methods for image representation." Optical Engineering 49, no. 9 (September 1, 2010): 097004. http://dx.doi.org/10.1117/1.3488050.

12

Zhang, Li Liang, Xi Ling Liu, and Shi Liang Zhang. "An Algorithm for Image Enhancement via Sparse Representation." Applied Mechanics and Materials 556-562 (May 2014): 4806–10. http://dx.doi.org/10.4028/www.scientific.net/amm.556-562.4806.

Abstract:
This paper presents an approach to enhancing the subjective visual quality of images, based on sparse image representation. First, we compare and analyse the performance of several current popular image denoising methods on two kinds of image content; using the K-SVD, BM3D, and CSR algorithms, we obtain clean images, i.e., images with the noise removed. Then, we decompose the denoised image into cartoon and texture components by the Morphological Component Analysis (MCA) method, super-resolve the cartoon part, and enhance the contrast of the texture in the image. Finally, fusing the cartoon and the texture yields the desired image.
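MCA proper decomposes an image over two dictionaries suited to cartoon and texture. As a crude stand-in that shows the shape of the decompose-process-recombine pipeline, total-variation smoothing can supply the cartoon part, with the residual as texture (my simplification, not the paper's MCA):

```python
import numpy as np
from skimage import data
from skimage.restoration import denoise_tv_chambolle

image = data.camera() / 255.0
cartoon = denoise_tv_chambolle(image, weight=0.1)  # piecewise-smooth component
texture = image - cartoon                          # oscillatory residual
# Each component could now be processed separately (super-resolution, contrast
# enhancement) before recombination; adding them back recovers the input exactly.
assert np.allclose(cartoon + texture, image)
```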
13

Peng, Feng, and Kai Li. "Deep Image Clustering Based on Label Similarity and Maximizing Mutual Information across Views." Applied Sciences 13, no. 1 (January 3, 2023): 674. http://dx.doi.org/10.3390/app13010674.

Abstract:
Most existing deep image clustering methods use only class-level representations for clustering. However, the class-level representation alone is not sufficient to describe the differences between images belonging to the same cluster. This may lead to high intra-class representation differences, which harm clustering performance. To address this problem, this paper proposes a clustering model named Deep Image Clustering based on Label Similarity and Maximizing Mutual Information Across Views (DCSM). DCSM consists of a backbone network and class-level and instance-level mapping blocks. The class-level mapping block learns discriminative class-level features by selecting similar (dissimilar) pairs of samples. The proposed extended mutual information maximizes the mutual information between features extracted from views obtained by applying data augmentation to the same image, and serves as a constraint on the instance-level mapping block. This forces the instance-level mapping block to capture high-level features that affect multiple views of the same image, thus reducing intra-class differences. Four representative datasets are selected for our experiments, and the results show that the proposed model is superior to current advanced image clustering models.
14

Yu, Siquan, Jiaxin Liu, Zhi Han, Yong Li, Yandong Tang, and Chengdong Wu. "Representation Learning Based on Autoencoder and Deep Adaptive Clustering for Image Clustering." Mathematical Problems in Engineering 2021 (January 9, 2021): 1–11. http://dx.doi.org/10.1155/2021/3742536.

Abstract:
Image clustering is a complex procedure, which is significantly affected by the choice of image representation. Most of the existing image clustering methods treat representation learning and clustering separately, which usually brings two problems. On the one hand, image representations are difficult to select and the learned representations may not be suitable for clustering. On the other hand, such methods inevitably involve a clustering step, which may introduce errors and hurt the clustering results. To tackle these problems, we present a new clustering method that efficiently builds an image representation and precisely discovers cluster assignments. For this purpose, the image clustering task is regarded as a binary pairwise classification problem with local structure preservation. Specifically, we propose an approach for image clustering based on a fully convolutional autoencoder and deep adaptive clustering (DAC). To extract the essential representation and maintain the local structure, a fully convolutional autoencoder is applied. To map features into the clustering space and obtain a suitable image representation, the DAC algorithm participates in the training of the autoencoder. Our method can learn an image representation that is suitable for clustering and discover the precise clustering label for each image. A series of real-world image clustering experiments verify the effectiveness of the proposed algorithm.
15

Shi, Li, Xiao Ke Niu, Zhi Zhong Wang, and Hui Ge Shi. "A Study on Image Representation Method Based on Biological Visual Mechanism." Applied Mechanics and Materials 249-250 (December 2012): 1283–88. http://dx.doi.org/10.4028/www.scientific.net/amm.249-250.1283.

Abstract:
Image representation is a key issue in many image processing tasks. Considering the problems faced by current general image representation methods, such as excessive computational cost, sensitivity to noise, and lack of self-adaptability, a novel image representation method based on biological visual mechanisms is proposed in this paper. It simulates the primary visual cortex to realize a sparse representation of the external image, and introduces a synchronization mechanism to make the representation more consistent with the visual system. Finally, the presented method was verified by applying it to compress natural images and digital-literature images, respectively. The results showed that the new representation method is better than the general sparse representation method in terms of both compression ratio and noise sensitivity.
16

Zdunek, Rafał, and Tomasz Sadowski. "Image Completion with Hybrid Interpolation in Tensor Representation." Applied Sciences 10, no. 3 (January 22, 2020): 797. http://dx.doi.org/10.3390/app10030797.

Abstract:
The issue of image completion has developed considerably over the last two decades, and many computational strategies have been proposed to fill in missing regions of an incomplete image. When the incomplete image contains many small irregular missing areas, a good alternative seems to be matrix or tensor decomposition algorithms that yield low-rank approximations. However, this approach relies on heuristic rank adaptation techniques, especially for images with many details. To tackle the obstacles of low-rank completion methods, we propose to model the incomplete images with overlapping blocks of Tucker decomposition representations, where the factor matrices are determined by a hybrid of Gaussian radial basis function and polynomial interpolation. The experiments, carried out for various image completion and resolution up-scaling problems, demonstrate that our approach considerably outperforms the baseline and state-of-the-art low-rank completion methods.
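The interpolation ingredient can be illustrated in isolation. The sketch below fills missing pixels by RBF interpolation over observed pixel coordinates; the paper's hybrid Gaussian-RBF/polynomial scheme operates on Tucker factor matrices rather than raw pixels, so this is only a simplified stand-in (here with SciPy's default thin-plate-spline kernel and an arbitrary neighbour count):

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def fill_missing(img: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Fill pixels where mask is True by RBF interpolation from observed pixels."""
    known = np.argwhere(~mask).astype(float)     # coordinates of observed pixels
    missing = np.argwhere(mask).astype(float)    # coordinates to reconstruct
    rbf = RBFInterpolator(known, img[~mask], neighbors=50)
    out = img.copy()
    out[mask] = rbf(missing)
    return out

rng = np.random.default_rng(0)
img = rng.random((32, 32))
mask = rng.random(img.shape) < 0.3               # 30% of pixels missing
restored = fill_missing(img, mask)
```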
17

Sahoo, Arabinda, and Pranati Das. "Dictionary based Image Compression via Sparse Representation." International Journal of Electrical and Computer Engineering (IJECE) 7, no. 4 (August 1, 2017): 1964. http://dx.doi.org/10.11591/ijece.v7i4.pp1964-1972.

Abstract:
Nowadays, image compression has become a necessity due to the large volume of images. For efficient use of storage space and data transmission, it is essential to compress images. In this paper, we propose a dictionary-based image compression framework via sparse representation, built on a trained over-complete dictionary. The over-complete dictionary is trained on intra-prediction residuals obtained from different images and is applied for sparse representation. In this method, the current image block is first predicted from its spatially neighboring blocks, and the prediction residuals are then encoded via sparse representation. A sparse approximation algorithm and the trained over-complete dictionary are applied for the sparse representation of prediction residuals. The detail coefficients obtained from the sparse representation are used for encoding. Experimental results show that the proposed method yields both improved coding efficiency and image quality compared to some state-of-the-art image compression methods.
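Sparse-coding a residual over a given dictionary is a standard step. A minimal sketch with orthogonal matching pursuit follows, using a random matrix as a stand-in for the trained over-complete dictionary:

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))      # stand-in for the trained dictionary
D /= np.linalg.norm(D, axis=0)          # unit-norm atoms
residual = rng.standard_normal(64)      # a vectorised prediction-residual patch

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=8, fit_intercept=False)
omp.fit(D, residual)
coeffs = omp.coef_                      # sparse code: at most 8 nonzero entries
approx = D @ coeffs                     # reconstruction; coeffs get entropy-coded
```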
18

PAJAROLA, RENATO, MIGUEL SAINZ, and YU MENG. "DMESH: FAST DEPTH-IMAGE MESHING AND WARPING." International Journal of Image and Graphics 04, no. 04 (October 2004): 653–81. http://dx.doi.org/10.1142/s0219467804001580.

Abstract:
In this paper we present a novel and efficient depth-image representation and warping technique called DMesh, which is based on a piece-wise linear approximation of the depth-image as a textured and simplified triangle mesh. We describe the application of a hierarchical multiresolution triangulation method to generate adaptively triangulated depth-meshes efficiently from reference depth-images, discuss depth-mesh segmentation methods to avoid occlusion artifacts, and propose a new hardware-accelerated depth-image rendering technique that supports per-pixel weighted blending of multiple depth-images in real time. Applications of our technique include image-based object representations and the use of depth-images in large-scale walk-through visualization systems.
19

Zheng, Min, Yangliao Geng, and Qingyong Li. "Revisiting Local Descriptors via Frequent Pattern Mining for Fine-Grained Image Retrieval." Entropy 24, no. 2 (January 20, 2022): 156. http://dx.doi.org/10.3390/e24020156.

Abstract:
Fine-grained image retrieval aims at searching for relevant images among fine-grained classes given a query. The main difficulty of this task derives from the small interclass distinction and the large intraclass variance of fine-grained images, posing severe challenges to methods that resort only to global or local features. In this paper, we propose a novel fine-grained image retrieval method, where a global–local aware feature representation is learned. Specifically, the global feature is extracted by selecting the most relevant deep descriptors. Meanwhile, we explore the intrinsic relationship of different parts via frequent pattern mining, thus obtaining a representative local feature. Further, an aggregation feature that learns the global–local aware feature representation is designed. Consequently, the discriminative ability among different fine-grained classes is enhanced. We evaluate the proposed method on five popular fine-grained datasets. Extensive experimental results demonstrate that the performance of fine-grained image retrieval is improved with the proposed global–local aware representation.
20

Arevalo, John, Angel Cruz-Roa, and Fabio A. González O. "Representación de imágenes de histopatología utilizada en tareas de análisis automático: estado del arte." Revista Med 22, no. 2 (December 1, 2014): 79. http://dx.doi.org/10.18359/rmed.1184.

Abstract:
This paper presents a review of the state of the art in histopathology image representation as used in automatic image analysis tasks. Automatic analysis of histopathology images is important for building computer-assisted diagnosis tools, automatic image enhancing systems and virtual microscopy systems, among other applications. Histopathology images have a rich mix of visual patterns with particularities that make them difficult to analyze. First, the paper discusses these particularities, the acquisition process and the challenges found when performing automatic analysis. Second, an overview of recent works and methods addressing visual content representation in different automatic image analysis tasks is presented. Third, an overview of applications of image representation methods in several medical domains and tasks is presented. Finally, the paper concludes with current trends in automatic analysis of histopathology images, such as digital pathology.
21

Zhou, Jianhang, Shaoning Zeng, and Bob Zhang. "Linear Representation-Based Methods for Image Classification: A Survey." IEEE Access 8 (2020): 216645–70. http://dx.doi.org/10.1109/access.2020.3041154.

22

Mallat, Stéphane, and Gabriel Peyré. "A review of Bandlet methods for geometrical image representation." Numerical Algorithms 44, no. 3 (June 1, 2007): 205–34. http://dx.doi.org/10.1007/s11075-007-9092-4.

23

Zhang, Xue Jun, and Bing Liang Hu. "Image Super-Resolution via Saliency Sparse Representation." Applied Mechanics and Materials 568-570 (June 2014): 659–62. http://dx.doi.org/10.4028/www.scientific.net/amm.568-570.659.

Abstract:
The paper proposes a new approach to single-image super-resolution (SR) based on sparse representation. Previous researchers focus on globally intensive patches and ignore locally salient patches, although the performance of a dictionary trained on locally salient patches is more significant. Motivated by this, we incorporate saliency detection to find the salient areas in the image. We propose a sparse representation for the saliency patches of the low-resolution input and use the coefficients of this representation to generate the high-resolution output. Compared to previous approaches, which simply sample a large number of image patch pairs, the saliency dictionary pair is a more compact representation of the patch pairs, reducing the computational cost substantially. Through experiments, we demonstrate that our algorithm generates high-resolution images that are competitive or even superior in quality to images produced by other similar SR methods.
24

Tsai, Chih-Fong. "Bag-of-Words Representation in Image Annotation: A Review." ISRN Artificial Intelligence 2012 (November 29, 2012): 1–19. http://dx.doi.org/10.5402/2012/376804.

Abstract:
Content-based image retrieval (CBIR) systems require users to query images by their low-level visual content; this not only makes it hard for users to formulate queries, but can also lead to unsatisfactory retrieval results. To this end, image annotation was proposed. The aim of image annotation is to automatically assign keywords to images, so that image retrieval users are able to query images by keywords. Image annotation can be regarded as an image classification problem: images are represented by some low-level features, and supervised learning techniques are used to learn the mapping between low-level features and high-level concepts (i.e., class labels). One of the most widely used feature representation methods is bag-of-words (BoW). This paper reviews related works based on the issues of improving and/or applying BoW for image annotation. Moreover, many recent works (from 2006 to 2012) are compared in terms of the methodology of BoW feature generation and experimental design. In addition, several different issues in using BoW are discussed, and some important directions for future research are highlighted.
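A minimal BoW pipeline is short enough to sketch: cluster local descriptors into a visual vocabulary, then histogram each image's word assignments. The descriptor matrices below are random stand-ins for real local features such as SIFT:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
all_descriptors = rng.standard_normal((5000, 128))    # pooled from training images
codebook = KMeans(n_clusters=200, n_init=10).fit(all_descriptors)

def bow_histogram(image_descriptors: np.ndarray) -> np.ndarray:
    """Represent one image as a normalised histogram of visual-word assignments."""
    words = codebook.predict(image_descriptors)
    hist = np.bincount(words, minlength=200).astype(float)
    return hist / hist.sum()

vec = bow_histogram(rng.standard_normal((300, 128)))  # one image's descriptors
```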
25

Wang, Ruize, Zhongyu Wei, Piji Li, Qi Zhang, and Xuanjing Huang. "Storytelling from an Image Stream Using Scene Graphs." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 9185–92. http://dx.doi.org/10.1609/aaai.v34i05.6455.

Abstract:
Visual storytelling aims at generating a story from an image stream. Most existing methods tend to represent images directly with extracted high-level features, which is not intuitive and is difficult to interpret. We argue that translating each image into a graph-based semantic representation, i.e., a scene graph, which explicitly encodes the objects and relationships detected within the image, would benefit representing and describing images. To this end, we propose a novel graph-based architecture for visual storytelling by modeling two-level relationships on scene graphs. In particular, on the within-image level, we employ a Graph Convolution Network (GCN) to enrich local fine-grained region representations of objects on scene graphs. To further model the interaction among images, on the cross-image level, a Temporal Convolution Network (TCN) is utilized to refine the region representations along the temporal dimension. The relation-aware representations are then fed into a Gated Recurrent Unit (GRU) with an attention mechanism for story generation. Experiments are conducted on the public visual storytelling dataset. Automatic and human evaluation results indicate that our method achieves state-of-the-art performance.
26

Wu, Hui, Min Wang, Wengang Zhou, Yang Hu, and Houqiang Li. "Learning Token-Based Representation for Image Retrieval." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (June 28, 2022): 2703–11. http://dx.doi.org/10.1609/aaai.v36i3.20173.

Abstract:
In image retrieval, deep local features learned in a data-driven manner have been demonstrated to be effective in improving retrieval performance. To realize efficient retrieval on large image databases, some approaches quantize deep local features with a large codebook and match images with an aggregated match kernel. However, the complexity of these approaches is non-trivial, with a large memory footprint, which limits their capability to jointly perform feature learning and aggregation. To generate compact global representations while maintaining regional matching capability, we propose a unified framework to jointly learn local feature representation and aggregation. In our framework, we first extract local features using CNNs. Then, we design a tokenizer module to aggregate them into a few visual tokens, each corresponding to a specific visual pattern. This helps to remove background noise and capture the more discriminative regions in the image. Next, a refinement block is introduced to enhance the visual tokens with self-attention and cross-attention. Finally, different visual tokens are concatenated to generate a compact global representation. The whole framework is trained end-to-end with image-level labels. Extensive experiments are conducted to evaluate our approach, which outperforms the state-of-the-art methods on the Revisited Oxford and Paris datasets.
27

Silva, Samuel Henrique, Arun Das, Adel Aladdini, and Peyman Najafirad. "Adaptive Clustering of Robust Semantic Representations for Adversarial Image Purification on Social Networks." Proceedings of the International AAAI Conference on Web and Social Media 16 (May 31, 2022): 968–79. http://dx.doi.org/10.1609/icwsm.v16i1.19350.

Abstract:
Advances in Artificial Intelligence (AI) have made it possible to automate human-level visual search and perception tasks on the massive sets of image data shared on social media on a daily basis. However, AI-based automated filters are highly susceptible to deliberate image attacks that can lead to content misclassification of cyberbullying, child sexual abuse material (CSAM), adult content, and deepfakes. One of the most effective methods to defend against such disturbances is adversarial training, but this comes at the cost of generalization for unseen attacks and transferability across models. In this article, we propose a robust defense against adversarial image attacks, which is model agnostic and generalizable to unseen adversaries. We begin with a baseline model, extracting the latent representations for each class and adaptively clustering the latent representations that share a semantic similarity. Next, we obtain the distributions for these clustered latent representations along with their originating images. We then learn semantic reconstruction dictionaries (SRD). We adversarially train a new model constraining the latent space representation to minimize the distance between the adversarial latent representation and the true cluster distribution. To purify the image, we decompose the input into low- and high-frequency components. The high-frequency component is reconstructed based on the best SRD from the clean dataset. In order to evaluate the best SRD, we rely on the distance between the robust latent representations and semantic cluster distributions. The output is a purified image with no perturbations. Evaluations using comprehensive datasets including image benchmarks and social media images demonstrate that our proposed purification approach guards and enhances the accuracy of AI-based image filters for unlawful and harmful perturbed images considerably.
28

Yang, Xiaomin, Kai Liu, Zhongliang Gan, and Binyu Yan. "Multiscale and Multitopic Sparse Representation for Multisensor Infrared Image Superresolution." Journal of Sensors 2016 (2016): 1–14. http://dx.doi.org/10.1155/2016/7036349.

Abstract:
Methods based on sparse coding have been successfully used in single-image superresolution (SR) reconstruction. However, traditional sparse representation-based SR image reconstruction for infrared (IR) images usually suffers from three problems. First, IR images always lack detailed information. Second, a traditional sparse dictionary is learned from patches with a fixed size, which may not capture the exact information of the images and may ignore the fact that images naturally come at different scales in many cases. Finally, traditional sparse dictionary learning methods aim at learning a universal and overcomplete dictionary. However, many different local structural patterns exist, and one dictionary is inadequate for capturing all of the different structures. We propose a novel IR image SR method to overcome these problems. First, we combine the information from multiple sensors to improve the resolution of the IR image. Then, we use multiscale patches to represent the image in a more efficient manner. Finally, we partition the natural images into documents and group such documents to determine the inherent topics and to learn the sparse dictionary of each topic. Extensive experiments validate that the proposed method yields better results in terms of quantitative measures and visual perception than many state-of-the-art algorithms.
29

Liu, Guichi, Lei Gao, and Lin Qi. "Hyperspectral Image Classification via Multi-Feature-Based Correlation Adaptive Representation." Remote Sensing 13, no. 7 (March 25, 2021): 1253. http://dx.doi.org/10.3390/rs13071253.

Abstract:
In recent years, representation-based methods have attracted more attention in hyperspectral image (HSI) classification. Among them, the sparse representation-based classifier (SRC) and the collaborative representation-based classifier (CRC) are the two representative methods. However, SRC focuses only on sparsity and ignores data correlation information, while CRC encourages grouping correlated variables together but lacks the ability of variable selection. As a result, SRC and CRC are incapable of producing satisfactory performance. To address these issues, in this work a correlation adaptive representation (CAR) is proposed, enabling a CAR-based classifier (CARC). Specifically, the proposed CARC is able to explore sparsity and data correlation information jointly, generating a novel representation model that is adaptive to the structure of the dictionary. To further exploit the correlation between the test samples and the training samples effectively, a distance-weighted Tikhonov regularization is integrated into the proposed CARC. Furthermore, to handle the small-training-sample problem in HSI classification, a multi-feature correlation adaptive representation-based classifier (MFCARC) and MFCARC with Tikhonov regularization (MFCART) are presented to improve the classification performance by exploring the complementary information across multiple features. The experimental results show the superiority of the proposed methods over state-of-the-art algorithms.
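Plain CRC, which the proposed CAR extends, has a closed-form coding step. A minimal sketch follows (this is the standard CRC baseline with an arbitrary regularization weight, not the proposed CARC/MFCARC):

```python
import numpy as np

def crc_classify(X, labels, y, lam=0.01):
    """Collaborative representation classifier: code y over all training columns
    with a Tikhonov (l2) penalty, then pick the class with the smallest residual."""
    alpha = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
    classes = np.unique(labels)
    residuals = [np.linalg.norm(y - X[:, labels == c] @ alpha[labels == c])
                 for c in classes]
    return classes[int(np.argmin(residuals))]

rng = np.random.default_rng(4)
X = rng.standard_normal((64, 40))            # 40 training spectra as columns
labels = np.repeat([0, 1], 20)               # two classes
y = X[:, 3] + 0.1 * rng.standard_normal(64)  # noisy copy of a class-0 sample
print(crc_classify(X, labels, y))            # -> 0 for this constructed example
```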
30

WANG, YUSHI, QINGMING HUANG, and WEN GAO. "PORNOGRAPHIC IMAGE DETECTION BASED ON MULTILEVEL REPRESENTATION." International Journal of Pattern Recognition and Artificial Intelligence 23, no. 08 (December 2009): 1633–55. http://dx.doi.org/10.1142/s0218001409007739.

Abstract:
With the proliferation of pornographic images on the Internet, it is essential to automatically detect pornographic images by analyzing image content. Most traditional detection systems are based on low-level features and generate many false positives due to images that contain large regions of skin-like colors. In this paper, we present a novel detection method based on local features, such as SIFT (Scale Invariant Feature Transform) visual words. Support Vector Machine (SVM) is used to classify images according to their multilevel representation based on visual words and the distribution of pornography-related visual words. The multilevel representation captures inter-word statistics and fuses various visual components of pornographic scenes. Experimental results demonstrate that our method outperforms traditional skin-region and human-body-model based methods, and performs well on a wide range of test data, in particular, on human-related images.
31

Fu, Lingli, Chao Ren, Xiaohai He, Xiaohong Wu, and Zhengyong Wang. "Single Remote Sensing Image Super-Resolution with an Adaptive Joint Constraint Model." Sensors 20, no. 5 (February 26, 2020): 1276. http://dx.doi.org/10.3390/s20051276.

Abstract:
Remote sensing images have been widely used in many applications. However, the resolution of the obtained remote sensing images may not meet the increasing demands for some applications. In general, the sparse representation-based super-resolution (SR) method is one of the most popular methods to solve this issue. However, traditional sparse representation SR methods do not fully exploit the complementary constraints of images. Therefore, they cannot accurately reconstruct the unknown HR images. To address this issue, we propose a novel adaptive joint constraint (AJC) based on sparse representation for the single remote sensing image SR. First, we construct a nonlocal constraint by using the nonlocal self-similarity. Second, we propose a local structure filter according to the local gradient of the image and then construct a local constraint. Next, the nonlocal and local constraints are introduced into the sparse representation-based SR framework. Finally, the parameters of the joint constraint model are selected adaptively according to the level of image noise. We utilize the alternate iteration algorithm to tackle the minimization problem in AJC. Experimental results show that the proposed method achieves good SR performance in preserving image details and significantly improves the objective evaluation indices.
32

Pang, Haibo, Chengming Liu, Zhe Zhao, Guangjun Zai, and Zhanbo Li. "Scene Image Retrieval Based on Manifold Structures of Canonical Images." International Journal of Pattern Recognition and Artificial Intelligence 31, no. 03 (February 2017): 1755005. http://dx.doi.org/10.1142/s0218001417550059.

Abstract:
Image retrieval methods have developed dramatically in the last decade. In this paper, we propose a novel method for image retrieval based on manifold structures of canonical images. Firstly, we present an image normalization process to find a set of canonical images that anchor the probabilistic distributions around the real data manifolds, in order to learn representations that better encode the manifold structures in the general high-dimensional image space. In addition, we employ the canonical images as the centers of conditional multivariate Gaussian distributions. This approach allows learning more detailed structures of the partial manifolds, resulting in an improved representation of the high-level properties of scene images. Furthermore, we use the probabilistic framework of the extended model to retrieve images based on a similarity measure combining the reciprocal likelihood of pairs of images and the sum of the likelihood of one of two images under the other's best distributions. We evaluate our method on the SUN database. In the experiments on scene image retrieval, the proposed method is efficient and exhibits superior capabilities compared to other methods, such as GIST.
33

Fu, Y., Y. Ye, G. Liu, B. Zhang, and R. Zhang. "ROBUST MULTIMODAL IMAGE MATCHING BASED ON MAIN STRUCTURE FEATURE REPRESENTATION." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B3-2020 (August 21, 2020): 583–89. http://dx.doi.org/10.5194/isprs-archives-xliii-b3-2020-583-2020.

Abstract:
Image matching is a crucial procedure for multimodal remote sensing image processing. However, the performance of conventional methods is often degraded in matching multimodal images due to significant nonlinear intensity differences. To address this problem, this letter proposes a novel image feature representation named Main Structure with Histogram of Orientated Phase Congruency (M-HOPC). M-HOPC is able to precisely capture similar structure properties between multimodal images by reinforcing the main structure information for the construction of the phase congruency feature description. Specifically, each pixel of an image is assigned an independent weight for feature descriptor according to the main structure such as large contours and edges. Then M-HOPC is integrated as the similarity measure for correspondence detection by a template matching scheme. Three pairs of multimodal images including optical, LiDAR, and SAR data have been used to evaluate the proposed method. The results show that M-HOPC is robust to nonlinear intensity differences and achieves the superior matching performance compared with other state-of-the-art methods.
34

WANG, ZHIYONG, ZHERU CHI, DAGAN FENG, and AH CHUNG TSOI. "CONTENT-BASED IMAGE RETRIEVAL WITH RELEVANCE FEEDBACK USING ADAPTIVE PROCESSING OF TREE-STRUCTURE IMAGE REPRESENTATION." International Journal of Image and Graphics 03, no. 01 (January 2003): 119–43. http://dx.doi.org/10.1142/s0219467803000944.

Abstract:
Content-based image retrieval has become an essential technique in multimedia data management. However, due to the difficulties and complications involved in the various image processing tasks, a robust semantic representation of image content is still very difficult (if not impossible) to achieve. In this paper, we propose a novel content-based image retrieval approach with relevance feedback using adaptive processing of tree-structure image representation. In our approach, each image is first represented with a quad-tree, which is segmentation free. Then a neural network model with the Back-Propagation Through Structure (BPTS) learning algorithm is employed to learn the tree-structure representation of the image content. This approach that integrates image representation and similarity measure in a single framework is applied to the relevance feedback of the content-based image retrieval. In our approach, an initial ranking of the database images is first carried out based on the similarity between the query image and each of the database images according to global features. The user is then asked to categorize the top retrieved images into similar and dissimilar groups. Finally, the BPTS neural network model is used to learn the user's intention for a better retrieval result. This process continues until satisfactory retrieval results are achieved. In the refining process, a fine similarity grading scheme can also be adopted to improve the retrieval performance. Simulations on texture images and scenery pictures have demonstrated promising results which compare favorably with the other relevance feedback methods tested.
35

Prof. Sathish. "Light Field Image Coding with Image Prediction in Redundancy." Journal of Soft Computing Paradigm 2, no. 3 (July 21, 2020): 160–67. http://dx.doi.org/10.36548/jscp.2020.3.003.

Abstract:
The proposed work involves a hybrid data representation using efficient light field coding. Existing light field coding solutions are implemented using sub-aperture or micro-images. However, the full capacity of the intrinsic redundancy in light field images has not been completely explored. This paper presents a hybrid data representation which exploits four major redundancy types. Using coding blocks, the most predominant redundancy is exploited to find the optimum coding solution that provides maximum flexibility. To show how efficiently the hybrid representation works, we propose a combination of a pseudo-video-sequence coding approach with pixel prediction methods. The experimental results show positive bit-rate savings when compared to other similar methods. The proposed method also outperforms other coding algorithms, such as WaSP and MuLE, when compared on an HEVC-based benchmark.
36

Ma, Xiaole, Shaohai Hu, Shuaiqi Liu, Jing Fang, and Shuwen Xu. "Remote Sensing Image Fusion Based on Sparse Representation and Guided Filtering." Electronics 8, no. 3 (March 8, 2019): 303. http://dx.doi.org/10.3390/electronics8030303.

Abstract:
In this paper, a remote sensing image fusion method is presented, since sparse representation (SR) has been widely used in image processing and especially in image fusion. First, we use the source images to learn an adaptive dictionary, and sparse coefficients are obtained by sparsely coding the source images with the adaptive dictionary. Then, with the help of an improved hyperbolic tangent function (tanh) and the l0-max rule, we fuse these sparse coefficients together. The initial fused image can be obtained by the image fusion method based on SR. To take full advantage of the spatial information of the source images, a fused image based on the spatial domain (SF) is obtained at the same time. Lastly, the final fused image is reconstructed by guided filtering of the fused images based on SR and SF. Experimental results show that the proposed method outperforms some state-of-the-art methods in visual and quantitative evaluations.
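The coefficient-fusion step admits a one-line sketch. Below is a plain choose-max (l0-max style) rule under my own simplification; the paper's improved tanh weighting is omitted:

```python
import numpy as np

def fuse_coefficients(c1: np.ndarray, c2: np.ndarray) -> np.ndarray:
    """Choose-max fusion: per dictionary atom, keep the coefficient of larger
    magnitude from the two source images' sparse codes."""
    return np.where(np.abs(c1) >= np.abs(c2), c1, c2)

c1 = np.array([0.9, 0.0, -0.2, 0.0])
c2 = np.array([0.1, 0.0, -0.7, 0.4])
print(fuse_coefficients(c1, c2))   # [ 0.9  0.  -0.7  0.4]
```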
37

Khaldi, Amine. "Steganographic Techniques Classification According to Image Format." International Annals of Science 8, no. 1 (November 4, 2019): 143–49. http://dx.doi.org/10.21467/ias.8.1.143-149.

Abstract:
In this work, we present a classification of steganographic methods applicable to digital images. We also propose a classification of steganographic methods according to the type of image used. We observed that no method can be applied to all image formats: each type of image has its own characteristics, and each steganographic method operates on a precise colorimetric representation. This classification provides an overview of the techniques used for the steganography of digital images.
38

Asefpour Vakilian, A., and M. R. Saradjian. "OPTIMIZATION OF THE SPARSE REPRESENTATION PARAMETERS FOR THE FUSION OF REMOTELY SENSED SATELLITE IMAGES." ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences X-4/W1-2022 (January 13, 2023): 71–77. http://dx.doi.org/10.5194/isprs-annals-x-4-w1-2022-71-2023.

Abstract:
Image fusion methods are widely used in remote sensing applications to obtain more information about the features in the study area. One of the recent satellite image fusion techniques that can deal with noise, reduce computational cost, and handle geometric misregistration is the sparse representation model. An important part of creating a generalized sparse representation model for satellite image fusion problems is defining the initial constraints and adjusting the corresponding regularization coefficients. Regularization coefficients play an essential role in the performance of the sparse representation model and the convergence of the optimization solution. The number and size of sub-images extracted from the dictionary matrix, and the number of iterations of the optimization step, are also important in building a sparse representation model. Therefore, in this research, the four parameters that affect the performance of the sparse representation model were investigated: the number of sub-images, the size of sub-images, the regularization coefficients, and the number of iterations. Results obtained from pan-sharpening of OLI-8 images showed that the optimal values for the number and size of sub-images, the regularization coefficients, and the number of iterations were 150, 9×9 pixels, 10⁻⁴, and 4, respectively. The results of this study can be generalized to other satellite image fusion problems using sparse representation models.
39

LI, ZHAOKUI, LIXIN DING, YAN WANG, and JINRONG HE. "FACE REPRESENTATION WITH GRADIENT ORIENTATIONS AND EULER MAPPING: APPLICATION TO FACE RECOGNITION." International Journal of Pattern Recognition and Artificial Intelligence 28, no. 08 (December 2014): 1456014. http://dx.doi.org/10.1142/s021800141456014x.

Abstract:
This paper proposes a simple yet very powerful local face representation, called Gradient Orientations and Euler Mapping (GOEM). GOEM consists of two stages: gradient orientations and Euler mapping. In the first stage, we calculate the gradient orientations around a central pixel and obtain the corresponding orientation representations by applying a convolution operator. These representations display spatial locality and orientation properties. To encompass different spatial localities and orientations, we concatenate all these results and derive a concatenated orientation feature vector. In the second stage, we define an explicit Euler mapping which maps the space of the concatenated orientations into a complex space. For a mapped image, we find that the imaginary part and the real part characterize the high-frequency and low-frequency components, respectively. To encompass different frequencies, we concatenate the imaginary and real parts and derive a concatenated mapping feature vector. For a given image, we use the two stages to construct a GOEM image and derive an augmented feature vector which resides in a space of very high dimensionality. To derive a low-dimensional feature vector, we present a class of GOEM-based kernel subspace learning methods for face recognition. These methods, which are robust to changes in occlusion and illumination, apply the kernel subspace learning model with explicit Euler mapping to an augmented feature vector derived from the GOEM representation of face images. Experimental results show that our methods significantly outperform popular methods and achieve state-of-the-art performance on difficult problems such as illumination- and occlusion-robust face recognition.
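The two stages can be sketched roughly as follows; the Sobel-based gradients and the mapping constant alpha are my assumptions, not the paper's exact construction:

```python
import numpy as np
from scipy.ndimage import sobel

def euler_mapped_orientations(image: np.ndarray, alpha: float = 1.9):
    """Per-pixel gradient orientations followed by an explicit Euler mapping
    z = exp(i * alpha * theta) / sqrt(2)."""
    f = image.astype(float)
    theta = np.arctan2(sobel(f, axis=0), sobel(f, axis=1))  # gradient orientation
    z = np.exp(1j * alpha * theta) / np.sqrt(2)
    # Per the abstract, the real part characterises the low-frequency component
    # and the imaginary part the high-frequency component.
    return z.real, z.imag

rng = np.random.default_rng(5)
real_part, imag_part = euler_mapped_orientations(rng.random((48, 48)))
```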
40

Zhu, Yi, Lei Li, and Xindong Wu. "Stacked Convolutional Sparse Auto-Encoders for Representation Learning." ACM Transactions on Knowledge Discovery from Data 15, no. 2 (April 2021): 1–21. http://dx.doi.org/10.1145/3434767.

Abstract:
Deep learning seeks to achieve excellent performance for representation learning in image datasets. However, supervised deep learning models such as convolutional neural networks require a large number of labeled images, which is intractable in many applications, while unsupervised deep learning models like the stacked denoising auto-encoder cannot employ label information. Meanwhile, the redundancy of image data degrades the representation learning performance of the aforementioned models. To address these problems, we propose a semi-supervised deep learning framework called the stacked convolutional sparse auto-encoder, which can learn robust and sparse representations from image data with fewer labeled records. More specifically, the framework is constructed by stacking layers: in each layer, higher-layer feature representations are generated from lower-layer features in a convolutional way, with kernels learned by a sparse auto-encoder. Meanwhile, to solve the data redundancy problem, the Reconstruction Independent Component Analysis algorithm is designed to train on patches to sphere the input data. The label information is encoded using a softmax regression model for semi-supervised learning. With this framework, higher-level representations are learned by layers mapping from image data, which can boost the performance of subsequent base classifiers such as support vector machines. Extensive experiments demonstrate the superior classification performance of our framework compared to several state-of-the-art representation learning methods.
APA, Harvard, Vancouver, ISO, and other styles
41

LU, JIAN, JIAPENG TIAN, CHEN XU, and YURU ZOU. "A DICTIONARY LEARNING APPROACH FOR FRACTAL IMAGE CODING." Fractals 27, no. 02 (March 2019): 1950020. http://dx.doi.org/10.1142/s0218348x19500208.

Full text
Abstract:
In recent years, sparse representations of images have been shown to be efficient approaches for image recovery. Following this idea, this paper investigates incorporating a dictionary learning approach into fractal image coding, which leads to a new model containing three terms: a patch-based sparse representation prior over a learned dictionary, a quadratic term measuring the closeness of the underlying image to a fractal image, and a data-fidelity term capturing the statistics of Gaussian noise. Once the dictionary is learned, the resulting optimization problem with fractal coding can be solved effectively. The new method can not only efficiently recover noisy images but also achieve noiseless fractal image coding/compression. Experimental results suggest that in terms of visual quality, peak signal-to-noise ratio, structural similarity index, and mean absolute error, the proposed method significantly outperforms state-of-the-art methods.
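
The patch-based sparse prior in the three-term model can be sketched with off-the-shelf dictionary learning. The following Python snippet, assuming 8×8 patches and a 64-atom dictionary (illustrative choices, not the paper's), learns a dictionary and the corresponding sparse codes; the fractal-closeness and data-fidelity terms would be handled in an outer optimization loop not shown here.

```python
# Hedged sketch of the patch-based sparse prior over a learned dictionary.
import numpy as np
from sklearn.feature_extraction.image import extract_patches_2d
from sklearn.decomposition import MiniBatchDictionaryLearning

img = np.random.rand(64, 64)                         # stand-in noisy image
patches = extract_patches_2d(img, (8, 8), max_patches=500)
X = patches.reshape(len(patches), -1)
X -= X.mean(axis=1, keepdims=True)                   # per-patch centering

dico = MiniBatchDictionaryLearning(n_components=64, alpha=0.5, max_iter=100)
codes = dico.fit(X).transform(X)                     # sparse codes alpha_i
sparse_prior = np.abs(codes).sum()                   # the sparsity penalty term
```
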
APA, Harvard, Vancouver, ISO, and other styles
42

Abdulazeem Ahmed, Eman, Malek Alzaqebah, Sana Jawarneh, Jehad Saad Alqurni, Fahad A. Alghamdi, Hayat Alfagham, Lubna Mahmoud Abdel Jawad, Usama A. Badawi, Mutasem K. Alsmadi, and Ibrahim Almarashdeh. "Comparison of specific segmentation methods used for copy move detection." International Journal of Electrical and Computer Engineering (IJECE) 13, no. 2 (April 1, 2023): 2363. http://dx.doi.org/10.11591/ijece.v13i2.pp2363-2374.

Full text
Abstract:
In this digital age, the widespread use of digital images and the availability of image editors have made the credibility of images controversial. To confirm the credibility of digital images, many types of image forgery detection have arisen; copy-move forgery consists of transforming an image by duplicating a part of it to add or hide existing objects. Several methods have been proposed in the literature to detect copy-move forgery; these methods use keypoint-based and block-based approaches to find the duplicated areas. However, keypoint-based and block-based approaches have difficulty handling smooth regions. In addition, image segmentation plays a vital role in changing the representation of an image into a form that is meaningful for analysis. Hence, we carry out a comparative study of segmentation based on two clustering algorithms, k-means and superpixel segmentation with density-based spatial clustering of applications with noise (DBSCAN), comparing the methods in terms of the accuracy of detecting forged regions in digital images. K-means shows better performance compared with DBSCAN and with other techniques in the literature.
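
A minimal sketch of the two segmentation routes being compared, with illustrative parameter values: (a) k-means directly on pixel colors, and (b) SLIC superpixels followed by DBSCAN on the superpixel mean colors.

```python
# Hedged sketch of the two segmentation approaches from the comparison.
import numpy as np
from sklearn.cluster import KMeans, DBSCAN
from skimage.segmentation import slic

img = np.random.rand(64, 64, 3)                      # stand-in RGB image

# (a) k-means on pixel colors.
labels_km = KMeans(n_clusters=4, n_init=10).fit_predict(img.reshape(-1, 3))

# (b) SLIC superpixels, then DBSCAN on each superpixel's mean color.
sp = slic(img, n_segments=100, start_label=0)
means = np.array([img[sp == s].mean(axis=0) for s in np.unique(sp)])
labels_db = DBSCAN(eps=0.1, min_samples=2).fit_predict(means)
```
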
APA, Harvard, Vancouver, ISO, and other styles
43

Guo, Yawen, Hui Yuan, and Kun Zhang. "Associating Images with Sentences Using Recurrent Canonical Correlation Analysis." Applied Sciences 10, no. 16 (August 10, 2020): 5516. http://dx.doi.org/10.3390/app10165516.

Full text
Abstract:
Associating images with sentences has drawn much attention recently. Existing methods commonly represent an image by indiscriminately describing all of its contents in a one-time, static way, which ignores two facts: (1) association analysis can only be effective for the partial salient contents matched by the associated sentence, and (2) visual information acquisition is a dynamic rather than static process. To deal with this issue, we propose a recurrent canonical correlation analysis (RCCA) method for associating images with sentences. RCCA includes a contextual attention-based LSTM-RNN that can selectively attend to salient regions of an image at each time step and thereby represent all the salient contents within a few steps. Different from existing attention-based models, our model focuses on modelling a contextual visual attention mechanism for the task of association analysis. RCCA also includes a conventional LSTM-RNN for sentence representation learning. The resulting representations of images and sentences are fed into CCA to maximize linear correlation, where the parameters of the LSTM-RNNs and CCA are jointly learned. Due to this effective image representation learning, our model can associate images having complex contents with sentences, and achieves better performance in terms of image annotation and retrieval.
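
The final CCA step can be sketched as below, with random arrays standing in for the outputs of the two LSTM-RNN branches; note that in RCCA proper the CCA and LSTM parameters are learned jointly, which this off-the-shelf snippet does not capture.

```python
# Hedged sketch of the CCA step over branch embeddings.
import numpy as np
from sklearn.cross_decomposition import CCA

img_feats = np.random.rand(200, 64)     # stand-in image-branch outputs
txt_feats = np.random.rand(200, 64)     # stand-in sentence-branch outputs

cca = CCA(n_components=16)
img_proj, txt_proj = cca.fit_transform(img_feats, txt_feats)
# Retrieval can then rank pairs by similarity in the shared projected space.
```
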
APA, Harvard, Vancouver, ISO, and other styles
44

Zhang, Lina, Haidong Dai, and Yu Sang. "Med-SRNet: GAN-Based Medical Image Super-Resolution via High-Resolution Representation Learning." Computational Intelligence and Neuroscience 2022 (June 20, 2022): 1–9. http://dx.doi.org/10.1155/2022/1744969.

Full text
Abstract:
High-resolution (HR) medical imaging data provide more anatomical details of the human body, which facilitates early-stage disease diagnosis. However, it is challenging to obtain clear HR medical images because of limiting factors such as imaging systems, imaging environments, and human factors. This work presents a novel medical image super-resolution (SR) method via high-resolution representation learning based on a generative adversarial network (GAN), namely Med-SRNet. We use a GAN as the backbone of the SR model because GANs can substantially improve the reconstructed visual quality of images and produce more realistic high-frequency details in the image SR task. Furthermore, we employ the HR network (HRNet) in the GAN generator to maintain HR representations and repeatedly use multi-scale fusions to strengthen them for SR. Moreover, we adopt deconvolution operations to recover high-quality HR representations from all the parallel lower-resolution (LR) streams, yielding richer aggregated features than the simple bilinear interpolation used in HRNetV2. When evaluated on a home-made medical image dataset and two public COVID-19 CT datasets, the proposed Med-SRNet outperforms other leading-edge methods, obtaining higher peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) values: a maximum improvement of 1.75 and a minimum increase of 0.433 in PSNR on the “Brain” test sets under 8×, and a maximum improvement of 0.048 and a minimum increase of 0.016 in SSIM on the “Lung” test sets under 8×, compared with other methods.
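
The aggregation choice highlighted in the abstract, a learned deconvolution instead of bilinear interpolation, can be sketched as follows; channel sizes are illustrative assumptions.

```python
# Hedged sketch: learned transposed-convolution upsampling of an LR stream,
# contrasted with the bilinear interpolation used in HRNetV2.
import torch
import torch.nn as nn

lr_stream = torch.randn(1, 64, 32, 32)   # stand-in low-resolution features

deconv = nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1)
hr_learned = deconv(lr_stream)           # learned 2x upsampling -> 64x64

hr_bilinear = nn.functional.interpolate(
    lr_stream, scale_factor=2, mode='bilinear', align_corners=False)
```
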
APA, Harvard, Vancouver, ISO, and other styles
45

Cho, Soo Sun, Dong Won Han, and Chi Jung Hwang. "Web Image Classification Using an Optimized Feature Set." Key Engineering Materials 277-279 (January 2005): 361–68. http://dx.doi.org/10.4028/www.scientific.net/kem.277-279.361.

Full text
Abstract:
Redundant images, currently abundant in World Wide Web pages, need to be removed in order to transform or simplify the pages for suitable display on small-screened devices. Classifying removable images on Web pages according to the uniqueness of their content allows a simpler representation of the pages. For such classification, machine-learning-based methods can be used to categorize images into two groups: eliminable and non-eliminable. We use two representative learning methods, the naïve Bayesian classifier and C4.5 decision trees. For our Web image classification, we propose new features with expressive power for the Web images to be classified. We apply image samples to the two classifiers and analyze the results. In addition, we propose an algorithm to construct an optimized subset of the whole feature set, which includes the features most influential for classification. Using the optimized feature set, the accuracy of classification is found to improve markedly.
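
A minimal sketch of this classification setup, with sklearn's entropy-criterion decision tree standing in for C4.5 and a filter-style selector standing in for the paper's feature-subset algorithm; the features themselves are random stand-ins for the proposed Web-image features.

```python
# Hedged sketch: Naive Bayes and a C4.5-style tree on a selected subset.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X = np.random.rand(300, 20)              # stand-in image feature vectors
y = np.random.randint(0, 2, 300)         # 0 = non-eliminable, 1 = eliminable

X_opt = SelectKBest(mutual_info_classif, k=8).fit_transform(X, y)
nb = GaussianNB().fit(X_opt, y)
tree = DecisionTreeClassifier(criterion='entropy').fit(X_opt, y)
```
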
APA, Harvard, Vancouver, ISO, and other styles
46

Xu, Fang, Jinghong Liu, Yueming Song, Hui Sun, and Xuan Wang. "Multi-Exposure Image Fusion Techniques: A Comprehensive Review." Remote Sensing 14, no. 3 (February 7, 2022): 771. http://dx.doi.org/10.3390/rs14030771.

Full text
Abstract:
Multi-exposure image fusion (MEF) is emerging as a research hotspot in the fields of image processing and computer vision; it integrates images with multiple exposure levels into a single high-quality, well-exposed image. It is an economical and effective way to improve the dynamic range of an imaging system and has broad application prospects. In recent years, with the further development of image representation theories such as multi-scale analysis and deep learning, significant progress has been achieved in this field. This paper comprehensively surveys the current state of MEF research. The relevant theories and key technologies for constructing MEF models are analyzed and categorized, and the representative MEF methods in each category are introduced and summarized. Then, based on multi-exposure image sequences in static and dynamic scenes, we present a comparative study of 18 representative MEF approaches using nine commonly used objective fusion metrics. Finally, the key issues of current MEF research are discussed, and a development trend for future research is put forward.
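
As a generic illustration of the MEF idea (not any specific method surveyed in the paper), the following sketch fuses an exposure stack with per-pixel well-exposedness weights that favor mid-range intensities; all parameter values are illustrative.

```python
# Hedged sketch of weighted multi-exposure fusion.
import numpy as np

stack = [np.random.rand(64, 64) for _ in range(3)]   # stand-in exposures
# Weight each pixel by how close its intensity is to mid-range (0.5).
weights = [np.exp(-((im - 0.5) ** 2) / (2 * 0.2 ** 2)) for im in stack]
total = np.sum(weights, axis=0) + 1e-8               # normalization
fused = sum(w * im for w, im in zip(weights, stack)) / total
```
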
APA, Harvard, Vancouver, ISO, and other styles
47

Anwar, Shahzad, Qingjie Zhao, Muhammad Farhan Manzoor, and Saqib Ishaq Khan. "Saliency Detection Using Sparse and Nonlinear Feature Representation." Scientific World Journal 2014 (2014): 1–16. http://dx.doi.org/10.1155/2014/137349.

Full text
Abstract:
An important aspect of visual saliency detection is how the features that form an input image are represented. A popular theory supports sparse feature representation, in which an image is represented over a basis dictionary with sparse weighting coefficients. Another approach uses a nonlinear combination of image features. In our work, we combine the two and propose a scheme that takes advantage of both sparse and nonlinear feature representation, using independent component analysis (ICA) and covariance matrices, respectively. To compute saliency, we use a biologically plausible center-surround difference (CSD) mechanism. Our sparse features are adaptive in nature: the ICA basis functions are learned for every image representation rather than being fixed. We show that adaptive sparse features, when used with a CSD mechanism, yield better results than fixed sparse representations. We also show that covariance matrices consisting of a nonlinear integration of color information alone are sufficient to efficiently estimate saliency from an image. The proposed dual representation scheme is then evaluated against human eye-fixation prediction, response to psychological patterns, and salient-object detection on well-known datasets. We conclude that the two forms of representation complement one another and result in better saliency detection.
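
A hedged sketch of two of the ingredients: ICA bases learned adaptively from the patches of the image at hand, and a center-surround difference taken as the difference of fine- and coarse-scale Gaussian blurs. Scales and sizes are illustrative, and for brevity the CSD is applied to the raw image rather than to each ICA feature response as in the full method.

```python
# Hedged sketch: adaptive ICA basis learning + center-surround difference.
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.feature_extraction.image import extract_patches_2d
from scipy.ndimage import gaussian_filter

img = np.random.rand(64, 64)                           # stand-in image
patches = extract_patches_2d(img, (8, 8), max_patches=500)
X = patches.reshape(len(patches), -1)

ica = FastICA(n_components=16, max_iter=500)           # adaptive basis,
ica.fit(X)                                             # learned per image

# Center-surround difference: fine scale minus coarse scale.
csd = gaussian_filter(img, 1.0) - gaussian_filter(img, 8.0)
saliency = np.maximum(csd, 0)
```
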
APA, Harvard, Vancouver, ISO, and other styles
48

Zaqout, Ihab. "Content-Based Image Retrieval using Color Quantization and Angle Representation." INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY 13, no. 10 (October 30, 2014): 5094–104. http://dx.doi.org/10.24297/ijct.v13i10.2332.

Full text
Abstract:
Efficient non-uniform color quantization and similarity measurement methods are proposed to enhance content-based image retrieval (CBIR) applications. The HSV color space is selected because it is close to the human visual perception system, and a non-uniform method is proposed to quantize an image into 37 colors. A marker histogram (MH) vector of 296 values is generated by segmenting the quantized image into 8 angular regions (multiples of 45°) and counting the occurrences of the quantized colors within each region. To cope with rotated images, an incremental displacement of the MH is applied 7 times. To find similar images, we propose a new similarity measure and also evaluate 4 existing metrics. A uniform color quantization method from related work is implemented as well and compared with our quantization method. One hundred test images are selected from the Corel-1000 image database. Our experimental results show high retrieval precision ratios compared with other techniques.
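
The marker histogram construction can be sketched as follows, assuming an image already quantized to 37 color indices (the paper's non-uniform HSV quantization table is not reproduced here): each color's occurrences are counted in eight 45° sectors around the image center, giving 37 × 8 = 296 bins.

```python
# Hedged sketch of the 296-value marker histogram (MH).
import numpy as np

quantized = np.random.randint(0, 37, (64, 64))   # stand-in quantized image
h, w = quantized.shape
yy, xx = np.mgrid[0:h, 0:w]
angles = np.arctan2(yy - h / 2, xx - w / 2)      # pixel angle from center
sector = ((angles + np.pi) / (np.pi / 4)).astype(int) % 8  # eight 45° bins

mh = np.zeros((37, 8))
np.add.at(mh, (quantized.ravel(), sector.ravel()), 1)
mh = mh.ravel()                                  # 296-value MH vector
```
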
APA, Harvard, Vancouver, ISO, and other styles
49

PANDE, AFTAB. "DEFORMATIONS OF GALOIS REPRESENTATIONS AND THE THEOREMS OF SATO–TATE AND LANG–TROTTER." International Journal of Number Theory 07, no. 08 (December 2011): 2065–79. http://dx.doi.org/10.1142/s1793042111004939.

Full text
Abstract:
We construct infinitely ramified Galois representations ρ such that the a_l(ρ)'s have distributions in contrast to the statements of Sato–Tate, Lang–Trotter and others. Using similar methods, we deform a residual Galois representation for number fields and obtain an infinitely ramified representation with very large image, generalizing a result of Ramakrishna.
APA, Harvard, Vancouver, ISO, and other styles
50

Dong, Huihui, Wenping Ma, Yue Wu, Jun Zhang, and Licheng Jiao. "Self-Supervised Representation Learning for Remote Sensing Image Change Detection Based on Temporal Prediction." Remote Sensing 12, no. 11 (June 9, 2020): 1868. http://dx.doi.org/10.3390/rs12111868.

Full text
Abstract:
Traditional change detection (CD) methods operate on the raw image domain or hand-crafted features, which are less robust to inconsistencies (e.g., in brightness and noise distribution) between bitemporal satellite images. Recently, deep learning techniques have demonstrated compelling performance in robust feature learning. However, generating accurate semantic supervision that reveals real change information in satellite images remains challenging, especially for manual annotation. To solve this problem, we propose a novel self-supervised representation learning method based on temporal prediction for remote sensing image CD. The main idea of our algorithm is to transform two satellite images into more consistent feature representations through a self-supervised mechanism, without semantic supervision or additional computation. Based on the transformed feature representations, a better difference image (DI) can be obtained, which reduces the error that the DI propagates to the final detection result. In the self-supervised mechanism, the network is asked to identify which temporal image different sample patches come from, namely, temporal prediction. By designing the network for the temporal prediction task to imitate the discriminator of a generative adversarial network, distribution-aware feature representations are automatically captured and results with strong robustness can be obtained. Experimental results on real remote sensing datasets show the effectiveness and superiority of our method, improving the detection precision by 0.94–35.49%.
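
A minimal sketch of the temporal-prediction pretext task, with architecture and loss details as illustrative assumptions: a small encoder is trained, discriminator-style, to tell which acquisition time a patch pair spans, and the learned features can then yield a difference image.

```python
# Hedged sketch of the temporal-prediction pretext task.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())
head = nn.Linear(32, 1)                    # "does this pair span t1/t2?"

p1 = torch.randn(16, 3, 32, 32)            # stand-in patches from image t1
p2 = torch.randn(16, 3, 32, 32)            # stand-in patches from image t2
f1, f2 = encoder(p1), encoder(p2)

logits = head(torch.cat([f1, f2], dim=1))
labels = torch.ones(16, 1)                 # these pairs differ in time;
                                           # same-time pairs would be 0s
loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
loss.backward()

# After training, a difference image can be built from |f1 - f2|.
```
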
APA, Harvard, Vancouver, ISO, and other styles