Academic literature on the topic 'Large intra-class variations'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Large intra-class variations.'


Journal articles on the topic "Large intra-class variations"

1

Tao, Jianbin, Wenbin Wu, and Meng Xu. "Using the Bayesian Network to Map Large-Scale Cropping Intensity by Fusing Multi-Source Data." Remote Sensing 11, no. 2 (2019): 168. http://dx.doi.org/10.3390/rs11020168.

Abstract:
Global food demand will increase over the next few decades, and sustainable agricultural intensification on current cropland may be a preferred option to meet this demand. Mapping cropping intensity with remote sensing data is of great importance for agricultural production, food security, and agricultural sustainability in the context of global climate change. However, there are some challenges in large-scale cropping intensity mapping. First, existing indicators are too coarse, and fine indicators for measuring cropping intensity are lacking. Second, the regional, intra-class variations detected in time-series remote sensing data across vast areas represent environment-related clusters for each cropping intensity level. However, few existing studies have taken into account the intra-class variations caused by varied crop patterns, crop phenology, and geographical differentiation. In this research, we first presented a new definition, a normalized cropping intensity index (CII), to quantify cropping intensity precisely. We then proposed a Bayesian network model fusing prior knowledge (BNPK) to address the issue of intra-class variations when mapping CII over large areas. This method can fuse regional differentiation factors as prior knowledge into the model to reduce the uncertainty. Experiments on five sample areas covering the main grain-producing areas of mainland China proved the effectiveness of the model. Our research proposes a framework for obtaining a CII map with both a fine spatial and a fine temporal resolution at a national scale.
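
At its simplest, the "fusing prior knowledge" step this abstract describes amounts to an application of Bayes' rule: combine a per-class likelihood from the time-series data with a regional prior. The sketch below is a generic illustration of that idea only, not the paper's BNPK model; the function and variable names are invented.

```python
def bayes_fuse(likelihoods, prior):
    """Posterior over cropping-intensity classes: element-wise product of the
    time-series likelihoods and a regional prior, renormalized to sum to 1.
    A generic Bayes-rule sketch, not the BNPK model from the paper."""
    post = [l * p for l, p in zip(likelihoods, prior)]
    z = sum(post)
    return [x / z for x in post]
```

With an uninformative likelihood, the regional prior dominates; with a sharp likelihood, the data dominates, which is the intended effect of reducing uncertainty via prior knowledge.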
2

Du, Yu, Yonggang Lu, and Ligang Zhao. "Class Specific Dictionary Learning with the Independence Between-class and Dependence Intra-class Coefficient’s Constraint." Journal of Physics: Conference Series 2224, no. 1 (2022): 012105. http://dx.doi.org/10.1088/1742-6596/2224/1/012105.

Abstract:
In the field of image classification, deep learning has become the focus of research. But when the number of training samples is small, and especially when there are a large number of intra-class variations in the small samples, the performance of deep learning is often not satisfactory. To deal with this problem, a new dictionary learning method based on sparse representation and an improved coefficient constraint is proposed. A general dictionary is learned to eliminate the noise signal, and then, based on the general dictionary, a class-specific dictionary is learned with an improved coefficient constraint that maintains the independence of the dictionary atoms between classes while allowing the dependence of the dictionary atoms within a class. The class-specific dictionary combined with the general dictionary is used for image recognition. Experimental results show that, compared with state-of-the-art dictionary learning methods, the proposed method usually shows better performance on image classification with small data sets.
3

Xu, Wanjiang. "Deep Large Margin Nearest Neighbor for Gait Recognition." Journal of Intelligent Systems 30, no. 1 (2021): 604–19. http://dx.doi.org/10.1515/jisys-2020-0077.

Abstract:
Gait recognition in video surveillance is still challenging because the employed gait features are usually affected by many variations. To overcome this difficulty, this paper presents a novel Deep Large Margin Nearest Neighbor (DLMNN) method for gait recognition. The proposed DLMNN trains a convolutional neural network to project gait features onto a metric subspace, under which intra-class gait samples are pulled as close together as possible while inter-class samples are pushed apart by a large margin. We provide an extensive evaluation in terms of various scenarios, namely normal, carrying, clothing, and cross-view conditions, on two widely used gait datasets. Experimental results demonstrate that the proposed DLMNN achieves competitive gait recognition performance and promising computational efficiency.
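
The margin objective described here — pull intra-class samples together, push inter-class samples apart by a margin — can be sketched as a per-triplet hinge loss on embedded features. This is an illustration of the loss shape only, not the paper's network or training code; names are invented.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedded feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def large_margin_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style large-margin loss: zero once the inter-class (negative)
    sample is farther from the anchor than the intra-class (positive) sample
    by at least `margin`; positive otherwise."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

In DLMNN-style training this quantity would be averaged over mined triplets and backpropagated through the embedding network.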
4

Qin, Huafeng, and Peng Wang. "A Template Generation and Improvement Approach for Finger-Vein Recognition." Information 10, no. 4 (2019): 145. http://dx.doi.org/10.3390/info10040145.

Abstract:
Finger-vein biometrics have been extensively investigated for person verification. One of the open issues in finger-vein verification is the lack of robustness against variations of vein patterns due to the changes in physiological and imaging conditions during the acquisition process, which results in large intra-class variations among the finger-vein images captured from the same finger and may degrade the system performance. Despite recent advances in biometric template generation and improvement, current solutions mainly focus on extrinsic biometrics (e.g., fingerprints, face, signature) instead of intrinsic biometrics (e.g., vein). This paper proposes a weighted least squares regression based model to generate and improve the enrollment template for finger-vein verification. Driven by the primary target of biometric template generation and improvement, i.e., verification error minimization, we assume that a good template has the smallest intra-class distance with respect to the images from the same class in a verification system. Based on this assumption, the finger-vein template generation is converted into an optimization problem. To improve the performance, the weights associated with similarity are computed for template generation. Then, the enrollment template is generated by solving the optimization problem. Subsequently, a template improvement model is proposed to gradually update vein features in the template. To the best of our knowledge, this is the first work on template generation and improvement for finger-vein biometrics. The experimental results on two public finger-vein databases show that the proposed schemes minimize the intra-class variations among samples and significantly improve finger-vein recognition accuracy.
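
Under the abstract's assumption that a good template minimizes the (weighted) intra-class distance to the enrollment samples, the squared-distance objective has a closed-form minimizer: the similarity-weighted mean. The sketch below shows only that closed form, not the paper's full template generation and improvement pipeline; names are invented.

```python
def weighted_template(samples, weights):
    """Closed-form minimizer of sum_i w_i * ||t - x_i||^2 over templates t:
    the weighted mean of the enrollment samples. `samples` is a list of
    equal-length feature vectors, `weights` their similarity weights."""
    total = sum(weights)
    dim = len(samples[0])
    return [sum(w * s[d] for w, s in zip(weights, samples)) / total
            for d in range(dim)]
```

Larger weights on more representative samples pull the template toward them, which is the intuition behind weighting by similarity.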
5

Zhang, Xiang, Wanqing Zhao, Hangzai Luo, Jinye Peng, and Jianping Fan. "Class Guided Channel Weighting Network for Fine-Grained Semantic Segmentation." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (2022): 3344–52. http://dx.doi.org/10.1609/aaai.v36i3.20244.

Abstract:
Deep learning has achieved promising performance on semantic segmentation, but few works focus on semantic segmentation at the fine-grained level. Fine-grained semantic segmentation requires recognizing and distinguishing hundreds of sub-categories. Due to the high similarity of different sub-categories and the large variations in poses, scales, rotations, and color of the same sub-category in the fine-grained image set, the performance of traditional semantic segmentation methods declines sharply. To alleviate these dilemmas, a new approach, named Class Guided Channel Weighting Network (CGCWNet), is developed in this paper to enable fine-grained semantic segmentation. For the large intra-class variations, we propose a Class Guided Weighting (CGW) module, which learns the image-level fine-grained category probabilities by exploiting second-order feature statistics, and uses them as global information to guide semantic segmentation. For the high similarity between different sub-categories, we specially build a Channel Relationship Attention (CRA) module to amplify the distinction of features. Furthermore, a Detail Enhanced Guided Filter (DEGF) module is proposed to refine the boundaries of object masks by using an edge contour cue extracted from the enhanced original image. Experimental results on PASCAL VOC 2012 and six fine-grained image sets show that our proposed CGCWNet achieves state-of-the-art results.
6

Al-Dulaimi, Khamael, Jasmine Banks, Aiman Al-Sabaawi, Kien Nguyen, Vinod Chandran, and Inmaculada Tomeo-Reyes. "Classification of HEp-2 Staining Pattern Images Using Adapted Multilayer Perceptron Neural Network-Based Intra-Class Variation of Cell Shape." Sensors 23, no. 4 (2023): 2195. http://dx.doi.org/10.3390/s23042195.

Abstract:
There is growing interest among the clinical practice research communities in the development of methods to automate the HEp-2 stained cell classification procedure from histopathological images. Challenges faced by these methods include variations in cell densities and cell patterns, overfitting of features, large-scale data volume, and stained cells. In this paper, a multi-class multilayer perceptron technique is adapted by adding a new hidden layer to calculate the variation in the mean, scale, kurtosis, and skewness of higher-order spectra features of the cell shape information. The adapted technique is then jointly trained, and the probability of classification is calculated using a Softmax activation function. This method is proposed to address the overfitting, staining, and large-scale data volume problems, and to classify HEp-2 staining cells into six classes. An extensive experimental analysis is conducted to verify the results of the proposed method. The technique has been trained and tested on the datasets from the ICPR-2014 and ICPR-2016 competitions using Task-1. The experimental results show that the proposed model achieved a higher accuracy of 90.3% with data augmentation than the 87.5% obtained without data augmentation. In addition, the proposed framework is compared with existing methods, as well as with the results of methods used in the ICPR-2014 and ICPR-2016 competitions. The results demonstrate that our proposed method effectively outperforms recent methods.
7

Zhao, Yan, Ganyun Lv, and Gongyi Hong. "Few-Shot Segmentation via Capturing Interclass and Intraclass Cues Using Class Activation Map." Complexity 2022 (July 4, 2022): 1–7. http://dx.doi.org/10.1155/2022/4901746.

Abstract:
Few-shot segmentation is a challenging task due to the limited class cues provided by a few annotations. Discovering more class cues from known and unknown classes is essential to few-shot segmentation. Existing methods generate class cues mainly from common cues within new classes, where the similarity between support images and query images is measured to locate the foreground regions. However, the support images are not sufficient to measure this similarity, since one or a few support masks cannot describe the object of a new class with large variations. In this paper, we capture the class cues by considering all images in the unknown classes, i.e., not only the support images but also the query images are used to capture the foreground regions. Moreover, the class-level labels in the known classes are also considered to capture the discriminative features of new classes. The two aspects are achieved by a class activation map, which is used as an attention map to improve the feature extraction. A new few-shot segmentation method based on mask transferring and class activation maps is proposed, and a new class activation map based on feature clustering is proposed to refine the class activation map. The proposed method is validated on the PASCAL VOC dataset. Experimental results demonstrate the effectiveness of the proposed method with larger mIoU values.
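
A class activation map, in its basic form, is a classifier-weighted sum of the last convolutional feature maps, highlighting the regions that drive a class score. The minimal sketch below illustrates that basic form only, not the refined, clustering-based variant this paper proposes; names are invented.

```python
def class_activation_map(feature_maps, class_weights):
    """feature_maps: list of K maps, each an HxW grid (nested lists);
    class_weights: the K classifier weights for the target class.
    Returns the CAM: the per-pixel weighted sum of the feature maps."""
    H, W = len(feature_maps[0]), len(feature_maps[0][0])
    return [[sum(w * fm[i][j] for w, fm in zip(class_weights, feature_maps))
             for j in range(W)] for i in range(H)]
```

The resulting map can then be normalized and used as a spatial attention mask over the extracted features, as the abstract describes.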
8

Guo, Lei, Gang Xie, Xinying Xu, and Jinchang Ren. "Effective Melanoma Recognition Using Deep Convolutional Neural Network with Covariance Discriminant Loss." Sensors 20, no. 20 (2020): 5786. http://dx.doi.org/10.3390/s20205786.

Abstract:
Melanoma recognition is challenging due to data imbalance, high intra-class variation, and large inter-class similarity. To address these issues, we propose a melanoma recognition method using a deep convolutional neural network with a covariance discriminant loss on dermoscopy images. The deep convolutional neural network is trained under the joint supervision of a cross-entropy loss and a covariance discriminant loss, rectifying the model outputs and the extracted features simultaneously. Specifically, we design an embedding loss, namely the covariance discriminant loss, which takes the first- and second-order distances into account simultaneously to provide more constraints. By constraining the distance between hard samples and the minority class center, the deep features of melanoma and non-melanoma can be separated effectively. To mine the hard samples, we also design a corresponding algorithm. Further, we analyze the relationship between the proposed loss and other losses. On the International Symposium on Biomedical Imaging (ISBI) 2018 Skin Lesion Analysis dataset, the two schemes in the proposed method yield a sensitivity of 0.942 and 0.917, respectively. The comprehensive results demonstrate the efficacy of the designed embedding loss and the proposed methodology.
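
One ingredient this abstract names — constraining the distance between mined hard samples and the minority-class center — can be sketched as a mean squared-distance penalty. This is a simplified stand-in for that one constraint, not the full covariance discriminant loss (which also uses second-order statistics); names are invented.

```python
def center_pull_loss(hard_samples, class_center):
    """Mean squared Euclidean distance between mined hard samples and the
    minority-class center; minimizing it pulls hard samples toward the
    center, tightening the minority class in feature space."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return sum(sq_dist(s, class_center) for s in hard_samples) / len(hard_samples)
```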
9

Zeng, Rui, and Jingsong He. "Grouping Bilinear Pooling for Fine-Grained Image Classification." Applied Sciences 12, no. 10 (2022): 5063. http://dx.doi.org/10.3390/app12105063.

Abstract:
Fine-grained image classification is a challenging computer vision task due to the small inter-class variations and large intra-class variations. Extracting expressive feature representations is an effective way to improve the accuracy of fine-grained image classification. Bilinear pooling is a simple and effective high-order feature interaction method. Compared with common pooling methods, bilinear pooling can obtain better feature representations by capturing complex associations between high-order features. However, the dimensions of the bilinear representation often run up to hundreds of thousands or even millions. To obtain a compact bilinear representation, we propose grouping bilinear pooling (GBP) for fine-grained image classification in this paper. GBP first divides the feature channels into different groups and then carries out intra-group or inter-group bilinear pooling. The representation captured by GBP can achieve the same accuracy with less than 0.4% of the parameters of the full bilinear representation when using the same backbone. This extremely compact representation largely overcomes the high redundancy, computational cost, and storage consumption of the full bilinear representation. Moreover, because GBP compresses the bilinear representation to the extreme, it can be used with more powerful backbones as a plug-and-play module. The effectiveness of GBP is proved by experiments on the widely used fine-grained recognition datasets CUB and Stanford Cars.
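
The intra-group variant can be sketched in plain Python: split the C channels into G groups and take pairwise channel products (summed over spatial positions) only within each group, shrinking the output from C² to C²/G dimensions. This is an illustration of the dimensionality reduction only, not the authors' implementation; names are invented.

```python
def grouping_bilinear_pooling(features, n_groups):
    """features: list of C channel descriptors, each a list of values at the
    spatial positions. Splits channels into n_groups contiguous groups and
    computes intra-group bilinear pooling: for each group, the sum over
    spatial positions of every pairwise channel product."""
    C = len(features)
    g = C // n_groups  # channels per group (assumes C divisible by n_groups)
    pooled = []
    for k in range(n_groups):
        group = features[k * g:(k + 1) * g]
        for a in group:
            for b in group:
                pooled.append(sum(x * y for x, y in zip(a, b)))
    return pooled
```

With C = 4 channels and 2 groups the output has 8 entries instead of the 16 a full bilinear pooling would produce.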
10

Boylan, Sinead M., Janet E. Cade, Sara F. L. Kirk, et al. "Assessing caffeine exposure in pregnant women." British Journal of Nutrition 100, no. 4 (2008): 875–82. http://dx.doi.org/10.1017/s0007114508939842.

Abstract:
Studies on the effects of caffeine on health, while numerous, have produced inconsistent results. One of the most uncertain and controversial effects is on pregnancy outcome. Studies have produced conflicting results due to a number of methodological variations. The major challenge is the accurate assessment of caffeine intake. The aim of the present study was to explore different methods of assessing caffeine exposure in pregnant women. Twenty-four healthy pregnant women from the UK city of Leeds completed both a detailed questionnaire, the caffeine assessment tool (CAT) designed specifically to assess caffeine intake and a prospective 3 d food and drink diary. The women also provided nine saliva samples over two consecutive days for estimation of caffeine and a metabolite (paraxanthine). Caffeine intakes from the CAT and diary showed adequate agreement (intra-class correlation coefficient of 0·5). For saliva caffeine and paraxanthine measures, the between-sample variation (within the same woman) was greater than between-woman and between-day variation. However, there was still adequate agreement between these measures and the CAT. The CAT is a valuable tool that is now being used in a large prospective study investigating caffeine's role in pregnancy outcome.
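
The intra-class correlation coefficient used above to quantify agreement between the CAT and the diary can be computed from a one-way ANOVA decomposition. The sketch below shows the basic one-way ICC(1,1) form as an illustration; it is not necessarily the exact statistical procedure the authors used.

```python
def intraclass_correlation(groups):
    """One-way ICC(1,1): agreement of k repeated measurements within each of
    n subjects. `groups` is a list of n lists, each with k measurements.
    Returns (MSB - MSW) / (MSB + (k-1)*MSW) from the one-way ANOVA."""
    n = len(groups)     # subjects
    k = len(groups[0])  # measurements per subject
    grand = sum(sum(g) for g in groups) / (n * k)
    means = [sum(g) / k for g in groups]
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)  # between-subject
    msw = sum((x - m) ** 2
              for g, m in zip(groups, means) for x in g) / (n * (k - 1))  # within
    return (msb - msw) / (msb + (k - 1) * msw)
```

Perfect within-subject agreement gives an ICC of 1; a value near 0.5, as reported here, indicates moderate agreement between the two instruments.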

Book chapters on the topic "Large intra-class variations"

1

Somanath, Gowri, and Chandra Kambhamettu. "Abstraction and Generalization of 3D Structure for Recognition in Large Intra-Class Variation." In Computer Vision – ACCV 2010. Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-19318-7_38.

2

Mahantesh, K., and Manjunath Aradhya V N. "An Impact of Gaussian Mixtures in Image Retrieval System." In Advances in Computational Intelligence and Robotics. IGI Global, 2016. http://dx.doi.org/10.4018/978-1-4666-9474-3.ch002.

Abstract:
Searching for patterns in data remains exploratory, and ever-increasing image datasets with high intra-class variations have created a large scope for generalizing image classification problems. This chapter introduces discrete latent variables leading to mixtures of Gaussians capturing multimodal distributions from segmented regions. Further, these mixtures are analyzed in a maximum likelihood framework to extract discriminatory features in a compact and de-correlated feature space. Conversely, it is less evident in the literature that combining these features with diverse distance measure techniques and neural network classifiers improves classification performance. In this chapter, we study, explore, and demonstrate the idea of subspace mixture models as a hybrid intelligent technique for image retrieval systems.
3

Chowdhury, Rinku Roy, and Laura C. Schneider. "Land Cover and Land Use: Classification and Change Analysis." In Integrated Land-Change Science and Tropical Deforestation in the Southern Yucatan. Oxford University Press, 2004. http://dx.doi.org/10.1093/oso/9780199245307.003.0015.

Abstract:
Despite its international designation as a hotspot of biodiversity and tropical deforestation (Achard et al. 1988), the micro-scale land-cover mapping of the southern Yucatán peninsular region remains surprisingly incomplete, hindering various kinds of research, including that proposed in the SYPR project. This chapter details the methodology for the thematic classification and change detection of land use and cover in the tropical sub-humid environment of the region. A hybrid approach using principal components and texture analyses of Landsat TM data enabled the distinction of land-cover classes at the local scale, including mature and secondary forest, savannas, and cropland/pasture. Results indicate that texture analysis increases the statistical separability of cover class signatures, the magnitude of improvement varying among pairs of land-cover classes. At a local level, the availability of exhaustive training site data over recent history (10–13 years) in a repository of highly detailed land-use sketch maps allows the distinction of greater numbers of land-cover classes, including three successional stages of vegetation. At the regional scale, finely detailed land-cover classes are aggregated for greater ability to generalize in a terrain wherein vegetation exhibits marked regional and seasonal variation in intra-class spectral properties. Post-classification change detection identifies the quantities and spatial pattern of major land-cover changes in a ten-year period in the region. Change analysis results indicate an average annual rate of deforestation of 0.4 per cent, with much regional variation and most change located at three subregional hotspots. Deforestation as well as successional regrowth is highest in a southern hotspot located in the newly colonized southern part of the region, an area where commercial chili production is large.
The objectives of this chapter are to describe and evaluate: (1) an experimental methodology that iteratively combines three suites of image-processing techniques (PCA, texture transformation, and NDVI); (2) the statistical separability of distinct land-cover signatures; and (3) a post-classification change detection for the region from 1987 to 1997 in order to derive regional deforestation rates, and identify the spatial pattern of deforestation and secondary forest succession. Specifically, a region encompassing 18,700km2 (those land units completely within the defined region; Fig. 7.1) was mapped using a maximum likelihood supervised classification of lower-order principal components of Landsat TM imagery after tasseled-cap and texture transformations.

Conference papers on the topic "Large intra-class variations"

1

Conly, Christopher, Alex Dillhoff, and Vassilis Athitsos. "Leveraging intra-class variations to improve large vocabulary gesture recognition." In 2016 23rd International Conference on Pattern Recognition (ICPR). IEEE, 2016. http://dx.doi.org/10.1109/icpr.2016.7899751.

2

Rattani, Ajita, Gian Luca Marcialis, and Fabio Roli. "Capturing large intra-class variations of biometric data by template co-updating." In 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR Workshops). IEEE, 2008. http://dx.doi.org/10.1109/cvprw.2008.4563116.

3

Kim, Sungho, In So Kweon, and Chil-Woo Lee. "Visual Categorization Robust to Large Intra-Class Variations using Entropy-guided Codebook." In 2007 IEEE International Conference on Robotics and Automation. IEEE, 2007. http://dx.doi.org/10.1109/robot.2007.364060.

4

Li, Zheng, Caili Guo, Zerun Feng, Jenq-Neng Hwang, and Xijun Xue. "Multi-View Visual Semantic Embedding." In Thirty-First International Joint Conference on Artificial Intelligence {IJCAI-22}. International Joint Conferences on Artificial Intelligence Organization, 2022. http://dx.doi.org/10.24963/ijcai.2022/158.

Abstract:
Visual Semantic Embedding (VSE) is a dominant method for cross-modal vision-language retrieval. Its purpose is to learn an embedding space so that visual data can be embedded in a position close to the corresponding text description. However, there are large intra-class variations in the vision-language data. For example, multiple texts describing the same image may be described from different views, and the descriptions of different views are often dissimilar. The mainstream VSE method embeds samples from the same class in similar positions, which will suppress intra-class variations and lead to inferior generalization performance. This paper proposes a Multi-View Visual Semantic Embedding (MV-VSE) framework, which learns multiple embeddings for one visual data and explicitly models intra-class variations. To optimize MV-VSE, a multi-view upper bound loss is proposed, and the multi-view embeddings are jointly optimized while retaining intra-class variations. MV-VSE is plug-and-play and can be applied to various VSE models and loss functions without excessively increasing model complexity. Experimental results on the Flickr30K and MS-COCO datasets demonstrate the superior performance of our framework.
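
The multi-view idea in this abstract — keeping several embeddings per image and scoring a text against its best-matching view rather than a single point — can be sketched with a max-over-views similarity. This is an illustration of the retrieval-time scoring only, not the authors' MV-VSE training framework; names are invented.

```python
def multi_view_similarity(view_embeddings, text_embedding):
    """Score a text embedding against multiple view embeddings of one image
    by the best-matching view (dot-product similarity), so dissimilar
    descriptions of the same image need not collapse to a single point."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return max(dot(v, text_embedding) for v in view_embeddings)
```

With a single averaged embedding, two orthogonal descriptions of one image would each score poorly; with per-view embeddings, each description matches its own view.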
5

Yu, Yunlong, Dingyi Zhang, Yingming Li, and Zhongfei Zhang. "Multi-Proxy Learning from an Entropy Optimization Perspective." In Thirty-First International Joint Conference on Artificial Intelligence {IJCAI-22}. International Joint Conferences on Artificial Intelligence Organization, 2022. http://dx.doi.org/10.24963/ijcai.2022/222.

Abstract:
Deep Metric Learning, a task that learns a feature embedding space where semantically similar samples are located closer than dissimilar samples, is a cornerstone of many computer vision applications. Most of the existing proxy-based approaches exploit the global context via learning a single proxy for each training class, which struggles to capture the complex non-uniform data distribution with different patterns. In this work, we present an easy-to-implement framework to effectively capture the local neighbor relationships via learning multiple proxies for each class that collectively approximate the intra-class distribution. In the context of large intra-class visual diversity, we revisit entropy learning under the multi-proxy learning framework and provide a training routine that both minimizes the entropy of the intra-class probability distribution and maximizes the entropy of the inter-class probability distribution. In this way, our model is able to better capture the intra-class variations and smooth the inter-class differences, and thus facilitates extracting more semantic feature representations for the downstream tasks. Extensive experimental results demonstrate that the proposed approach achieves competitive performance. Codes and an appendix are provided.
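
The entropy-based training routine described above can be sketched per sample: commit the sample to one of its own class's proxies (low intra-class assignment entropy) while keeping the distribution over other classes' proxies smooth (high inter-class entropy). The objective below and its sign convention are illustrative assumptions, not the authors' exact formulation; names are invented.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of similarity scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def entropy(p):
    """Shannon entropy (nats) of a probability distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def multi_proxy_objective(sims_intra, sims_inter):
    """sims_intra: similarities of a sample to its own class's proxies;
    sims_inter: similarities to the proxies of all other classes.
    Lower is better: minimize intra-class assignment entropy while
    maximizing inter-class entropy."""
    return entropy(softmax(sims_intra)) - entropy(softmax(sims_inter))
```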
6

Ye, Mang, Zheng Wang, Xiangyuan Lan, and Pong C. Yuen. "Visible Thermal Person Re-Identification via Dual-Constrained Top-Ranking." In Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}. International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/152.

Abstract:
Cross-modality person re-identification between the thermal and visible domains is extremely important for night-time surveillance applications. Existing works in this field mainly focus on learning sharable feature representations to handle the cross-modality discrepancies. However, besides the cross-modality discrepancy caused by different camera spectrums, visible thermal person re-identification also suffers from large cross-modality and intra-modality variations caused by different camera views and human poses. In this paper, we propose a dual-path network with a novel bi-directional dual-constrained top-ranking loss to learn discriminative feature representations. It is advantageous in two aspects: 1) it learns features end-to-end directly from the data without extra metric learning steps; 2) it simultaneously handles the cross-modality and intra-modality variations to ensure the discriminability of the learnt representations. Meanwhile, an identity loss is further incorporated to model the identity-specific information to handle large intra-class variations. Extensive experiments on two datasets demonstrate the superior performance compared to the state-of-the-arts.
7

Jin, Taisong, Liujuan Cao, Baochang Zhang, Xiaoshuai Sun, Cheng Deng, and Rongrong Ji. "Hypergraph Induced Convolutional Manifold Networks." In Twenty-Eighth International Joint Conference on Artificial Intelligence {IJCAI-19}. International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/371.

Abstract:
Deep convolutional neural networks (DCNNs) with manifold embedding have attracted considerable attention in computer vision. However, prior art is usually based on neighborhood-based graphs modeling only the pairwise relationship between two samples, which fail to fully capture intra-class variations and thus suffer severe performance loss on noisy data. Since such intra-class variations can be well captured via a sophisticated hypergraph structure, we are motivated to propose a hypergraph-induced Convolutional Manifold Network (H-CMN) to significantly improve the representation capacity of DCNNs for complex data. Specifically, two innovative designs are provided: 1) our manifold preserving method is implemented based on a mini-batch, which can be efficiently plugged into existing DCNN training pipelines and is scalable to large datasets; 2) a robust hypergraph is built for each mini-batch, which not only offers strong robustness against typical noise, but also captures the variances from multiple features. Extensive experiments on the image classification task on large benchmarking datasets demonstrate that our model achieves much better performance than the state-of-the-art.
8

Xu, Wanlu, Hong Liu, Wei Shi, Ziling Miao, Zhisheng Lu, and Feihu Chen. "Adversarial Feature Disentanglement for Long-Term Person Re-identification." In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/166.

Abstract:
Most existing person re-identification methods are effective in short-term scenarios because of their appearance dependencies. However, these methods may fail in long-term scenarios where people might change their clothes. To this end, we propose an adversarial feature disentanglement network (AFD-Net) which contains intra-class reconstruction and inter-class adversary to disentangle the identity-related and identity-unrelated (clothing) features. For intra-class reconstruction, the person images with the same identity are represented and disentangled into identity and clothing features by two separate encoders, and further reconstructed into original images to reduce intra-class feature variations. For inter-class adversary, the disentangled features across different identities are exchanged and recombined to generate adversarial clothes-changing images for training, which makes the identity and clothing features more independent. Especially, to supervise these new generated clothes-changing images, a re-feeding strategy is designed to re-disentangle and reconstruct these new images for image-level self-supervision in the original image space and feature-level soft-supervision in the disentangled feature space. Moreover, we collect a challenging Market-Clothes dataset and a real-world PKU-Market-Reid dataset for evaluation. The results on one large-scale short-term dataset (Market-1501) and five long-term datasets (three public and two we proposed) confirm the superiority of our method against other state-of-the-art methods.
9

Ouyang, Yi, Ming Tang, Jinqiao Wang, Hanqing Lu, and Songde Ma. "Boosting relative spaces for categorizing objects with large intra-class variation." In Proceedings of the 16th ACM International Conference. ACM Press, 2008. http://dx.doi.org/10.1145/1459359.1459454.

10

zhong Wu, Cong, Hao Dong, Xuan jie Lin, et al. "Adaptive Filtering Remote Sensing Image Segmentation Network based on Attention Mechanism." In 10th International Conference on Information Technology Convergence and Services (ITCSE 2021). AIRCC Publishing Corporation, 2021. http://dx.doi.org/10.5121/csit.2021.110903.

Abstract:
It is difficult to segment small objects and object edges in remote sensing imagery because of large scale variation, large intra-class variance of the background, and foreground-background imbalance. In convolutional neural networks, high-frequency signals may degenerate into completely different ones after downsampling; we define this phenomenon as aliasing. Meanwhile, although dilated convolution can expand the receptive field of the feature map, a much more complex background can cause serious false alarms. To alleviate the above problems, we propose an attention-mechanism-based adaptive filtering segmentation network. Experimental results on the DeepGlobe Road Extraction dataset and the Inria Aerial Image Labeling dataset show that our method can effectively improve segmentation accuracy. The F1 values on the two datasets reached 82.67% and 85.71%, respectively.