Academic literature on the topic 'ImageNet Database'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'ImageNet Database.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "ImageNet Database"

1

Fei-Fei, L., J. Deng, and K. Li. "ImageNet: Constructing a large-scale image database." Journal of Vision 9, no. 8 (2010): 1037. http://dx.doi.org/10.1167/9.8.1037.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Huang, Yuming. "Multiple SOTA Convolutional Neural Networks for Facial Expression Recognition." Applied and Computational Engineering 8, no. 1 (2023): 240–45. http://dx.doi.org/10.54254/2755-2721/8/20230135.

Full text
Abstract:
Facial Expression Recognition (FER) has been a popular topic in the field of computer vision, and numerous facial expression datasets emerge every year for people to train their models and compete on. ImageNet, as a massive database for image classification, became a standard benchmark for new computer vision models, and many excellent models such as VGG, ResNet, and EfficientNet excelled on it and came to be regarded as state-of-the-art (SOTA) models. This study investigates whether SOTA models trained on ImageNet can also perform exceptionally well on FER tasks. The models are categorized into three groups based on different weight initialization strategies and are then trained and evaluated on the FER-2013 dataset. The results indicate that models with weights pretrained on ImageNet can be fine-tuned to perform well on FER-2013, particularly when compared to the other groups. Finally, simpler models with lower computational cost are recommended, considering the need for real-time facial expression recognition.
APA, Harvard, Vancouver, ISO, and other styles
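The fine-tuning recipe summarized in the entry above (initialize from ImageNet weights, then adapt to FER-2013) maps onto standard Keras tooling. The following sketch is only an illustration under stated assumptions, not the paper's code: the data directory, image size, backbone choice (ResNet50), and training schedule are placeholders.

```python
# Minimal sketch of ImageNet-weight fine-tuning for FER-style data (assumptions noted above).
from tensorflow import keras

NUM_CLASSES = 7  # FER-2013 has seven expression classes

# Hypothetical dataset path; FER-2013 images are small grayscale crops, upsampled here for ResNet50.
train_ds = keras.utils.image_dataset_from_directory(
    "fer2013/train", image_size=(224, 224), batch_size=32)

base = keras.applications.ResNet50(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3), pooling="avg")
base.trainable = False  # stage 1: train only the new classification head

inputs = keras.Input(shape=(224, 224, 3))
x = keras.applications.resnet50.preprocess_input(inputs)
x = base(x, training=False)
outputs = keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
model = keras.Model(inputs, outputs)

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, epochs=5)

# Stage 2: unfreeze the backbone and fine-tune end to end with a small learning rate.
base.trainable = True
model.compile(optimizer=keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, epochs=5)
```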
3

Sobti, Priyal, Anand Nayyar, Niharika, and Preeti Nagrath. "EnsemV3X: a novel ensembled deep learning architecture for multi-label scene classification." PeerJ Computer Science 7 (May 25, 2021): e557. http://dx.doi.org/10.7717/peerj-cs.557.

Full text
Abstract:
Convolutional neural networks are widely used for image classification, typically through pretraining on ImageNet followed by fine-tuning, whereby the features are adapted to the target task. ImageNet is a large database consisting of 15 million images belonging to 22,000 categories; images collected from the Web are labeled by human labelers using the Amazon Mechanical Turk crowd-sourcing tool. ImageNet is useful for transfer learning because of the sheer volume of its dataset and the number of object classes available. Transfer learning using pretrained models helps build computer vision models accurately and inexpensively, because models that have been pretrained on substantial datasets are reused and repurposed for our requirements. Scene recognition is a widely used application of computer vision in many communities and industries, such as tourism. This study demonstrates multi-label scene classification using five architectures, namely VGG16, VGG19, ResNet50, InceptionV3, and Xception, with the ImageNet weights available in the Keras library, and comprehensively compares the performance of the different architectures. Finally, EnsemV3X is presented. The proposed model, with a reduced number of parameters, is superior to the state-of-the-art models Inception and Xception, demonstrating an accuracy of 91%.
APA, Harvard, Vancouver, ISO, and other styles
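EnsemV3X itself is not reproduced here, but the general idea in the entry above — several ImageNet-pretrained Keras backbones contributing to one prediction — is often realized by averaging the branch probabilities. The sketch below is a simplified stand-in; the two-backbone choice, input size, and class count are illustrative assumptions.

```python
# Simplified softmax-averaging ensemble over ImageNet-pretrained backbones (illustrative only).
from tensorflow import keras

NUM_CLASSES = 6  # placeholder number of scene classes

def branch(backbone_fn, preprocess, inputs):
    """Build one frozen ImageNet-pretrained branch with its own classification head."""
    base = backbone_fn(weights="imagenet", include_top=False, pooling="avg",
                       input_shape=(299, 299, 3))
    base.trainable = False
    x = preprocess(inputs)
    x = base(x, training=False)
    return keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)

inputs = keras.Input(shape=(299, 299, 3))
p1 = branch(keras.applications.InceptionV3,
            keras.applications.inception_v3.preprocess_input, inputs)
p2 = branch(keras.applications.Xception,
            keras.applications.xception.preprocess_input, inputs)

# The ensemble prediction is the average of the branch probabilities.
outputs = keras.layers.Average()([p1, p2])
ensemble = keras.Model(inputs, outputs)
ensemble.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```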
4

Manoj Krishna, M., M. Neelima, M. Harshali, and M. Venu Gopala Rao. "Image classification using Deep learning." International Journal of Engineering & Technology 7, no. 2.7 (2018): 614. http://dx.doi.org/10.14419/ijet.v7i2.7.10892.

Full text
Abstract:
Image classification is a classical problem in image processing, computer vision, and machine learning. In this paper we study image classification using deep learning, employing the AlexNet convolutional neural network architecture. Four test images are selected from the ImageNet database for classification. We cropped the images to various portions and conducted experiments. The results show the effectiveness of deep-learning-based image classification using AlexNet.
APA, Harvard, Vancouver, ISO, and other styles
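A present-day version of the experiment in the entry above (classifying cropped test images with AlexNet) can be run with torchvision's pretrained weights. This is a hedged sketch rather than the paper's procedure; the image path and crop box are placeholders, and it assumes torchvision's weights API (v0.13+).

```python
# Classify a (cropped) image with ImageNet-pretrained AlexNet via torchvision (illustrative sketch).
import torch
from PIL import Image
from torchvision import models

weights = models.AlexNet_Weights.IMAGENET1K_V1
model = models.alexnet(weights=weights).eval()
preprocess = weights.transforms()  # resize, center-crop, normalize as expected by AlexNet

img = Image.open("test_image.jpg").convert("RGB")   # placeholder path
img = img.crop((0, 0, 200, 200))                    # placeholder crop region

with torch.no_grad():
    logits = model(preprocess(img).unsqueeze(0))
probs = logits.softmax(dim=1)[0]
top5 = probs.topk(5)
for p, idx in zip(top5.values, top5.indices):
    print(f"{weights.meta['categories'][idx]}: {p:.3f}")
```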
5

Varga, Domonkos. "Multi-Pooled Inception Features for No-Reference Image Quality Assessment." Applied Sciences 10, no. 6 (2020): 2186. http://dx.doi.org/10.3390/app10062186.

Full text
Abstract:
Image quality assessment (IQA) is an important element of a broad spectrum of applications ranging from automatic video streaming to display technology. Furthermore, the measurement of image quality requires a balanced investigation of image content and features. Our proposed approach extracts visual features by attaching global average pooling (GAP) layers to multiple Inception modules of a convolutional neural network (CNN) pretrained on the ImageNet database. In contrast to previous methods, we do not take patches from the input image. Instead, the input image is treated as a whole and is run through a pretrained CNN body to extract resolution-independent, multi-level deep features. As a consequence, our method can be easily generalized to any input image size and pretrained CNNs. Thus, we present a detailed parameter study with respect to the CNN base architectures and the effectiveness of different deep features. We demonstrate that our best proposal—called MultiGAP-NRIQA—is able to outperform the state-of-the-art on three benchmark IQA databases. Furthermore, these results were also confirmed in a cross-database test using the LIVE In the Wild Image Quality Challenge database.
APA, Harvard, Vancouver, ISO, and other styles
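The feature-extraction step described above (GAP layers attached to several Inception modules of an ImageNet-pretrained CNN, applied to the whole image rather than patches) can be sketched in Keras as follows. The tapped layer names ('mixed2', 'mixed7', 'mixed10') are standard InceptionV3 block outputs chosen here for illustration; the paper's exact tap points and the downstream quality-prediction stage are not reproduced.

```python
# Sketch: multi-pooled deep features from several Inception modules of an ImageNet-pretrained CNN.
import numpy as np
from tensorflow import keras

base = keras.applications.InceptionV3(weights="imagenet", include_top=False)

# Attach a global average pooling (GAP) head to a few intermediate Inception blocks.
tap_names = ["mixed2", "mixed7", "mixed10"]           # illustrative choice of modules
gap_outputs = [keras.layers.GlobalAveragePooling2D()(base.get_layer(n).output)
               for n in tap_names]
features = keras.layers.Concatenate()(gap_outputs)    # resolution-independent feature vector
extractor = keras.Model(base.input, features)

# Because only GAP statistics are used, the input image keeps its native resolution.
image = np.random.rand(1, 384, 512, 3).astype("float32")   # stand-in for a test image
image = keras.applications.inception_v3.preprocess_input(image * 255.0)
vec = extractor.predict(image)
print(vec.shape)   # (1, sum of channel counts of the tapped modules)
```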
6

T., Tritva Jyothi Kiran. "Deep Transform Learning Vision Accuracy Analysis on GPU using Tensor Flow." International Journal of Recent Technology and Engineering (IJRTE) 9, no. 3 (2020): 224–27. https://doi.org/10.35940/ijrte.C4402.099320.

Full text
Abstract:
Transfer learning is one of the most amazing concepts in machine learning and AI: a network that has been trained to perform a specific task is reused or repurposed as a starting point for another, similar task. For this work, the ImageNet dataset and the MobileNet model were used to analyse the accuracy of a deep transfer learning model on the GPU of an Intel® Core™ i3-7100U CPU, using TensorFlow 2.0 Hub and Keras. ImageNet is an open, large-scale dataset of images consisting of 1000 classes and over 1.5 million images. The overall idea is to analyse vision accuracy on a very modest hardware configuration. This work reached an accuracy of nearly 100% on the GPU of the Intel® Core™ i3-7100U CPU, which is a strong result given that the datasets used are not easy to handle and contain many classes, which impacts network performance. Classifying and predicting across many classes and large numbers of images on a low-end configuration is genuinely challenging, so a computer vision accuracy of nearly 100% on the GPU is a notable outcome of this work.
APA, Harvard, Vancouver, ISO, and other styles
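The TensorFlow Hub + Keras workflow mentioned above usually looks like the sketch below. It is an assumption-laden illustration: the Hub module URL points to a generic ImageNet-pretrained MobileNetV2 feature vector (the paper's exact module and dataset pipeline are not stated), and the training data are placeholders.

```python
# Sketch: transfer learning with an ImageNet-pretrained MobileNetV2 feature vector from TF Hub.
import tensorflow as tf
import tensorflow_hub as hub

# Assumed Hub module; any MobileNet feature-vector module with 224x224 inputs works the same way.
FEATURE_URL = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4"

model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(224, 224, 3)),
    hub.KerasLayer(FEATURE_URL, trainable=False),    # frozen ImageNet features
    tf.keras.layers.Dense(5, activation="softmax"),  # placeholder: 5 target classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Placeholder batch of images scaled to [0, 1]; replace with a real tf.data pipeline.
images = tf.random.uniform((32, 224, 224, 3))
labels = tf.random.uniform((32,), maxval=5, dtype=tf.int32)
model.fit(images, labels, epochs=1)
```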
7

Chen, Yao-Mei, Yenming J. Chen, Yun-Kai Tsai, Wen-Hsien Ho, and Jinn-Tsong Tsai. "Classification of human electrocardiograms by multi-layer convolutional neural network and hyperparameter optimization." Journal of Intelligent & Fuzzy Systems 40, no. 4 (2021): 7883–91. http://dx.doi.org/10.3233/jifs-189610.

Full text
Abstract:
A multi-layer convolutional neural network (MCNN) with hyperparameter optimization (HyperMCNN) is proposed for classifying human electrocardiograms (ECGs). For performance tests of the HyperMCNN, ECG recordings for patients with cardiac arrhythmia (ARR), congestive heart failure (CHF), and normal sinus rhythm (NSR) were obtained from three PhysioNet databases: the MIT-BIH Arrhythmia Database, the BIDMC Congestive Heart Failure Database, and the MIT-BIH Normal Sinus Rhythm Database, respectively. The MCNN hyperparameters in the convolutional layers included the number of filters, filter size, padding, and filter stride. The hyperparameters in the max-pooling layers were pooling size and pooling stride. The gradient method was also a hyperparameter used to train the MCNN model. A uniform experimental design approach was used to optimize the hyperparameter combination for the MCNN. In performance tests, the resulting 16-layer CNN with an appropriate hyperparameter combination (16-layer HyperMCNN) was used to distinguish among ARR, CHF, and NSR. The experimental results showed that the average correct rate and standard deviation obtained by the 16-layer HyperMCNN were superior to those obtained by a 16-layer CNN with a hyperparameter combination given by Matlab examples. Furthermore, in terms of performance in distinguishing among ARR, CHF, and NSR, the 16-layer HyperMCNN was superior to the 25-layer AlexNet, the neural network with the best image identification performance in the ImageNet Large Scale Visual Recognition Challenge in 2012.
APA, Harvard, Vancouver, ISO, and other styles
8

TİRYAKİ, Volkan Müjdat. "Deep Transfer Learning to Classify Mass and Calcification Pathologies from Screen Film Mammograms." Bitlis Eren Üniversitesi Fen Bilimleri Dergisi 12, no. 1 (2023): 57–65. http://dx.doi.org/10.17798/bitlisfen.1190134.

Full text
Abstract:
The number of breast cancer diagnoses is the highest among all cancers, but breast cancer can be treated if diagnosed early. Mammography is commonly used for detecting abnormalities and diagnosing breast cancer. Breast cancer screening and diagnosis are still performed by radiologists. In the last decade, deep learning has been successfully applied to big image classification databases such as ImageNet, and deep learning methods for automated breast cancer diagnosis are under investigation. In this study, breast cancer mass and calcification pathologies are classified using deep transfer learning methods. A total of 3,360 patches from the Digital Database for Screening Mammography (DDSM) and CBIS-DDSM mammogram databases were used for convolutional neural network training and testing. Transfer learning was applied using ResNet50, Xception, NASNet, and EfficientNet-B7 network backbones. The best classification performance was achieved by the Xception network: on the original CBIS-DDSM test data, an AUC of 0.9317 was obtained for the five-way classification problem. The results are promising for the implementation of automated breast cancer diagnosis.
APA, Harvard, Vancouver, ISO, and other styles
9

Krasteva, Vessela, Todor Stoyanov, Stefan Naydenov, Ramun Schmid, and Irena Jekova. "Detection of Atrial Fibrillation in Holter ECG Recordings by ECHOView Images: A Deep Transfer Learning Study." Diagnostics 15, no. 7 (2025): 865. https://doi.org/10.3390/diagnostics15070865.

Full text
Abstract:
Background/Objectives: The timely and accurate detection of atrial fibrillation (AF) is critical from a clinical perspective. Detecting short or transient AF events is challenging in 24–72 h Holter ECG recordings, especially when symptoms are infrequent. This study aims to explore the potential of deep transfer learning with ImageNet deep neural networks (DNNs) to improve the interpretation of short-term ECHOView images for the presence of AF. Methods: Thirty-second ECHOView images, composed of stacked heartbeat amplitudes, were rescaled to fit the input of 18 pretrained ImageNet DNNs with the top layers modified for binary classification (AF, non-AF). Transfer learning provided both retrained DNNs by training only the top layers (513–2048 trainable parameters) and fine-tuned DNNs by slowly training retrained DNNs (0.38–23.48 M parameters). Results: Transfer learning used 13,536 training and 6624 validation samples from the two leads in the IRIDIA-AF Holter ECG database, evenly split between AF and non-AF cases. The top-ranked DNNs evaluated on 11,400 test samples from independent records are the retrained EfficientNetV2B1 (96.3% accuracy with minimal inter-patient (1%) and inter-lead (0.3%) drops), and fine-tuned EfficientNetV2B1 and DenseNet-121, -169, -201 (97.2–97.6% accuracy with inter-patient (1.4–1.6%) and inter-lead (0.5–1.2%) drops). These models can process shorter ECG episodes with a tolerable accuracy drop of up to 0.6% for 20 s and 4–15% for 10 s. Case studies present the GradCAM heatmaps of retrained EfficientNetV2B1 overlaid on raw ECG and ECHOView images to illustrate model interpretability. Conclusions: In an extended deep transfer learning study, we validate that ImageNet DNNs applied to short-term ECHOView images through retraining and fine-tuning can significantly enhance automated AF diagnoses. GradCAM heatmaps provide meaningful model interpretability, highlighting ECG regions of interest aligned with cardiologist focus.
APA, Harvard, Vancouver, ISO, and other styles
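The entry above distinguishes 'retrained' DNNs (only the new top layers are trained, a few hundred to a few thousand parameters) from 'fine-tuned' DNNs (most weights are slowly updated). A generic Keras illustration of that distinction is sketched below; the input size, backbone, and binary head are assumptions, and the ECHOView image construction is not reproduced.

```python
# Sketch: "retrain top layers only" vs. "fine-tune" on an ImageNet-pretrained backbone.
from tensorflow import keras

base = keras.applications.EfficientNetV2B1(weights="imagenet", include_top=False,
                                            pooling="avg", input_shape=(240, 240, 3))
base.trainable = False                          # retraining mode: backbone frozen

inputs = keras.Input(shape=(240, 240, 3))       # assumed input size for rescaled images
x = base(inputs, training=False)
outputs = keras.layers.Dense(1, activation="sigmoid")(x)   # binary AF / non-AF output
model = keras.Model(inputs, outputs)

print("trainable parameters (retrained head only):",
      sum(int(keras.backend.count_params(w)) for w in model.trainable_weights))

# Fine-tuning mode: slowly train the whole network starting from the retrained weights.
base.trainable = True
model.compile(optimizer=keras.optimizers.Adam(1e-5), loss="binary_crossentropy")
print("trainable parameters (fine-tuned):",
      sum(int(keras.backend.count_params(w)) for w in model.trainable_weights))
```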
10

Li, Fuqiang, Tongzhuang Zhang, Yong Liu, and Feiqi Long. "Deep Residual Vector Encoding for Vein Recognition." Electronics 11, no. 20 (2022): 3300. http://dx.doi.org/10.3390/electronics11203300.

Full text
Abstract:
Vein recognition has been drawing more attention recently because it is highly secure and reliable for practical biometric applications. However, underlying issues such as uneven illumination, low contrast, and sparse patterns with high inter-class similarities make the traditional vein recognition systems based on hand-engineered features unreliable. Recent successes of convolutional neural networks (CNNs) for large-scale image recognition tasks motivate us to replace the traditional hand-engineered features with the superior CNN to design a robust and discriminative vein recognition system. To address the difficulty of direct training or fine-tuning of a CNN with existing small-scale vein databases, a new knowledge transfer approach is formulated using pre-trained CNN models together with a training dataset (e.g., ImageNet) as a robust descriptor generation machine. With the generated deep residual descriptors, a very discriminative model, namely deep residual vector encoding (DRVE), is proposed by a hierarchical design of dictionary learning, coding, and classifier training procedures. Rigorous experiments are conducted with a high-quality hand-dorsa vein database, and superior recognition results compared with state-of-the-art models fully demonstrate the effectiveness of the proposed models. An additional experiment with the PolyU multispectral palmprint database is designed to illustrate the generalization ability.
APA, Harvard, Vancouver, ISO, and other styles

Dissertations / Theses on the topic "ImageNet Database"

1

Trevino, Hector Guillermo 1965. "ImageNet communications and user interface for a distributed color image database." Thesis, The University of Arizona, 1991. http://hdl.handle.net/10150/277925.

Full text
Abstract:
High speed networking technology has evolved tremendously over the past few years. As a result, network applications that require the transfer of large amounts of data over large geographical areas are now possible using fiber-optic networks. One such application is the transfer of color image data at the regional, national, and international levels. ImageNet is a distributed color image database system with multiple database nodes and user workstations linked by a communications network. Each database node serves a number of user workstations within a predefined region. The database nodes are interconnected by a high-speed, point-to-point fiber-optic network, whereas the workstations communicate with the database nodes through a serial link. The work presented here is the design and implementation of the user interface and communication software for a workstation. For development and prototyping purposes, this software was designed to run over an Ethernet network. The results obtained showed that the user workstation software provided the required functionality: we were able to make data dictionary requests, formulate queries, make single and multiple online as well as offline image transfers, and display images.
APA, Harvard, Vancouver, ISO, and other styles
2

Zampieri, Carlos Elias Arminio. "Recuperação de imagens multiescala intervalar." [s.n.], 2010. http://repositorio.unicamp.br/jspui/handle/REPOSIP/275789.

Full text
Abstract:
Advisor: Jorge Stolfi. Master's dissertation, Universidade Estadual de Campinas, Instituto de Computação, 2010. We present a general method for content-based image retrieval (CBIR) in large image collections, using multiscale interval distance estimation. We consider specifically queries by example, where the goal is to find the image in the collection that is closest to a given image according to some image distance function. In this work we do not aim to develop metrics that best match the user's intentions; instead, assuming that the metric has been chosen, we describe an algorithm (which we call MuSIS, for MultiScale Image Search) to perform the search efficiently using interval arithmetic. Interval estimates of the image distances are used to quickly discard candidate images after examining only small versions of them, in a manner similar to the branch-and-bound optimization paradigm. As part of this work, we developed effective interval estimators for the Euclidean distance and for some variations of it, including metrics that are sensitive to the gradient at various scales. Experiments indicate that the method yields significant cost savings over exhaustive search. Although less efficient than other methods commonly used for CBIR, the MuSIS algorithm always returns the exact answer - that is, the nearest image under the chosen metric - and not just an approximation thereof. The MuSIS approach is compatible with a wide variety of distance functions without the need to pre-compute or store specific descriptors for each function.
APA, Harvard, Vancouver, ISO, and other styles
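The pruning idea in the MuSIS abstract above — discard a candidate as soon as an interval lower bound on its distance, computed from a reduced version, exceeds the best upper bound found so far — can be illustrated generically. The interval estimator below uses simple block statistics and is only a crude stand-in for the thesis's estimators; the image sizes, block sizes, and toy data are assumptions.

```python
# Sketch of MuSIS-style pruning: interval bounds from block statistics discard candidates
# before any exact distance is computed. Images are assumed to be same-sized 2-D float
# arrays whose dimensions are divisible by every block size used.
import numpy as np

def block_stats(img, k):
    """Per-block mean, min, and max over k x k blocks of a 2-D image."""
    h, w = img.shape
    blocks = img.reshape(h // k, k, w // k, k).swapaxes(1, 2)
    return blocks.mean(axis=(2, 3)), blocks.min(axis=(2, 3)), blocks.max(axis=(2, 3))

def interval_euclidean(a, b, k):
    """Valid (lower, upper) bounds on the Euclidean distance using k x k block statistics."""
    am, amin, amax = block_stats(a, k)
    bm, bmin, bmax = block_stats(b, k)
    lo_sq = (k * k) * np.square(am - bm)              # Jensen: block-mean gap bounds from below
    worst = np.maximum(amax - bmin, bmax - amin)      # largest possible per-pixel gap per block
    hi_sq = (k * k) * np.square(worst)
    return float(np.sqrt(lo_sq.sum())), float(np.sqrt(hi_sq.sum()))

def musis_like_search(query, collection, block_sizes=(16, 4, 1)):
    """Return the index of the nearest image under the Euclidean metric (exact at k = 1)."""
    best_idx, best_upper = None, float("inf")
    alive = list(range(len(collection)))
    for k in block_sizes:                             # coarse -> fine resolutions
        bounds = {i: interval_euclidean(query, collection[i], k) for i in alive}
        for i, (_, hi) in bounds.items():             # tighten the incumbent upper bound
            if hi < best_upper:
                best_idx, best_upper = i, hi
        alive = [i for i in alive if bounds[i][0] <= best_upper]   # prune by lower bound
    return best_idx

# Tiny usage example with random 64 x 64 "images".
rng = np.random.default_rng(0)
db = [rng.random((64, 64)) for _ in range(50)]
q = db[7] + 0.01 * rng.random((64, 64))
print(musis_like_search(q, db))   # expected to print 7
```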
3

Bergamasco, Leila Cristina Carneiro. "Recuperação de imagens cardiacas tridimensionais por conteúdo." Universidade de São Paulo, 2013. http://www.teses.usp.br/teses/disponiveis/100/100131/tde-23092013-152421/.

Full text
Abstract:
Three-dimensional models provide a more complete view of the objects analyzed by considering their depth. Given the growth of three-dimensional models currently available in the health area, it is necessary to implement efficient query mechanisms that offer alternative ways to locate cases of patients with certain characteristics. Providing a history of images similar to those in a patient's exam can aid diagnosis by offering similar clinical cases. This project aimed to develop techniques for retrieving three-dimensional medical images based on their content and to apply them in the medical context, specifically in cardiology. It intended to contribute to the detection of anomalies by making similar clinical cases available through a prototype query system. To achieve the proposed objectives, the following phases were carried out: literature review, definition of the database, implementation of extractors and similarity functions, construction of a retrieval-system prototype, tests with medical images, and analysis of the results. The results obtained with the developed methods were positive, in some tests reaching 90% precision in the search results. Descriptors that took into account the spatial information of the deformations obtained better results than methods that analyzed the models from a global perspective. These results confirmed the potential of content-based retrieval in the medical context to assist in composing diagnoses, as well as contributing to the computing field through the development of content-based retrieval techniques for the domain of three-dimensional models.
APA, Harvard, Vancouver, ISO, and other styles
4

Santos, Marcelo dos. "Ambiente para avaliação de algoritmos de processamento de imagens médicas." Universidade de São Paulo, 2006. http://www.teses.usp.br/teses/disponiveis/3/3142/tde-19042007-165507/.

Full text
Abstract:
Constantly, a variety of new image processing methods is presented to the community, yet few of them have proved useful in clinical routine. Analyzing and comparing different algorithms, methods, and applications through sound testing is an essential part of qualifying an algorithm's design. However, it is usually very difficult to compare the performance and adequacy of different algorithms in the same way, mainly because of the difficulty of assessing the software exhaustively, or at least of testing it on a comprehensive and diverse set of clinical cases. Several areas, such as software development, image processing, and medical training, need a diverse and comprehensive dataset of images and related information. Such datasets can be used to develop, test, and evaluate new medical software using public data. This work presents the development of a free, online, multipurpose, and multimodality medical image database environment. The environment, implemented as a distributed medical image database, stores medical images with acquisition information, reports, image processing software, gold standards, and post-processed images. The environment also implements a peer-review model that assures the quality of all datasets. As an example of its feasibility and ease of use, the evaluation of two categories of medical image processing methods, segmentation and compression, is presented. In addition, the use of the proposed set of applications in other activities, such as the HC-FMUSP digital teaching file, demonstrates the robustness of the proposed architecture and its applicability to different purposes.
APA, Harvard, Vancouver, ISO, and other styles
5

Fedel, Gabriel de Souza. "Busca multimodal para apoio à pesquisa em biodiversidade." [s.n.], 2011. http://repositorio.unicamp.br/jspui/handle/REPOSIP/275751.

Full text
Abstract:
Advisor: Cláudia Maria Bauzer Medeiros. Master's dissertation, Universidade Estadual de Campinas, Instituto de Computação, 2011. Research on computing applied to biodiversity presents several challenges, ranging from massive volumes of highly heterogeneous data to the variety of user profiles. This kind of scenario requires versatile data retrieval and management tools. Available tools are still limited: most often they only consider textual data and do not take advantage of the multiple data types available, such as images or sounds. This dissertation analyzes the problems of processing multimodal queries that involve both text and images as search parameters in the biodiversity domain, specifying and implementing a set of tools to process such queries, validated with real data from Unicamp's Zoology Museum. The contributions also include the construction of a taxonomic ontology associated with common species names and support for two user profiles (experts and non-experts). These features extend the scope of the queries currently available in biodiversity information systems. This research is part of the Bio-CORE project, a partnership between computing and biology researchers to create computational tools to support research in biodiversity.
APA, Harvard, Vancouver, ISO, and other styles
6

Almeida, Junior Jurandy Gomes de 1983. "Recuperação de imagens por cor utilizando analise de distribuição discreta de caracteristicas." [s.n.], 2007. http://repositorio.unicamp.br/jspui/handle/REPOSIP/276206.

Full text
Abstract:
Advisors: Siome Klein Goldenstein, Ricardo da Silva Torres. Master's dissertation, Universidade Estadual de Campinas, Instituto de Computação, 2007. Advances in data storage, data transmission, and image acquisition have enabled the creation of large image datasets. This has spurred great interest in systems that are able to efficiently retrieve images from these collections, a task addressed by the so-called content-based image retrieval (CBIR) systems. In these systems, image content is represented by low-level features such as color, shape, and texture. An ideal CBIR system should be effective and efficient. Effectiveness comes from abstract image representations; in general, traditional approaches to this process often fail in the presence of different illumination, occlusion, and viewpoint conditions. Efficiency, on the other hand, comes from the organization given to these representations; data clustering approaches are among the most useful techniques to reduce the search space and speed up query processing. To address effectiveness, this work presents SIFT-Texton, a new method that incorporates illumination, occlusion, and viewpoint conditions into low-level features, based on discrete distributions of local invariant features and low-level image properties. With regard to efficiency, this work presents DAH-Cluster, a new clustering paradigm applied to CBIR that combines features of the divisive and agglomerative hierarchical clustering paradigms. In addition, DAH-Cluster introduces a new concept, called the reclustering factor, which allows grouping similar elements that would be separated by traditional clustering paradigms. Experiments show that the combination of these techniques yields a robust CBIR mechanism, achieving more effective and more efficient results than traditional approaches in the literature. The main contributions of this work are: (1) a new method for image retrieval that incorporates illumination, occlusion, and viewpoint conditions into low-level features; and (2) a new data clustering paradigm that can be applied to information retrieval tasks.
APA, Harvard, Vancouver, ISO, and other styles
7

Souza, Gabriel Gustavo Barros de [UNESP]. "Proposta de atualização de cadastro urbano a partir de detecção de alterações em imagens QUICK BIRD tomadas em diferentes épocas." Universidade Estadual Paulista (UNESP), 2009. http://hdl.handle.net/11449/86780.

Full text
Abstract:
Urban cadastre updating is one of the most important issues to be considered in municipal planning. Because urban areas contain a far greater wealth of detail than rural and urban-expansion areas, it is difficult to define a cadastral data updating methodology that can be generalized to the urban areas of all municipalities, both in terms of method and in terms of the needs and realities to be reflected in the Cadastre. This work presents an approach for updating the urban-area cadastre using high-spatial-resolution satellite imagery (QuickBird). Several methods and techniques are employed in processing the adopted images, which cover a test area defined in the municipality of Presidente Prudente. Panchromatic and multispectral images from different dates were used to detect the changes to be updated in the cadastral database, and image classification techniques were employed to identify and visually describe the types of altered targets. In accordance with an adopted threshold, the changes identified from the images and the described processes were updated in the cadastral database. The implications of the adopted sequence are presented and discussed in the chapters of this work.
APA, Harvard, Vancouver, ISO, and other styles
8

Rodrigues, Silvia Cristina Martini. "Organização automática de bancos de mamografias no padrão de densidade BI-RADS." Universidade de São Paulo, 2004. http://www.teses.usp.br/teses/disponiveis/18/18133/tde-11112015-152323/.

Full text
Abstract:
This work presents a computational method that classifies mammograms into the BI-RADS breast-density patterns, aiming to support the early detection of breast cancer, whether performed by visual analysis or with computerized aid. Classifying mammograms into standardized databases aims to eliminate conflicts between the mammographic reports of different professionals and about the medical conduct to be followed. However, building such databases visually, and especially at different times, makes standardization difficult, yielding a very subjective and relatively coarse classification as a consequence of the large inter- and intra-observer variability. The developed method classifies the images independently of the subjectivity inherent in the visual observation of whoever organized the database and of the X-ray exposure technique used. The results were above 92% even for completely different image databases, and they were obtained while respecting possible differences of interpretation among different medical teams. Besides establishing mammography databases with well-quantified thresholds between the density categories, the tool allows trainees to be trained to classify images into the BI-RADS density patterns, respecting local particularities, and allows CAD results to be compared.
APA, Harvard, Vancouver, ISO, and other styles
9

Matheus, Bruno Roberto Nepomuceno. "BancoWeb: base de imagens mamográficas para auxílio em avaliações de esquemas CAD." Universidade de São Paulo, 2010. http://www.teses.usp.br/teses/disponiveis/18/18152/tde-24062010-155737/.

Full text
Abstract:
This work aimed to develop an online mammographic image database with public access for the development, testing, and comparative evaluation of computer-aided diagnosis (CAD) schemes. The database contains images from several hospitals with a wide variety of reports, which are also available in the database together with (non-confidential) clinical information about the patients. A detailed interface was created to allow easy public access, offering tools for searching, cropping, statistical analysis, and remote image insertion, among others. Comparative tests with existing, widely used databases showed that the developed database has image quantity and quality comparable or superior to the others, besides offering a much larger set of tools.
APA, Harvard, Vancouver, ISO, and other styles
10

Souza, Luiz Eduardo Christovam de. "Organização e armazenamento de imagens multitemporais georreferenciadas para suporte ao processo de detecção de mudanças /." Presidente Prudente, 2018. http://hdl.handle.net/11449/180729.

Full text
Abstract:
Advisor: Maria de Lourdes Bueno Trindade Galo. Nowadays the size of datasets has been reaching levels never seen before, mainly due to new sensors and the widespread use of the internet, with web 2.0 and social media. Among the various types of sensors, imaging sensors, mainly carried by satellites, have produced big Earth-observation datasets. Regular Earth observation by satellites makes it possible to monitor land use/cover change (LUCC). However, many studies related to LUCC use only small fragments of the huge existing datasets, essentially because there is still a scientific and technological gap in the organization, storage, analysis, and representation of big Earth-observation data. Therefore, this research defined a structure for the organization, storage, and retrieval of spatio-temporal data to support land-cover change detection. The chosen application was the analysis of Normalized Difference Vegetation Index (NDVI) time series derived from images acquired between 1984 and 2017 by the Thematic Mapper (TM), Enhanced Thematic Mapper Plus (ETM+), and Operational Land Imager (OLI) sensors for the region of Porto Velho, Rondônia. An NDVI time series was built for the position of each pixel in the study area. Reference regions were defined to retrieve reference time series describing the land-cover types and the change classes (anthropic and natural). The Fast Dynamic Time Warping (FastDTW) algorithm was used to measure the similarity between the time series to be classified and the reference series. To find the time series clas... (complete abstract available via the electronic access below)
APA, Harvard, Vancouver, ISO, and other styles
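Two computational ingredients of the pipeline above, the NDVI itself and the FastDTW similarity between a pixel's series and reference series, can be sketched as follows. The sketch assumes the fastdtw Python package and synthetic reflectance values; it is not the thesis's code, and the handling of TM/ETM+/OLI bands is omitted.

```python
# Sketch: NDVI time series + FastDTW nearest-reference classification (illustrative only).
import numpy as np
from fastdtw import fastdtw

def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red), computed per observation."""
    return (nir - red) / (nir + red + 1e-12)

# Synthetic stand-ins: reflectance for one pixel across 30 acquisition dates.
rng = np.random.default_rng(1)
nir, red = rng.uniform(0.2, 0.6, 30), rng.uniform(0.05, 0.2, 30)
pixel_series = ndvi(nir, red)

# Hypothetical reference series describing land-cover / change classes.
references = {
    "natural":          np.full(30, 0.8) + rng.normal(0, 0.02, 30),
    "anthropic_change": np.linspace(0.8, 0.2, 30) + rng.normal(0, 0.02, 30),
}

# Assign the pixel to the reference class with the smallest DTW distance.
distances = {name: fastdtw(pixel_series, ref, dist=lambda a, b: abs(a - b))[0]
             for name, ref in references.items()}
print(min(distances, key=distances.get), distances)
```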

Books on the topic "ImageNet Database"

1

Paglen, Trevor, and Barbican Art Gallery, eds. Trevor Paglen: From 'Apple' to 'Anomaly' : selections from the ImageNet database for object recognition. Barbican, 2019.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
2

Deb, Sagarmay. Multimedia Systems and Content-Based Image Retrieval. Information Science Publishing, 2003.

Find full text
APA, Harvard, Vancouver, ISO, and other styles

Book chapters on the topic "ImageNet Database"

1

He, Biao, Dongming Zhang, and Zili Li. "Tunnel ImageNet: A comprehensive annotated image database of tunnel defects for structural condition maintenance." In Tunnelling into a Sustainable Future – Methods and Technologies. CRC Press, 2025. https://doi.org/10.1201/9781003559047-532.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Alaeddine, Hmidi, and Malek Jihene. "A Comparative Study of Popular CNN Topologies Used for Imagenet Classification." In Deep Neural Networks for Multimodal Imaging and Biomedical Applications. IGI Global, 2020. http://dx.doi.org/10.4018/978-1-7998-3591-2.ch007.

Full text
Abstract:
Deep learning is a relatively modern area that has become a very important key in various fields such as computer vision, with a trend of rapid, exponential growth as the amount of data keeps increasing. Since the introduction of AlexNet, the evolution of image analysis, recognition, and classification has become increasingly rapid and capable of replacing conventional algorithms used in vision tasks. This study focuses on the evolution (depth, width, multiple paths) presented in deep CNN architectures that are trained on the ImageNet database. In addition, an analysis of different characteristics of existing topologies is detailed in order to extract the various strategies used to obtain better performance.
APA, Harvard, Vancouver, ISO, and other styles
3

Dandotiya, Monika, and Madhukar Dubey. "A VGG-16 Framework for an Efficient Indoor-Outdoor." In SCRS CONFERENCE PROCEEDINGS ON INTELLIGENT SYSTEMS. Soft Computing Research Society, 2021. http://dx.doi.org/10.52458/978-93-91842-08-6-32.

Full text
Abstract:
Computer vision has reached a level that allows robots to leave the confines of laboratories and explore the outside world, yet even with progress in this area robots still struggle to understand their location. Scene classification is an important step in scene understanding, and it can be used in many applications such as surveillance cameras, self-driving vehicles, household robots, and database imaging systems; surveillance cameras are now installed everywhere. The accuracy of existing indoor-outdoor scene classification techniques is weak. Using the VGG-16 convolutional neural network model, this study attempts to improve that accuracy and presents a new method for classifying images into classes with VGG-16. The algorithm's outputs are validated on the SUN397 indoor-outdoor dataset, and the outcomes demonstrate that the suggested methodology outperforms existing technologies for indoor-outdoor scene classification. In this paper we implement "Very Deep Convolutional Networks for Large-Scale Image Recognition" (VGG-16). On ImageNet, a dataset of over 14 million images belonging to 1000 classes, the model achieves 92.7 percent top-5 test accuracy. It outperforms AlexNet by sequentially replacing large kernel-sized filters (11 and 5 in the first and second convolutional layers, respectively) with multiple 3x3 kernel-sized filters. We attain a training loss of 10 percent and a training accuracy of 96 percent in our proposed work.
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "ImageNet Database"

1

Deng, Jia, Wei Dong, R. Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. "ImageNet: A large-scale hierarchical image database." In 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2009. http://dx.doi.org/10.1109/cvprw.2009.5206848.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Deng, Jia, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. "ImageNet: A large-scale hierarchical image database." In 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR Workshops). IEEE, 2009. http://dx.doi.org/10.1109/cvpr.2009.5206848.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Souza, Victor, Luan Silva, Adam Santos, and Leandro Araújo. "Análise Comparativa de Redes Neurais Convolucionais no Reconhecimento de Cenas." In Computer on the Beach. Universidade do Vale do Itajaí, 2020. http://dx.doi.org/10.14210/cotb.v11n1.p419-426.

Full text
Abstract:
This paper aims to compare the convolutional neural networks (CNNs) ResNet50, InceptionV3, and InceptionResNetV2, tested with and without pre-trained weights on the ImageNet database, in order to solve the scene recognition problem. The results showed that the pre-trained ResNet50 achieved the best performance, with an average accuracy of 99.82% in training and 85.53% in testing, while the worst result was attributed to the ResNet50 without pre-training, with 88.76% and 71.66% average accuracy in training and testing, respectively. The main contribution of this work is the direct comparison between CNNs widely applied in the literature, in order to enable a better selection of algorithms for the various scene recognition applications.
APA, Harvard, Vancouver, ISO, and other styles
4

Barcellos, William, and Adilson Gonzaga. "Periocular authentication in smartphones applying uLBP descriptor on CNN Feature Maps." In Workshop de Visão Computacional. Sociedade Brasileira de Computação - SBC, 2021. http://dx.doi.org/10.5753/wvc.2021.18890.

Full text
Abstract:
The outputs of CNN layers, called Activations, are composed of Feature Maps, which show textural information that can be extracted by a texture descriptor. Standard CNN feature extraction uses Activations as feature vectors for object recognition. The goal of this work is to evaluate a new methodology of CNN feature extraction. In this paper, instead of using the Activations as a feature vector, we use a CNN as a feature extractor, and then we apply a texture descriptor directly on the Feature Maps. Thus, we use the features extracted by the texture descriptor as a feature vector for authentication. To evaluate our proposed method, we use the AlexNet CNN previously trained on the ImageNet database as a feature extractor; then we apply the uniform LBP (uLBP) descriptor on the Feature Maps for texture extraction. We tested our proposed method on the VISOB dataset, composed of periocular images taken with 3 different smartphones under 3 different lighting conditions. Our results show that the use of a texture descriptor on CNN Feature Maps achieves better performance than computer vision handcrafted methods or even standard CNN feature extraction.
APA, Harvard, Vancouver, ISO, and other styles
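The pipeline described above (use a pretrained AlexNet as a feature extractor, then compute a uniform LBP descriptor on the feature maps) can be sketched with torchvision and scikit-image. The tapped layer, LBP parameters, and histogram pooling below are illustrative assumptions, not the paper's exact settings.

```python
# Sketch: uniform LBP (uLBP) histograms computed on AlexNet feature maps (illustrative settings).
import numpy as np
import torch
from skimage.feature import local_binary_pattern
from torchvision import models

P, R = 8, 1                                   # assumed uLBP neighbourhood parameters
N_BINS = P + 2                                # number of uniform LBP codes for P = 8

alexnet = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).eval()
conv_body = alexnet.features                  # convolutional part used as feature extractor

image = torch.rand(1, 3, 224, 224)            # stand-in for a preprocessed periocular image
with torch.no_grad():
    fmaps = conv_body(image)[0].numpy()       # (channels, H, W) activations of the last conv block

# One uLBP histogram per feature map, concatenated into the final feature vector.
histograms = []
for fmap in fmaps:
    codes = local_binary_pattern(fmap, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=N_BINS, range=(0, N_BINS), density=True)
    histograms.append(hist)
feature_vector = np.concatenate(histograms)
print(feature_vector.shape)                   # channels x (P + 2) texture features
```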
5

Albuquerque, Amanda Cristina Fraga de, and Helyane Bronoski Borges. "Evaluation of Deep Learning Transfer Techniques for Mangrove Segmentation with Images of the Sentinel-2A." In Anais Estendidos da Conference on Graphics, Patterns and Images. Sociedade Brasileira de Computação - SBC, 2024. https://doi.org/10.5753/sibgrapi.est.2024.31659.

Full text
Abstract:
Fine-tuning techniques allow the use of weights from pre-trained networks in other models across different contexts, potentially improving training performance since they generally require fewer computational resources and less data. Fine-tuning has become more widespread in the natural domain (RGB) with the availability of pre-trained model weights from the ImageNet database. However, pre-trained models in the same domain are not readily available for the remote sensing domain, such as mangrove identification. Both nationally and in the state of Paraná, there are few studies employing deep learning for mangrove segmentation. Developing models using deep transfer learning can help establish automated monitoring systems. Thus, this study evaluated fine-tuning techniques for mangrove segmentation in Paraná using the U-Net model with encoders pre-trained both in the same domain (remote sensing) and in the natural domain. The dataset for training the U-Net was generated using bands from the Sentinel-2A satellite and annotations from the MapBiomas project maps. The fine-tuned networks discussed in this study accurately identified mangroves in Paraná, all achieving accuracies above 95.1% and F-scores greater than 92.6%.
APA, Harvard, Vancouver, ISO, and other styles
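One common way to set up the encoder transfer described above is the segmentation_models_pytorch package, which builds a U-Net around an ImageNet-pretrained encoder. The sketch below is an assumption-based illustration: the study's actual framework, encoder choice, and the number of Sentinel-2A bands fed to the network are not specified here.

```python
# Sketch: U-Net with an ImageNet-pretrained encoder for binary (mangrove / background) segmentation.
import torch
import segmentation_models_pytorch as smp

model = smp.Unet(
    encoder_name="resnet34",        # assumed encoder; any supported backbone works the same way
    encoder_weights="imagenet",     # fine-tuning starts from ImageNet (natural-domain) weights
    in_channels=4,                  # assumed number of Sentinel-2A bands fed to the network
    classes=1,                      # single mangrove mask
)

loss_fn = smp.losses.DiceLoss(mode="binary")
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One illustrative training step on random stand-in patches.
x = torch.rand(2, 4, 256, 256)
y = (torch.rand(2, 1, 256, 256) > 0.5).float()
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
print(float(loss))
```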
6

Mikhalevich, Yurij. "CLIP-Based Search Engine for Retrieval of Label-Free Images Using a Text Query." In 10th International Conference on Human Interaction and Emerging Technologies (IHIET 2023). AHFE International, 2023. http://dx.doi.org/10.54941/ahfe1004021.

Full text
Abstract:
In January 2021, OpenAI released the Contrastive Language-Image Pre-Training (CLIP) model, able to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the Internet. This model enables researchers to use natural language to reference learned visual concepts (or describe new ones), enabling the zero-shot transfer of the model to downstream tasks. One of the possible applications of CLIP is to look up images using natural language queries. This application is especially important in the context of the constantly growing amount of visual information created by people. This paper explores the application of the CLIP model to the image search problem. It proposes a practical and scalable implementation of image search featuring a cache layer powered by the SQLite 3 relational database management system (RDBMS) to enable performant repeated image searches. The method allows efficient image retrieval using a text query when searching large image datasets. The method achieves 32.27% top-1 accuracy on the ImageNet-1k 1.28 million images train set and 55.15% top-1 accuracy on the CIFAR-100 10 thousand images test set. When applying the method, the image indexing time scales linearly with the number of images, and the image search time increases only slightly. Indexing 50,000 images on an Apple M1 Max CPU takes 19 minutes and 24 seconds, while indexing 1,281,167 images on the same CPU takes 8 hours, 31 minutes, and 26 seconds. A query over 50,000 images on an Apple M1 Max CPU executes in 4 seconds, while the same query over 1,281,167 images on the same CPU executes in 11 seconds.
APA, Harvard, Vancouver, ISO, and other styles
7

Almondes, Camila Catiely de Sá, and Flávio Henrique Duarte de Araújo. "Análise da Segmentação e Extração de Características na Detecção de COVID-19 em imagens de Raio-x de Tórax." In Encontro Unificado de Computação do Piauí. Sociedade Brasileira de Computação, 2021. http://dx.doi.org/10.5753/enucompi.2021.17747.

Full text
Abstract:
COVID-19 mainly affects the lungs, causing shortness of breath, coughing, and even multiple organ failure, leaving people severely ill. Chest radiography becomes necessary for testing and for assessing the lungs and the progression of the virus and its effects. This work presents an evaluation of the DenseNet201, VGG16, ResNet50, and Xception descriptors and the Multi-layer Perceptron (MLP) and Random Forest (RF) classifiers, using the COVID-19 chest x-ray database for COVID-19 diagnosis. The Tuberculosis (TB) Chest X-ray Database was used to evaluate the segmentation. Tests were performed on a set of segmented images containing 6,012 images of lung infections, 3,616 of COVID-19, and 10,192 with no findings. The evaluated scenarios were (COVID x Normal), (COVID x Lung Opacity), and (COVID x Lung Opacity x Normal). The best results were achieved with the DenseNet201 descriptor and the MLP classifier in the (COVID x Normal) scenario, with Accuracy and Kappa of 0.99.
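The descriptor-plus-classifier pipeline the abstract evaluates can be sketched as follows: an ImageNet-pretrained CNN used as a fixed feature extractor feeding a classical classifier. Image sizes, classifier settings, and the dummy data below are assumptions for illustration only.

```python
# Sketch: DenseNet201 features + MLP / Random Forest classification.
import numpy as np
import tensorflow as tf
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier

# DenseNet201 without its classification head; global average pooling turns each
# chest X-ray into a 1920-dimensional feature vector.
extractor = tf.keras.applications.DenseNet201(
    weights="imagenet", include_top=False, pooling="avg"
)

def extract_features(images: np.ndarray) -> np.ndarray:
    """images: float array of shape (n, 224, 224, 3), already resized."""
    x = tf.keras.applications.densenet.preprocess_input(images.copy())
    return extractor.predict(x, verbose=0)

# Dummy data standing in for the segmented X-ray images and their labels.
X_train = np.random.rand(8, 224, 224, 3) * 255
y_train = np.random.randint(0, 2, size=8)   # e.g. COVID vs. Normal

features = extract_features(X_train)
mlp = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500).fit(features, y_train)
rf = RandomForestClassifier(n_estimators=200).fit(features, y_train)
```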
APA, Harvard, Vancouver, ISO, and other styles
8

Costa, Leonardo, Caio Menezes, Antony Santos, Matheus Araújo, and Gustavo Campos. "A comparative study involving classifiers and dimensionality reduction techniques applied to facial recognition." In Encontro Nacional de Inteligência Artificial e Computacional. Sociedade Brasileira de Computação - SBC, 2019. http://dx.doi.org/10.5753/eniac.2019.9338.

Full text
Abstract:
This work presents a comparative study of the Eigenfaces and Fisherfaces techniques combined with the KNN, SVM, and MLP classifiers. Eigenfaces and Fisherfaces were used to project the images of the AT&T (The Database of Faces) and Extended Yale databases into a new space in order to reduce the dimensionality of the data. The aforementioned classifiers then used the projected data for training and for subsequent identification of the classes of the test data. The results were quite promising in both cases, but the MLP neural network combined with the Fisherfaces technique achieved the best results.
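A compact sketch of the compared pipelines, using scikit-learn, is shown below: PCA for Eigenfaces and LDA for Fisherfaces, each followed by a KNN, SVM, or MLP classifier. Component counts and classifier settings are assumptions; the Olivetti faces loader bundled with scikit-learn corresponds to the AT&T Database of Faces and is used here as readily available data.

```python
# Sketch: Eigenfaces (PCA) and Fisherfaces (LDA) projections feeding classifiers.
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score
from sklearn.datasets import fetch_olivetti_faces  # the AT&T Database of Faces

faces = fetch_olivetti_faces()          # flattened 64x64 grayscale face images
X, y = faces.data, faces.target

reducers = {"eigenfaces": PCA(n_components=100, whiten=True),
            "fisherfaces": LinearDiscriminantAnalysis()}
classifiers = {"knn": KNeighborsClassifier(n_neighbors=3),
               "svm": SVC(kernel="rbf", C=10),
               "mlp": MLPClassifier(hidden_layer_sizes=(256,), max_iter=1000)}

for r_name, reducer in reducers.items():
    for c_name, clf in classifiers.items():
        pipe = make_pipeline(reducer, clf)
        score = cross_val_score(pipe, X, y, cv=3).mean()
        print(f"{r_name} + {c_name}: {score:.3f}")
```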
APA, Harvard, Vancouver, ISO, and other styles
9

Zhuo, Li'an, Baochang Zhang, Hanlin Chen, et al. "CP-NAS: Child-Parent Neural Architecture Search for 1-bit CNNs." In Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence {IJCAI-PRICAI-20}. International Joint Conferences on Artificial Intelligence Organization, 2020. http://dx.doi.org/10.24963/ijcai.2020/144.

Full text
Abstract:
Neural architecture search (NAS) proves to be among the best approaches for many tasks by generating application-adaptive neural architectures, but it is still challenged by high computational cost and memory consumption. At the same time, 1-bit convolutional neural networks (CNNs) with binarized weights and activations show their potential for resource-limited embedded devices. A natural approach is to use 1-bit CNNs to reduce the computation and memory cost of NAS by combining the strengths of both in a unified framework. To this end, a Child-Parent model is introduced into a differentiable NAS to search for the binarized architecture (Child) under the supervision of a full-precision model (Parent). In the search stage, the Child-Parent model uses an indicator derived from the accuracies of the parent and child models to evaluate performance and abandon operations with less potential. In the training stage, a kernel-level CP loss is introduced to optimize the binarized network. Extensive experiments demonstrate that the proposed CP-NAS achieves accuracy comparable to traditional NAS on both the CIFAR and ImageNet databases. It achieves 95.27% accuracy on CIFAR-10 and 64.3% on ImageNet with binarized weights and activations, with a 30% faster search than prior art.
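As a hedged illustration of the 1-bit CNN ingredient the abstract builds on (binarized weights and activations with a straight-through estimator for gradients), a generic PyTorch sketch follows; it is not the CP-NAS search procedure itself.

```python
# Generic 1-bit convolution: sign() in the forward pass, straight-through
# gradients in the backward pass. Layer sizes are arbitrary.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Straight-through estimator: pass the gradient where |x| <= 1.
        return grad_output * (x.abs() <= 1).float()

class BinaryConv2d(nn.Conv2d):
    def forward(self, x):
        w_bin = BinarizeSTE.apply(self.weight)
        x_bin = BinarizeSTE.apply(x)
        return F.conv2d(x_bin, w_bin, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

layer = BinaryConv2d(3, 16, kernel_size=3, padding=1)
out = layer(torch.randn(1, 3, 32, 32))
out.mean().backward()   # gradients flow to the real-valued latent weights
```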
APA, Harvard, Vancouver, ISO, and other styles
10

Pereira, Fernando Roberto, and Lucas Ferrari De Oliveira. "Proposta de uma solução computacional para detecção de nódulos pulmonares." In XVIII Simpósio Brasileiro de Computação Aplicada à Saúde. Sociedade Brasileira de Computação - SBC, 2018. http://dx.doi.org/10.5753/sbcas.2018.3672.

Full text
Abstract:
Cancer is still one of the leading causes of death worldwide, and lung cancer alone has recently caused more than 1 million deaths. Early detection increases the probability of cure; to help support diagnosis, this work presents a proposed computational solution for detecting pulmonary nodules in chest Computed Tomography images. The image database used was the Lung Image Database Consortium. The proposed solution comprises segmentation of the lung area, segmentation and labeling of candidate nodule objects, feature extraction using texture and shape descriptors, and classification of the candidate nodule objects. In the segmentation stage, fewer than 2% of the nodules were lost, and classification achieved 78.44% sensitivity and 88.59% average accuracy. The results indicate that, with some adjustments, the technique can be promising.
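The pipeline stages the abstract lists (lung segmentation, candidate labeling, shape/texture features, classification) can be sketched for a single CT slice as below; the thresholds, feature set, and classifier are illustrative assumptions rather than the paper's exact method.

```python
# Sketch: candidate nodule detection on one CT slice with skimage + sklearn.
import numpy as np
from skimage import filters, measure, morphology
from sklearn.ensemble import RandomForestClassifier

def candidate_features(ct_slice: np.ndarray) -> np.ndarray:
    """Return one shape/intensity feature vector per candidate object."""
    # 1. Rough lung/background separation by Otsu thresholding (air is darker).
    mask = ct_slice < filters.threshold_otsu(ct_slice)
    mask = morphology.remove_small_objects(mask, min_size=64)
    # 2. Label connected components as nodule candidates.
    labels = measure.label(mask)
    feats = []
    for region in measure.regionprops(labels, intensity_image=ct_slice):
        feats.append([region.area, region.eccentricity,
                      region.solidity, region.mean_intensity])
    return np.array(feats)

# 3. Train a classifier on features from annotated candidates (dummy data here).
X = np.random.rand(100, 4)
y = np.random.randint(0, 2, size=100)       # 1 = nodule, 0 = non-nodule
clf = RandomForestClassifier(n_estimators=100).fit(X, y)

slice_feats = candidate_features(np.random.rand(512, 512))
if len(slice_feats):
    predictions = clf.predict(slice_feats)
```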
APA, Harvard, Vancouver, ISO, and other styles