
Journal articles on the topic 'Multimodal embedding and retrieval'



Consult the top 50 journal articles for your research on the topic 'Multimodal embedding and retrieval.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1. Kim, Donghyun, Kuniaki Saito, Kate Saenko, Stan Sclaroff, and Bryan Plummer. "MULE: Multimodal Universal Language Embedding." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (2020): 11254–61. http://dx.doi.org/10.1609/aaai.v34i07.6785.

Abstract:
Existing vision-language methods typically support at most two languages at a time. In this paper, we present a modular approach which can easily be incorporated into existing vision-language methods in order to support many languages. We accomplish this by learning a single shared Multimodal Universal Language Embedding (MULE) which has been visually-semantically aligned across all languages. Then we learn to relate MULE to visual data as if it were a single language. Our method is not architecture specific, unlike prior work which typically learned separate branches for each language. […]
2. Kim, Jongseok, Youngjae Yu, Hoeseong Kim, and Gunhee Kim. "Dual Compositional Learning in Interactive Image Retrieval." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 2 (2021): 1771–79. http://dx.doi.org/10.1609/aaai.v35i2.16271.

Abstract:
We present an approach named Dual Composition Network (DCNet) for interactive image retrieval that searches for the best target image for a natural language query and a reference image. To accomplish this task, existing methods have focused on learning a composite representation of the reference image and the text query to be as close to the embedding of the target image as possible. We refer to this approach as the Composition Network. In this work, we propose to close the loop with a Correction Network that models the difference between the reference and target images in the embedding space and matches […]
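The composition idea above (reference image + modifying text ≈ target image) is easy to sketch. Below is a minimal, hypothetical PyTorch sketch, not the authors' DCNet code; the MLP composer and random stand-in embeddings are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CompositionNet(nn.Module):
    """Compose a reference-image embedding with a text embedding."""
    def __init__(self, dim=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )

    def forward(self, ref_emb, txt_emb):
        composed = self.mlp(torch.cat([ref_emb, txt_emb], dim=-1))
        return F.normalize(composed, dim=-1)

# Toy batch: pull each composed query toward its target image embedding.
model = CompositionNet()
ref, txt, tgt = (torch.randn(8, 512) for _ in range(3))
query = model(ref, txt)
loss = 1.0 - F.cosine_similarity(query, F.normalize(tgt, dim=-1)).mean()
loss.backward()
```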
3. Wang, Di, Xinbo Gao, Xiumei Wang, Lihuo He, and Bo Yuan. "Multimodal Discriminative Binary Embedding for Large-Scale Cross-Modal Retrieval." IEEE Transactions on Image Processing 25, no. 10 (2016): 4540–54. http://dx.doi.org/10.1109/tip.2016.2592800.

4. Merkx, Danny, and Stefan L. Frank. "Learning semantic sentence representations from visually grounded language without lexical knowledge." Natural Language Engineering 25, no. 4 (2019): 451–66. http://dx.doi.org/10.1017/s1351324919000196.

Abstract:
Current approaches to learning semantic representations of sentences often use prior word-level knowledge. The current study aims to leverage visual information in order to capture sentence-level semantics without the need for word embeddings. We use a multimodal sentence encoder trained on a corpus of images with matching text captions to produce visually grounded sentence embeddings. Deep neural networks are trained to map the two modalities to a common embedding space such that, for an image, the corresponding caption can be retrieved and vice versa. […]
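The train-to-retrieve setup described here is usually implemented with an in-batch contrastive ranking loss. A minimal sketch under that assumption (not the paper's exact loss or encoders):

```python
import torch
import torch.nn.functional as F

def ranking_loss(img_emb, cap_emb, margin=0.2):
    """Bidirectional hinge loss: matching pairs must beat in-batch negatives."""
    img = F.normalize(img_emb, dim=-1)
    cap = F.normalize(cap_emb, dim=-1)
    scores = img @ cap.t()                    # B x B cosine similarities
    pos = scores.diag().unsqueeze(1)          # positives on the diagonal
    cost_i2c = (margin + scores - pos).clamp(min=0)      # image -> caption
    cost_c2i = (margin + scores - pos.t()).clamp(min=0)  # caption -> image
    mask = torch.eye(scores.size(0), dtype=torch.bool)
    return (cost_i2c.masked_fill(mask, 0).mean()
            + cost_c2i.masked_fill(mask, 0).mean())

loss = ranking_loss(torch.randn(16, 256), torch.randn(16, 256))
```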
5. Ota, Kosuke, Keiichiro Shirai, Hidetoshi Miyao, and Minoru Maruyama. "Multimodal Analogy-Based Image Retrieval by Improving Semantic Embeddings." Journal of Advanced Computational Intelligence and Intelligent Informatics 26, no. 6 (2022): 995–1003. http://dx.doi.org/10.20965/jaciii.2022.p0995.

Abstract:
In this work, we study the application of multimodal analogical reasoning to image retrieval. Multimodal analogy questions are given in the form of tuples of words and images, e.g., "cat":"dog"::[an image of a cat sitting on a bench]:?, to search for an image of a dog sitting on a bench. Retrieving the desired images given these tuples can be seen as the task of finding images whose relation to the query image is close to the relation between the query words. One way to achieve the task is to build a common vector space that exhibits analogical regularities. To learn such an embedding, we propose a quadruple neural […]
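The analogy mechanic ("cat":"dog"::image:?) reduces to vector arithmetic plus nearest-neighbour search once a common space exists. An illustrative NumPy sketch with random stand-in embeddings, not the paper's learned space:

```python
import numpy as np

def analogy_query(img_vec, word_a, word_b, gallery):
    """Rank gallery images by similarity to img + (word_b - word_a)."""
    q = img_vec + (word_b - word_a)
    q = q / np.linalg.norm(q)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return np.argsort(-(g @ q))               # best match first

rng = np.random.default_rng(0)
gallery = rng.normal(size=(1000, 300))        # stand-in image embeddings
ranking = analogy_query(rng.normal(size=300), rng.normal(size=300),
                        rng.normal(size=300), gallery)
print(ranking[:5])
```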
6. Qi, Jidong. "Neurophysiological and psychophysical references for trends in supervised VQA multimodal deep learning: An interdisciplinary meta-analysis." Applied and Computational Engineering 30, no. 1 (2024): 189–201. http://dx.doi.org/10.54254/2755-2721/30/20230096.

Abstract:
Leading trends in multimodal deep learning for visual question answering include multimodal joint-embedding models, multimodal attention-based models, and multimodal external knowledge-based models. Several mechanisms and strategies are used in these models, including representation fusion methods, co-attention mechanisms, and knowledge-base retrieval mechanisms. While a variety of works have comprehensively reviewed these strategies, a key gap in the research is that there is no interdisciplinary analysis connecting these mechanisms with discoveries about humans. As discussion of Neuro-AI continues […]
7. Lin, Kaiyi, Xing Xu, Lianli Gao, Zheng Wang, and Heng Tao Shen. "Learning Cross-Aligned Latent Embeddings for Zero-Shot Cross-Modal Retrieval." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (2020): 11515–22. http://dx.doi.org/10.1609/aaai.v34i07.6817.

Abstract:
Zero-Shot Cross-Modal Retrieval (ZS-CMR) is an emerging research hotspot that aims to retrieve data of unseen classes across different modalities. It is challenging not only because of the heterogeneous distributions across different modalities, but also because of the inconsistent semantics across seen and unseen classes. A handful of recently proposed methods typically borrow ideas from zero-shot learning, exploiting word embeddings of class labels (class-embeddings) as the common semantic space, and using a generative adversarial network (GAN) to capture the underlying multimodal data structures. […]
8. Mithun, Niluthpol C., Juncheng Li, Florian Metze, and Amit K. Roy-Chowdhury. "Joint embeddings with multimodal cues for video-text retrieval." International Journal of Multimedia Information Retrieval 8, no. 1 (2019): 3–18. http://dx.doi.org/10.1007/s13735-018-00166-3.

9. Yang, Bang, Yong Dai, Xuxin Cheng, Yaowei Li, Asif Raza, and Yuexian Zou. "Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 6 (2024): 6458–66. http://dx.doi.org/10.1609/aaai.v38i6.28466.

Abstract:
While vision-language pre-trained models (VL-PTMs) have advanced multimodal research in recent years, their mastery of only a few languages, like English, restricts their applicability in broader communities. To this end, there is increasing interest in developing multilingual VL models via a joint-learning setup, which, however, could be unrealistic due to expensive costs and data availability. In this work, we propose to extend VL-PTMs' language capacity by continual language learning (CLL), where a model needs to update its linguistic knowledge incrementally without suffering from catastrophic forgetting. […]
10. Xu, Tong, Peilun Zhou, Linkang Hu, Xiangnan He, Yao Hu, and Enhong Chen. "Socializing the Videos: A Multimodal Approach for Social Relation Recognition." ACM Transactions on Multimedia Computing, Communications, and Applications 17, no. 1 (2021): 1–23. http://dx.doi.org/10.1145/3416493.

Abstract:
As a crucial task for video analysis, social relation recognition for characters not only provides a semantically rich description of video content but also supports intelligent applications, e.g., video retrieval and visual question answering. Unfortunately, due to the semantic gap between visual and semantic features, traditional solutions may fail to reveal the accurate relations among characters. At the same time, the development of social media platforms has promoted the emergence of crowdsourced comments, which may enhance the recognition task with semantic and descriptive cues. […]
11. Xu, Xing, Jialin Tian, Kaiyi Lin, Huimin Lu, Jie Shao, and Heng Tao Shen. "Zero-shot Cross-modal Retrieval by Assembling AutoEncoder and Generative Adversarial Network." ACM Transactions on Multimedia Computing, Communications, and Applications 17, no. 1s (2021): 1–17. http://dx.doi.org/10.1145/3424341.

Abstract:
Conventional cross-modal retrieval models mainly assume the same scope of classes for both the training set and the testing set. This assumption limits their extensibility to zero-shot cross-modal retrieval (ZS-CMR), where the testing set consists of unseen classes that are disjoint from the seen classes in the training set. The ZS-CMR task is more challenging due to the heterogeneous distributions of different modalities and the semantic inconsistency between seen and unseen classes. A few recently proposed approaches are inspired by zero-shot learning to estimate the distribution underlying […]
12. Peng, Min, Chongyang Wang, Yu Shi, and Xiang-Dong Zhou. "Efficient End-to-End Video Question Answering with Pyramidal Multimodal Transformer." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 2 (2023): 2038–46. http://dx.doi.org/10.1609/aaai.v37i2.25296.

Abstract:
This paper presents a new method for end-to-end Video Question Answering (VideoQA), in contrast to the currently popular use of large-scale pre-training with huge feature extractors. We achieve this with a pyramidal multimodal transformer (PMT) model, which simply incorporates a learnable word embedding layer and a few convolutional and transformer layers. We use the anisotropic pyramid to fulfill video-language interactions across different spatio-temporal scales. In addition to the canonical pyramid, which includes both bottom-up and top-down pathways with lateral connections, novel strategies are […]
13. Khan, Arijit. "Knowledge Graphs Querying." ACM SIGMOD Record 52, no. 2 (2023): 18–29. http://dx.doi.org/10.1145/3615952.3615956.

Abstract:
Knowledge graphs (KGs) such as DBpedia, Freebase, YAGO, Wikidata, and NELL were constructed to store large-scale, real-world facts as (subject, predicate, object) triples, which can also be modeled as a graph where a node (a subject or an object) represents an entity with attributes and a directed edge (a predicate) is a relationship between two entities. Querying KGs is critical in web search, question answering (QA), semantic search, personal assistants, fact checking, and recommendation. While significant progress has been made on KG construction and curation, thanks to deep learning […]
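The (subject, predicate, object) model the survey builds on can be made concrete with a toy triple store and wildcard pattern matching; real systems use SPARQL over stores like Wikidata, but the idea is the same. A minimal Python sketch with invented facts:

```python
# Toy triple store; the facts are invented for illustration.
triples = {
    ("Alan_Turing", "bornIn", "London"),
    ("Alan_Turing", "field", "Computer_Science"),
    ("London", "capitalOf", "United_Kingdom"),
}

def match(s=None, p=None, o=None):
    """Return all triples matching a pattern; None is a wildcard."""
    return [(ts, tp, to) for ts, tp, to in triples
            if s in (None, ts) and p in (None, tp) and o in (None, to)]

print(match(s="Alan_Turing"))   # every fact about Alan_Turing
print(match(p="capitalOf"))     # every capital relation
```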
14. Chen, Weijia, Zhijun Lu, Lijue You, Lingling Zhou, Jie Xu, and Ken Chen. "Artificial Intelligence–Based Multimodal Risk Assessment Model for Surgical Site Infection (AMRAMS): Development and Validation Study." JMIR Medical Informatics 8, no. 6 (2020): e18186. http://dx.doi.org/10.2196/18186.

Abstract:
Background: Surgical site infection (SSI) is one of the most common types of health care–associated infections. It increases mortality, prolongs hospital length of stay, and raises health care costs. Many institutions have developed risk assessment models for SSI to help surgeons preoperatively identify high-risk patients and guide clinical intervention. However, most of these models had low accuracies. Objective: We aimed to provide a solution in the form of an Artificial intelligence–based Multimodal Risk Assessment Model for Surgical site infection (AMRAMS) for inpatients undergoing operations. […]
15. Romberg, Stefan, Rainer Lienhart, and Eva Hörster. "Multimodal Image Retrieval." International Journal of Multimedia Information Retrieval 1, no. 1 (2012): 31–44. http://dx.doi.org/10.1007/s13735-012-0006-4.

16. Zou, Zhuo. "Performance analysis of using multimodal embedding and word embedding transferred to sentiment classification." Applied and Computational Engineering 5, no. 1 (2023): 417–22. http://dx.doi.org/10.54254/2755-2721/5/20230610.

Abstract:
Multimodal machine learning is one of artificial intelligence's most important research topics. Contrastive Language-Image Pretraining (CLIP) is one application of multimodal machine learning and is widely applied in computer vision. However, there is a research gap in applying CLIP to natural language processing. Therefore, based on IMDB, this paper applies the multimodal features of CLIP together with three other pre-trained word vectors, GloVe, word2vec, and BERT, to compare their effects on sentiment classification in natural language processing, in order to test the performance of CLIP's multimodal features. […]
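The CLIP branch of such a comparison amounts to encoding each review with CLIP's text tower and fitting a lightweight classifier on top. A rough sketch assuming OpenAI's clip package and scikit-learn, with stand-in reviews instead of the IMDB dataset:

```python
import clip   # pip install git+https://github.com/openai/CLIP.git
import torch
from sklearn.linear_model import LogisticRegression

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

def embed(texts):
    """CLIP text-tower features as plain NumPy arrays."""
    with torch.no_grad():
        tokens = clip.tokenize(texts, truncate=True).to(device)
        return model.encode_text(tokens).float().cpu().numpy()

train_texts = ["a wonderful, moving film", "dull and far too long"]  # stand-ins
train_labels = [1, 0]
clf = LogisticRegression().fit(embed(train_texts), train_labels)
print(clf.predict(embed(["surprisingly good"])))
```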
17. Dash, Sandeep Kumar, Saurav Saha, Partha Pakray, and Alexander Gelbukh. "Generating image captions through multimodal embedding." Journal of Intelligent & Fuzzy Systems 36, no. 5 (2019): 4787–96. http://dx.doi.org/10.3233/jifs-179027.

18. Qi, Fan, Xiaoshan Yang, Tianzhu Zhang, and Changsheng Xu. "Discriminative multimodal embedding for event classification." Neurocomputing 395 (June 2020): 160–69. http://dx.doi.org/10.1016/j.neucom.2017.11.078.

19. Lee, Jin Young. "Deep multimodal embedding for video captioning." Multimedia Tools and Applications 78, no. 22 (2019): 31793–805. http://dx.doi.org/10.1007/s11042-019-08011-3.

20. Kitanovski, Ivan, Gjorgji Strezoski, Ivica Dimitrovski, Gjorgji Madjarov, and Suzana Loskovska. "Multimodal medical image retrieval system." Multimedia Tools and Applications 76, no. 2 (2016): 2955–78. http://dx.doi.org/10.1007/s11042-016-3261-1.

21. Xu, Hong. "Multimodal bird information retrieval system." Applied and Computational Engineering 53, no. 1 (2024): 96–102. http://dx.doi.org/10.54254/2755-2721/53/20241282.

Abstract:
A multimodal bird information retrieval system can help popularize bird knowledge and support bird conservation. In this paper, we use a self-built bird dataset, the ViT-B/32 model from CLIP as the training model, Python as the development language, and PyQt5 for the interface development. The system mainly realizes the uploading and display of bird pictures, multimodal retrieval of bird information, and the introduction of related bird information. Trial runs show that the system can accomplish multimodal retrieval of bird information […]
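The retrieval core of such a system is plain CLIP similarity ranking. A sketch assuming OpenAI's clip package and PIL images already loaded into bird_images; the paper's self-built dataset and PyQt5 interface are omitted:

```python
import clip
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def rank_images(query, bird_images):
    """Return gallery indices sorted by similarity to the text query."""
    with torch.no_grad():
        imgs = torch.stack([preprocess(im) for im in bird_images]).to(device)
        img_f = model.encode_image(imgs)
        txt_f = model.encode_text(clip.tokenize([query]).to(device))
        img_f = img_f / img_f.norm(dim=-1, keepdim=True)
        txt_f = txt_f / txt_f.norm(dim=-1, keepdim=True)
        return (img_f @ txt_f.t()).squeeze(1).argsort(descending=True)
```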
22. Yang, Xi, Xinbo Gao, and Qi Tian. "Polar Embedding for Aurora Image Retrieval." IEEE Transactions on Image Processing 24, no. 11 (2015): 3332–44. http://dx.doi.org/10.1109/tip.2015.2442913.

23. Tam, G. K. L., and R. W. H. Lau. "Embedding Retrieval of Articulated Geometry Models." IEEE Transactions on Pattern Analysis and Machine Intelligence 34, no. 11 (2012): 2134–46. http://dx.doi.org/10.1109/tpami.2012.17.

24. Zhou, Wengang, Houqiang Li, Jian Sun, and Qi Tian. "Collaborative Index Embedding for Image Retrieval." IEEE Transactions on Pattern Analysis and Machine Intelligence 40, no. 5 (2018): 1154–66. http://dx.doi.org/10.1109/tpami.2017.2676779.

25. Wang, Can, Jun Zhao, Xiaofei He, Chun Chen, and Jiajun Bu. "Image retrieval using nonlinear manifold embedding." Neurocomputing 72, no. 16-18 (2009): 3922–29. http://dx.doi.org/10.1016/j.neucom.2009.04.011.

26. Singh, Kulvinder, et al. "Enhancing Multimodal Information Retrieval Through Integrating Data Mining and Deep Learning Techniques." International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 9 (2023): 560–69. http://dx.doi.org/10.17762/ijritcc.v11i9.8844.

Abstract:
Multimodal information retrieval, the task of retrieving relevant information from heterogeneous data sources such as text, images, and videos, has gained significant attention in recent years due to the proliferation of multimedia content on the internet. This paper proposes an approach to enhance multimodal information retrieval by integrating data mining and deep learning techniques. Traditional information retrieval systems often struggle to handle multimodal data effectively due to the inherent complexity and diversity of such data sources. In this study, we leverage data mining techniques […]
27. Tang, Zhenchao, Jiehui Huang, Guanxing Chen, and Calvin Yu-Chian Chen. "Comprehensive View Embedding Learning for Single-Cell Multimodal Integration." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 14 (2024): 15292–300. http://dx.doi.org/10.1609/aaai.v38i14.29453.

Abstract:
Motivation: Advances in single-cell measurement techniques provide rich multimodal data, which helps us to explore the life state of cells more deeply. However, multimodal integration, or learning joint embeddings from multimodal data, remains a current challenge. The difficulty in integrating unpaired single-cell multimodal data is that different modalities have different feature spaces, which easily leads to information loss in joint embedding. Few existing methods have fully exploited and fused the information in single-cell multimodal data. Result: In this study, we propose CoVEL […]
28. Wang, Shiping, and Wenzhong Guo. "Sparse Multigraph Embedding for Multimodal Feature Representation." IEEE Transactions on Multimedia 19, no. 7 (2017): 1454–66. http://dx.doi.org/10.1109/tmm.2017.2663324.

29. Hama, Kenta, Takashi Matsubara, Kuniaki Uehara, and Jianfei Cai. "Exploring Uncertainty Measures for Image-caption Embedding-and-retrieval Task." ACM Transactions on Multimedia Computing, Communications, and Applications 17, no. 2 (2021): 1–19. http://dx.doi.org/10.1145/3425663.

Abstract:
With the significant development of black-box machine learning algorithms, particularly deep neural networks, the practical demand for reliability assessment is rapidly increasing. On the basis of the concept that "Bayesian deep learning knows what it does not know," the uncertainty of deep neural network outputs has been investigated as a reliability measure for classification and regression tasks. By considering an embedding task as a regression task, several existing studies have quantified the uncertainty of embedded features and improved the retrieval performance of cutting-edge models […]
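One common way to get such uncertainties, Monte-Carlo dropout, keeps dropout active at test time and reads uncertainty off the spread of repeated embeddings. The paper compares several measures; this sketch shows only that one, with an invented encoder:

```python
import torch
import torch.nn as nn

# Invented encoder; any embedding network with dropout works the same way.
encoder = nn.Sequential(nn.Linear(2048, 512), nn.ReLU(),
                        nn.Dropout(p=0.2), nn.Linear(512, 256))

def embed_with_uncertainty(x, n_samples=20):
    encoder.train()                          # keep dropout stochastic
    with torch.no_grad():
        samples = torch.stack([encoder(x) for _ in range(n_samples)])
    # Mean embedding plus a scalar spread per input as the uncertainty score.
    return samples.mean(0), samples.var(0).mean(-1)

emb, uncertainty = embed_with_uncertainty(torch.randn(4, 2048))
```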
30. Cao, Yu, Shawn Steffey, Jianbiao He, et al. "Medical Image Retrieval: A Multimodal Approach." Cancer Informatics 13s3 (January 2014): CIN.S14053. http://dx.doi.org/10.4137/cin.s14053.

Abstract:
Medical imaging is becoming a vital component of the war on cancer. Tremendous amounts of medical image data are captured and recorded in digital format during cancer care and cancer research. Facing such an unprecedented volume of image data with heterogeneous image modalities, it is necessary to develop effective and efficient content-based medical image retrieval systems for cancer clinical practice and research. While substantial progress has been made in different areas of content-based image retrieval (CBIR) research, direct application of existing CBIR techniques to medical images […]
31. Rafailidis, D., S. Manolopoulou, and P. Daras. "A unified framework for multimodal retrieval." Pattern Recognition 46, no. 12 (2013): 3358–70. http://dx.doi.org/10.1016/j.patcog.2013.05.023.

32. Qiu, Dong, Haihuan Jiang, and Shuqiao Chen. "Fuzzy Information Retrieval Based on Continuous Bag-of-Words Model." Symmetry 12, no. 2 (2020): 225. http://dx.doi.org/10.3390/sym12020225.

Abstract:
In this paper, we study the feasibility of performing fuzzy information retrieval by word embedding. We propose a fuzzy information retrieval approach to capture the relationships between words and query language, which combines techniques from deep learning and fuzzy set theory. We leverage large-scale data and the continuous bag-of-words model to find the relevant features of words and obtain word embeddings. To enhance retrieval effectiveness, we measure the relatedness among words by word embedding, with the property of symmetry. Experimental results show that the recall ratio, precision […]
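The CBOW ingredient can be reproduced with gensim: train word vectors (sg=0 selects CBOW) and use embedding-space neighbours to soften a query. A toy sketch with a stand-in corpus, not the paper's pipeline:

```python
from gensim.models import Word2Vec

corpus = [["information", "retrieval", "system"],
          ["document", "search", "engine"],
          ["fuzzy", "query", "expansion"]] * 50   # stand-in corpus

model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=0)  # sg=0: CBOW

def expand_query(terms, topn=2):
    """Add each term's nearest embedding-space neighbours to the query."""
    extra = [w for t in terms if t in model.wv
             for w, _ in model.wv.most_similar(t, topn=topn)]
    return list(dict.fromkeys(terms + extra))     # de-duplicate, keep order

print(expand_query(["retrieval", "query"]))
```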
33. Nguyen, Huy Manh, Tomo Miyazaki, Yoshihiro Sugaya, and Shinichiro Omachi. "Multiple Visual-Semantic Embedding for Video Retrieval from Query Sentence." Applied Sciences 11, no. 7 (2021): 3214. http://dx.doi.org/10.3390/app11073214.

Abstract:
Visual-semantic embedding aims to learn a joint embedding space where related video and sentence instances are located close to each other. Most existing methods put instances into a single embedding space. However, they struggle to embed instances well due to the difficulty of matching visual dynamics in videos to textual features in sentences. A single space is not enough to accommodate various videos and sentences. In this paper, we propose a novel framework that maps instances into multiple individual embedding spaces so that we can capture multiple relationships between instances. […]
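The multiple-spaces idea can be sketched as several projection heads per modality whose per-space similarities are combined; the max-combination below is an assumption, not necessarily the paper's choice:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiSpaceEmbedder(nn.Module):
    """Project video and text into several spaces; score with the best one."""
    def __init__(self, vid_dim=1024, txt_dim=768, dim=256, n_spaces=4):
        super().__init__()
        self.vid_heads = nn.ModuleList(
            [nn.Linear(vid_dim, dim) for _ in range(n_spaces)])
        self.txt_heads = nn.ModuleList(
            [nn.Linear(txt_dim, dim) for _ in range(n_spaces)])

    def similarity(self, vid, txt):
        sims = [F.cosine_similarity(v(vid), t(txt), dim=-1)
                for v, t in zip(self.vid_heads, self.txt_heads)]
        return torch.stack(sims).max(dim=0).values

scores = MultiSpaceEmbedder().similarity(torch.randn(8, 1024), torch.randn(8, 768))
```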
34. Huang, Chuan Bo, and Li Xiang. "Image Retrieval Based on Semi-Supervised Orthogonal Discriminant Embedding." Applied Mechanics and Materials 347-350 (August 2013): 3532–36. http://dx.doi.org/10.4028/www.scientific.net/amm.347-350.3532.

Abstract:
A retrieval algorithm based on dimensionality reduction is proposed to effectively extract features and improve the performance of image retrieval. First, the most important properties of the subspaces with respect to image retrieval are captured by intelligently utilizing the similarity and dissimilarity information of the semantic and geometric structure of the image database. Second, we propose the Semi-supervised Orthogonal Discriminant Embedding Label Propagation (SODELP) method for image retrieval. The experimental results show that our method has discrimination power against colour, texture […]
35. Kumari, Sneha, Rajiv Pandey, Amit Singh, and Himanshu Pathak. "SPARQL: Semantic Information Retrieval by Embedding Prepositions." International Journal of Network Security & Its Applications 6, no. 1 (2014): 49–57. http://dx.doi.org/10.5121/ijnsa.2014.6105.

36. Yu, Mengyang, Li Liu, and Ling Shao. "Binary Set Embedding for Cross-Modal Retrieval." IEEE Transactions on Neural Networks and Learning Systems 28, no. 12 (2017): 2899–910. http://dx.doi.org/10.1109/tnnls.2016.2609463.

37. Almasri, Feras, and Olivier Debeir. "Schematics Retrieval Using Whole-Graph Embedding Similarity." Electronics 13, no. 7 (2024): 1176. http://dx.doi.org/10.3390/electronics13071176.

Abstract:
This paper addresses the pressing environmental concern of plastic waste, particularly in the biopharmaceutical production sector, where single-use assemblies (SUAs) significantly contribute to this issue. To mitigate this problem, we propose a unique approach centered around the standardization and optimization of SUA drawings through digitization and structured representation. Leveraging the non-Euclidean properties of SUA drawings, we employ a graph-based representation, utilizing graph convolutional networks (GCNs) to capture complex structural relationships. […]
38. Mollenhauer, Hilton H. "Stain contamination and embedding in electron microscopy." Proceedings, annual meeting, Electron Microscopy Society of America 44 (August 1986): 50–53. http://dx.doi.org/10.1017/s0424820100141986.

Abstract:
Many factors (e.g., resolution of the microscope, type of tissue, and preparation of the sample) affect electron microscopical images and alter the amount of information that can be retrieved from a specimen. Of interest in this report are those factors associated with the evaluation of epoxy-embedded tissues. In this context, information retrieval depends, in part, on the ability to "see" sample detail (e.g., contrast) and, in part, on the quality of sample preservation. Two aspects of this problem will be discussed: (1) epoxy resins and their effect on image contrast and information retrieval […]
39. Qiao, Ya-nan, Qinghe Du, and Di-fang Wan. "A study on query terms proximity embedding for information retrieval." International Journal of Distributed Sensor Networks 13, no. 2 (2017): 155014771769489. http://dx.doi.org/10.1177/1550147717694891.

Abstract:
Information retrieval is applied widely to models and algorithms in wireless networks for cyber-physical systems. Query term proximity has proved to be very useful information for improving the performance of information retrieval systems. However, query term proximity cannot retrieve documents independently; it must be incorporated into the original information retrieval models. This article proposes the concept of query term proximity embedding, a new method to incorporate query term proximity into original information retrieval models. Moreover, a term-field-convolutions frequency framework […]
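Before embedding it into a model, term proximity itself is easy to state: documents score higher when query terms co-occur in small windows. A brute-force toy scorer (the paper's term-field-convolutions model is more involved):

```python
from itertools import product

def proximity_score(query_terms, doc_tokens):
    """1 / (smallest window covering all query terms); 0 if any term missing."""
    positions = [[i for i, tok in enumerate(doc_tokens) if tok == term]
                 for term in query_terms]
    if any(not p for p in positions):
        return 0.0
    span = min(max(combo) - min(combo) + 1 for combo in product(*positions))
    return 1.0 / span

doc = "fast multimodal retrieval needs fast embedding indexes".split()
print(proximity_score(["fast", "embedding"], doc))   # 0.5: adjacent occurrence
```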
40. Bhopale, P., and Ashish Tiwari. "Leveraging Neural Network Phrase Embedding Model for Query Reformulation in Ad-hoc Biomedical Information Retrieval." Malaysian Journal of Computer Science 34, no. 2 (2021): 151–70. http://dx.doi.org/10.22452/mjcs.vol34no2.2.

Abstract:
This study presents a Spark-enhanced neural network phrase embedding model to leverage query representation for relevant biomedical literature retrieval. Information retrieval for clinical decision support demands high precision. In recent years, word embeddings have evolved as a solution to such requirements. They represent vocabulary words in low-dimensional vectors in the context of their similar words; however, they are inadequate for dealing with semantic phrases or multi-word units. Learning vector embeddings for phrases while maintaining word meanings is a challenging task. This study proposes […]
41. Dong, Bin, Songlei Jian, and Kai Lu. "Learning Multimodal Representations by Symmetrically Transferring Local Structures." Symmetry 12, no. 9 (2020): 1504. http://dx.doi.org/10.3390/sym12091504.

Abstract:
Multimodal representations play an important role in multimodal learning tasks, including cross-modal retrieval and intra-modal clustering. However, existing multimodal representation learning approaches focus on building one common space by aligning different modalities and ignore the complementary information across the modalities, such as the intra-modal local structures. In other words, they only focus on object-level alignment and ignore structure-level alignment. To tackle this problem, we propose a novel symmetric multimodal representation learning framework that transfers local structures […]
42. Wang, Zhen, Liu Liu, Yiqun Duan, and Dacheng Tao. "Continual Learning through Retrieval and Imagination." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (2022): 8594–602. http://dx.doi.org/10.1609/aaai.v36i8.20837.

Abstract:
Continual learning is the intellectual ability of artificial agents to learn new streaming labels from sequential data. The main impediment to continual learning is catastrophic forgetting, a severe performance degradation on previously learned tasks. Although simply replaying all previous data or continuously adding model parameters could alleviate the issue, it is impractical in real-world applications due to limited available resources. Inspired by the mechanism by which the human brain deepens its past impressions, we propose a novel framework, Deep Retrieval and Imagination (DRI) […]
43. Zhang, Guihao, and Jiangzhong Cao. "Feature Fusion Based on Transformer for Cross-modal Retrieval." Journal of Physics: Conference Series 2558, no. 1 (2023): 012012. http://dx.doi.org/10.1088/1742-6596/2558/1/012012.

Abstract:
With the popularity of the Internet and the rapid growth of multimodal data, multimodal retrieval has gradually become a hot area of research. As one of the important branches of multimodal retrieval, image-text retrieval aims to design a model that learns and aligns two modalities, image and text, in order to build a bridge of semantic association between the two heterogeneous kinds of data, so as to achieve unified alignment and retrieval. Current mainstream image-text cross-modal retrieval approaches have made good progress by designing deep learning-based models to find potential associations […]
44. Moon, Jucheol, Nhat Anh Le, Nelson Hebert Minaya, and Sang-Il Choi. "Multimodal Few-Shot Learning for Gait Recognition." Applied Sciences 10, no. 21 (2020): 7619. http://dx.doi.org/10.3390/app10217619.

Abstract:
A person's gait is a behavioral trait that is uniquely associated with each individual and can be used to recognize the person. As information about the human gait can be captured by wearable devices, a few studies have proposed methods to process gait information for identification purposes. Despite recent advances in gait recognition, the open-set gait recognition problem presents challenges to current approaches. To address the open-set gait recognition problem, a system should be able to deal with unseen subjects who have not been included in the training dataset. In this paper, we […]
45. Zhuang, Yueting, Jun Song, Fei Wu, Xi Li, Zhongfei Zhang, and Yong Rui. "Multimodal Deep Embedding via Hierarchical Grounded Compositional Semantics." IEEE Transactions on Circuits and Systems for Video Technology 28, no. 1 (2018): 76–89. http://dx.doi.org/10.1109/tcsvt.2016.2606648.

46. Huang, Feiran, Xiaoming Zhang, Jie Xu, Chaozhuo Li, and Zhoujun Li. "Network embedding by fusing multimodal contents and links." Knowledge-Based Systems 171 (May 2019): 44–55. http://dx.doi.org/10.1016/j.knosys.2019.02.003.

47. Kompus, Kristiina, Tom Eichele, Kenneth Hugdahl, and Lars Nyberg. "Multimodal Imaging of Incidental Retrieval: The Low Route to Memory." Journal of Cognitive Neuroscience 23, no. 4 (2011): 947–60. http://dx.doi.org/10.1162/jocn.2010.21494.

Abstract:
Memories of past episodes frequently come to mind incidentally, without directed search. It has remained unclear how incidental retrieval processes are initiated in the brain. Here we used fMRI and ERP recordings to find brain activity that specifically correlates with incidental retrieval, as compared to intentional retrieval. Intentional retrieval was associated with increased activation in dorsolateral prefrontal cortex. By contrast, incidental retrieval was associated with a reduced fMRI signal in posterior brain regions, including extrastriate and parahippocampal cortex, and a modulation […]
48. Ubaidullah Bokhari, Mohammad, and Faraz Hasan. "Multimodal Information Retrieval: Challenges and Future Trends." International Journal of Computer Applications 74, no. 14 (2013): 9–12. http://dx.doi.org/10.5120/12951-9967.

49. Yamaguchi, Masataka. "2. Multimodal Retrieval between Vision and Language." Journal of The Institute of Image Information and Television Engineers 72, no. 9 (2018): 655–58. http://dx.doi.org/10.3169/itej.72.655.

50. Calumby, Rodrigo Tripodi. "Diversity-oriented Multimodal and Interactive Information Retrieval." ACM SIGIR Forum 50, no. 1 (2016): 86. http://dx.doi.org/10.1145/2964797.2964811.
