
Journal articles on the topic "Scalability in Cross-Modal Retrieval"



Below are the top 50 journal articles on the topic "Scalability in Cross-Modal Retrieval".




1

Yang, Bo, Chen Wang, Xiaoshuang Ma, Beiping Song, Zhuang Liu, and Fangde Sun. "Zero-Shot Sketch-Based Remote-Sensing Image Retrieval Based on Multi-Level and Attention-Guided Tokenization." Remote Sensing 16, no. 10 (2024): 1653. http://dx.doi.org/10.3390/rs16101653.

Abstract:
Effectively and efficiently retrieving images from remote-sensing databases is a critical challenge in the realm of remote-sensing big data. Utilizing hand-drawn sketches as retrieval inputs offers intuitive and user-friendly advantages, yet the potential of multi-level feature integration from sketches remains underexplored, leading to suboptimal retrieval performance. To address this gap, our study introduces a novel zero-shot, sketch-based retrieval method for remote-sensing images, leveraging multi-level feature extraction, self-attention-guided tokenization and filtering, and cross-modali
2

Hu, Peng, Hongyuan Zhu, Xi Peng, and Jie Lin. "Semi-Supervised Multi-Modal Learning with Balanced Spectral Decomposition." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 01 (2020): 99–106. http://dx.doi.org/10.1609/aaai.v34i01.5339.

Abstract:
Cross-modal retrieval aims to retrieve the relevant samples across different modalities, of which the key problem is how to model the correlations among different modalities while narrowing the large heterogeneous gap. In this paper, we propose a Semi-supervised Multimodal Learning Network method (SMLN) which correlates different modalities by capturing the intrinsic structure and discriminative correlation of the multimedia data. To be specific, the labeled and unlabeled data are used to construct a similarity matrix which integrates the cross-modal correlation, discrimination, and intra-moda
3

Rasheed, Ali Salim, Davood Zabihzadeh, and Sumia Abdulhussien Razooqi Al-Obaidi. "Large-Scale Multi-modal Distance Metric Learning with Application to Content-Based Information Retrieval and Image Classification." International Journal of Pattern Recognition and Artificial Intelligence 34, no. 13 (2020): 2050034. http://dx.doi.org/10.1142/s0218001420500342.

Abstract:
Metric learning algorithms aim to make conceptually related data items closer and keep dissimilar ones at a distance. The most common approach to metric learning is based on the Mahalanobis method. Despite its success, this method is limited to finding a linear projection and also suffers from scalability issues with respect to both the dimensionality and the size of the input data. To address these problems, this paper presents a new scalable metric learning algorithm for multi-modal data. Our method learns an optimal metric for any feature set of the multi-modal data in an online fashion. We also combine the learne
4

Popov, S. E., V. P. Potapov, and R. Y. Zamaraev. "On an Approach to Developing Information and Reference Systems Based on Large Language Models." Vestnik NSU. Series: Information Technologies 23, no. 1 (2025): 46–66. https://doi.org/10.25205/1818-7900-2025-23-1-46-66.

Abstract:
The aim of this study is to develop a corporate context-aware question-answering system in the form of a chatbot to support territorial management by providing fast and accurate access to relevant information. The system is built upon large language models leveraging the Retrieval-Augmented Generation (RAG) approach, combined with modern data processing and retrieval techniques. The knowledge base of the system incorporates textual and tabular data extracted from official reports on environmental conditions. A PostgreSQL database with the pgvector extension was employed to store and retrieve l
5

Zalkow, Frank, and Meinard Müller. "Learning Low-Dimensional Embeddings of Audio Shingles for Cross-Version Retrieval of Classical Music." Applied Sciences 10, no. 1 (2019): 19. http://dx.doi.org/10.3390/app10010019.

Abstract:
Cross-version music retrieval aims at identifying all versions of a given piece of music using a short query audio fragment. One previous approach, which is particularly suited for Western classical music, is based on a nearest neighbor search using short sequences of chroma features, also referred to as audio shingles. From the viewpoint of efficiency, indexing and dimensionality reduction are important aspects. In this paper, we extend previous work by adapting two embedding techniques; one is based on classical principal component analysis, and the other is based on neural networks with tri
6

Huang, Xiaobing, Tian Zhao, and Yu Cao. "PIR." International Journal of Multimedia Data Engineering and Management 5, no. 3 (2014): 1–27. http://dx.doi.org/10.4018/ijmdem.2014070101.

Abstract:
Multimedia Information Retrieval (MIR) is a problem domain that includes programming tasks such as salient feature extraction, machine learning, indexing, and retrieval. There are a variety of implementations and algorithms for these tasks in different languages and frameworks, which are difficult to compose and reuse due to the interface and language incompatibility. Due to this low reusability, researchers often have to implement their experiments from scratch and the resulting programs cannot be easily adapted to parallel and distributed executions, which is important for handling large dat
7

An, Duo, Alan Chiu, James A. Flanders, et al. "Designing a retrievable and scalable cell encapsulation device for potential treatment of type 1 diabetes." Proceedings of the National Academy of Sciences 115, no. 2 (2017): E263–E272. http://dx.doi.org/10.1073/pnas.1708806115.

Abstract:
Cell encapsulation has been shown to hold promise for effective, long-term treatment of type 1 diabetes (T1D). However, challenges remain for its clinical applications. For example, there is an unmet need for an encapsulation system that is capable of delivering sufficient cell mass while still allowing convenient retrieval or replacement. Here, we report a simple cell encapsulation design that is readily scalable and conveniently retrievable. The key to this design was to engineer a highly wettable, Ca2+-releasing nanoporous polymer thread that promoted uniform in situ cross-linking and stron
8

Zhang, Zhen, Xu Wu, and Shuang Wei. "Cross-Domain Access Control Model in Industrial IoT Environment." Applied Sciences 13, no. 8 (2023): 5042. http://dx.doi.org/10.3390/app13085042.

Abstract:
The Industrial Internet of Things (IIoT) accelerates smart manufacturing and boosts production efficiency through heterogeneous industrial equipment, intelligent sensors, and actuators. The Industrial Internet of Things is transforming from a traditional factory model to a new manufacturing mode, which allows cross-domain data-sharing among multiple system departments to enable smart manufacturing. A complete industrial product comes from the combined efforts of many different departments. Therefore, secure and reliable cross-domain access control has become the key to ensuring the security of
9

Gartman, Ievgen. "Architectural Features of Extended Retrieval Generation with External Memory." International Journal of Engineering and Computer Science 14, no. 06 (2025): 27355–61. https://doi.org/10.18535/ijecs.v14i06.5163.

Abstract:
This article examines the RoCR framework, a Retrieval-Augmented Generation (RAG) system optimized for edge deployment in latency-sensitive environments such as real-time search, product recommendation, and dynamic content generation in eCommerce platforms. RoCR leverages Compute-in-Memory (CiM) architectures to enable fast, energy-efficient inference at scale. At the core of the solution is the CiM-Retriever, a module optimized for performing max inner product search (MIPS). Two architectural variants of the generator are analyzed—decoder-only (RA-T) and encoder–decoder with kNN cross-attentio
10

DS, Chakrapani. "AN EFFICIENT DATA SECURITY IN MEDICAL REPORT USING BLOCKCHAIN TECHNOLOGY." INTERANTIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 08, no. 05 (2024): 1–5. http://dx.doi.org/10.55041/ijsrem34688.

Abstract:
The 'SwiftApply Assistant' addresses the common challenge of time consuming data entry across a multitude of applications. This innovative tool simplifies interactions with forms such as job applications, admissions, claims, scholarships, and healthcare enrollment by automating the process of filling HTML forms. Noteworthy features include its adaptability to various life stages and a user centric design aimed at streamlining the application process while offering control and customization options. With cross platform accessibility ensuring convenience and scalability to adapt to emerging appl
11

Tamchyna, Aleš, Ondřej Dušek, Rudolf Rosa, and Pavel Pecina. "MTMonkey: A Scalable Infrastructure for a Machine Translation Web Service." Prague Bulletin of Mathematical Linguistics 100, no. 1 (2013): 31–40. http://dx.doi.org/10.2478/pralin-2013-0009.

Abstract:
We present a web service which handles and distributes JSON-encoded HTTP requests for machine translation (MT) among multiple machines running an MT system, including text pre- and post-processing. It is currently used to provide MT between several languages for cross-lingual information retrieval in the EU FP7 Khresmoi project. The software consists of an application server and remote workers which handle text processing and communicate translation requests to MT systems. The communication between the application server and the workers is based on the XML-RPC protocol. We present the
12

Kim, Sehyung, Chanhyeong Yang, Jihwan Park, Taehoon Song, and Hyunwoo J. Kim. "Super-Class Guided Transformer for Zero-Shot Attribute Classification." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 17 (2025): 17921–29. https://doi.org/10.1609/aaai.v39i17.33971.

Abstract:
Attribute classification is crucial for identifying specific characteristics within image regions. Vision-Language Models (VLMs) have been effective in zero-shot tasks by leveraging their general knowledge from large-scale datasets. Recent studies demonstrate that transformer-based models with class-wise queries can effectively address zero-shot multi-label classification. However, poor utilization of the relationship between seen and unseen attributes makes the model lack generalizability. Additionally, attribute classification generally involves many attributes, making maintaining the model’
13

Vardhineedi, Padma naresh, and Aditya Dayal Tyagi. "Operational Database Architecture for Biometrics: Design and Implementation Following IEEE AutoTest Standards." International Journal of Research in all Subjects in Multi Languages 13, no. 3 (2025): 68–88. https://doi.org/10.63345/ijrsml.v13.i3.5.

Abstract:
Database architectures for biometrics have undergone revolutionary changes in the last decade, with growing demands for scalability, security, and efficiency in processing large-scale biometric data. While tremendous progress has been made, there are still some research gaps in optimizing the performance, security, and integration of biometric databases, especially in the context of emerging technologies such as cloud computing, machine learning, and blockchain. While existing research has been concentrated on improving data retrieval times, system reliability, and security of sensitive biomet
14

Zhang, Chengyuan, Jiayu Song, Xiaofeng Zhu, Lei Zhu, and Shichao Zhang. "HCMSL: Hybrid Cross-modal Similarity Learning for Cross-modal Retrieval." ACM Transactions on Multimedia Computing, Communications, and Applications 17, no. 1s (2021): 1–22. http://dx.doi.org/10.1145/3412847.

Abstract:
The purpose of cross-modal retrieval is to find the relationship between different modal samples and to retrieve other modal samples with similar semantics by using a certain modal sample. As the data of different modalities presents heterogeneous low-level feature and semantic-related high-level features, the main problem of cross-modal retrieval is how to measure the similarity between different modalities. In this article, we present a novel cross-modal retrieval method, named Hybrid Cross-Modal Similarity Learning model (HCMSL for short). It aims to capture sufficient semantic information
15

Song, Jiayu, Yuxuan Hu, Lei Zhu, Chengyuan Zhang, Jian Zhang, and Shichao Zhang. "Soft Contrastive Cross-Modal Retrieval." Applied Sciences 14, no. 5 (2024): 1944. http://dx.doi.org/10.3390/app14051944.

Abstract:
Cross-modal retrieval plays a key role in the Natural Language Processing area, which aims to retrieve one modality to another efficiently. Despite the notable achievements of existing cross-modal retrieval methodologies, the complexity of the embedding space increases with more complex models, leading to less interpretable and potentially overfitting representations. Most existing methods realize outstanding results based on datasets without any error or noise, but that is extremely ideal and leads to trained models lacking robustness. To solve these problems, in this paper, we propose a nove
16

Yang, Xianben, and Wei Zhang. "Graph Convolutional Networks for Cross-Modal Information Retrieval." Wireless Communications and Mobile Computing 2022 (January 6, 2022): 1–8. http://dx.doi.org/10.1155/2022/6133142.

Abstract:
In recent years, due to the wide application of deep learning and more modal research, the corresponding image retrieval system has gradually extended from traditional text retrieval to visual retrieval combined with images and has become the field of computer vision and natural language understanding and one of the important cross-research hotspots. This paper focuses on the research of graph convolutional networks for cross-modal information retrieval and has a general understanding of cross-modal information retrieval and the related theories of convolutional networks on the basis of litera
17

Devezas, José. "Graph-based entity-oriented search." ACM SIGIR Forum 55, no. 1 (2021): 1–2. http://dx.doi.org/10.1145/3476415.3476430.

Abstract:
Entity-oriented search has revolutionized search engines. In the era of Google Knowledge Graph and Microsoft Satori, users demand an effortless process of search. Whether they express an information need through a keyword query, expecting documents and entities, or through a clicked entity, expecting related entities, there is an inherent need for the combination of corpora and knowledge bases to obtain an answer. Such integration frequently relies on independent signals extracted from inverted indexes, and from quad indexes indirectly accessed through queries to a triplestore. However, relyin
18

Wu, Yiling, Shuhui Wang, and Qingming Huang. "Multi-modal semantic autoencoder for cross-modal retrieval." Neurocomputing 331 (February 2019): 165–75. http://dx.doi.org/10.1016/j.neucom.2018.11.042.

19

Huang, Hailang, Zhijie Nie, Ziqiao Wang, and Ziyu Shang. "Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 16 (2024): 18298–306. http://dx.doi.org/10.1609/aaai.v38i16.29789.

Abstract:
Current image-text retrieval methods have demonstrated impressive performance in recent years. However, they still face two problems: the inter-modal matching missing problem and the intra-modal semantic loss problem. These problems can significantly affect the accuracy of image-text retrieval. To address these challenges, we propose a novel method called Cross-modal and Uni-modal Soft-label Alignment (CUSA). Our method leverages the power of uni-modal pre-trained models to provide soft-label supervision signals for the image-text retrieval model. Additionally, we introduce two alignment techn
20

Liu, Huan, Jiang Xiong, Nian Zhang, Fuming Liu, and Xitao Zou. "Quadruplet-Based Deep Cross-Modal Hashing." Computational Intelligence and Neuroscience 2021 (July 2, 2021): 1–10. http://dx.doi.org/10.1155/2021/9968716.

Abstract:
Recently, benefitting from the storage and retrieval efficiency of hashing and the powerful discriminative feature extraction capability of deep neural networks, deep cross-modal hashing retrieval has drawn more and more attention. To preserve the semantic similarities of cross-modal instances during the hash mapping procedure, most existing deep cross-modal hashing methods usually learn deep hashing networks with a pairwise loss or a triplet loss. However, these methods may not fully explore the similarity relation across modalities. To solve this problem, in this paper, we introduce a quadru
21

Li, Tieying, Lingdu Kong, Xiaochun Yang, Bin Wang, and Jiaxing Xu. "Bridging Modalities: A Survey of Cross-Modal Image-Text Retrieval." Chinese Journal of Information Fusion 1, no. 1 (2024): 79–92. http://dx.doi.org/10.62762/cjif.2024.361895.

Abstract:
The rapid advancement of Internet technology, driven by social media and e-commerce platforms, has facilitated the generation and sharing of multimodal data, leading to increased interest in efficient cross-modal retrieval systems. Cross-modal image-text retrieval, encompassing tasks such as image query text (IqT) retrieval and text query image (TqI) retrieval, plays a crucial role in semantic searches across modalities. This paper presents a comprehensive survey of cross-modal image-text retrieval, addressing the limitations of previous studies that focused on single perspectives such as subs
22

Wang, Suping, Ligu Zhu, Lei Shi, Hao Mo, and Songfu Tan. "A Survey of Full-Cycle Cross-Modal Retrieval: From a Representation Learning Perspective." Applied Sciences 13, no. 7 (2023): 4571. http://dx.doi.org/10.3390/app13074571.

Abstract:
Cross-modal retrieval aims to elucidate information fusion, imitate human learning, and advance the field. Although previous reviews have primarily focused on binary and real-value coding methods, there is a scarcity of techniques grounded in deep representation learning. In this paper, we concentrated on harmonizing cross-modal representation learning and the full-cycle modeling of high-level semantic associations between vision and language, diverging from traditional statistical methods. We systematically categorized and summarized the challenges and open issues in implementing current tech
23

Zhong, Fangming, Guangze Wang, Zhikui Chen, Feng Xia, and Geyong Min. "Cross-Modal Retrieval for CPSS Data." IEEE Access 8 (2020): 16689–701. http://dx.doi.org/10.1109/access.2020.2967594.

24

Dutta, Titir, and Soma Biswas. "Generalized Zero-Shot Cross-Modal Retrieval." IEEE Transactions on Image Processing 28, no. 12 (2019): 5953–62. http://dx.doi.org/10.1109/tip.2019.2923287.

25

Feng, Fangxiang, Xiaojie Wang, Ruifan Li, and Ibrar Ahmad. "Correspondence Autoencoders for Cross-Modal Retrieval." ACM Transactions on Multimedia Computing, Communications, and Applications 12, no. 1s (2015): 1–22. http://dx.doi.org/10.1145/2808205.

26

Liu, Zhuokun, Huaping Liu, Wenmei Huang, Bowen Wang, and Fuchun Sun. "Audiovisual cross-modal material surface retrieval." Neural Computing and Applications 32, no. 18 (2019): 14301–9. http://dx.doi.org/10.1007/s00521-019-04476-3.

27

Yu, Zheng, and Wenmin Wang. "Learning DALTS for cross‐modal retrieval." CAAI Transactions on Intelligence Technology 4, no. 1 (2019): 9–16. http://dx.doi.org/10.1049/trit.2018.1051.

28

Yang, Xiaohan, Zhen Wang, Nannan Wu, Guokun Li, Chuang Feng, and Pingping Liu. "Unsupervised Deep Relative Neighbor Relationship Preserving Cross-Modal Hashing." Mathematics 10, no. 15 (2022): 2644. http://dx.doi.org/10.3390/math10152644.

Abstract:
The image-text cross-modal retrieval task, which aims to retrieve the relevant image from text and vice versa, is now attracting widespread attention. To quickly respond to the large-scale task, we propose an Unsupervised Deep Relative Neighbor Relationship Preserving Cross-Modal Hashing (DRNPH) to achieve cross-modal retrieval in the common Hamming space, which has the advantages of storage and efficiency. To fulfill the nearest neighbor search in the Hamming space, we demand to reconstruct both the original intra- and inter-modal neighbor matrix according to the binary feature vectors. Thus,
29

Zou, Qiang, Shuli Cheng, Anyu Du, and Jiayi Chen. "Text-Enhanced Graph Attention Hashing for Cross-Modal Retrieval." Entropy 26, no. 11 (2024): 911. http://dx.doi.org/10.3390/e26110911.

Abstract:
Deep hashing technology, known for its low-cost storage and rapid retrieval, has become a focal point in cross-modal retrieval research as multimodal data continue to grow. However, existing supervised methods often overlook noisy labels and multiscale features in different modal datasets, leading to higher information entropy in the generated hash codes and features, which reduces retrieval performance. The variation in text annotation information across datasets further increases the information entropy during text feature extraction, resulting in suboptimal outcomes. Consequently, reducing
30

Li, Guokun, Zhen Wang, Shibo Xu, et al. "Deep Adversarial Learning Triplet Similarity Preserving Cross-Modal Retrieval Algorithm." Mathematics 10, no. 15 (2022): 2585. http://dx.doi.org/10.3390/math10152585.

Abstract:
The cross-modal retrieval task can return different modal nearest neighbors, such as image or text. However, inconsistent distribution and diverse representation make it hard to directly measure the similarity relationship between different modal samples, which causes a heterogeneity gap. To bridge the above-mentioned gap, we propose the deep adversarial learning triplet similarity preserving cross-modal retrieval algorithm to map different modal samples into the common space, allowing their feature representation to preserve both the original inter- and intra-modal semantic similarity relatio
31

Zou, Fuhao, Xingqiang Bai, Chaoyang Luan, Kai Li, Yunfei Wang, and Hefei Ling. "Semi-supervised cross-modal learning for cross modal retrieval and image annotation." World Wide Web 22, no. 2 (2018): 825–41. http://dx.doi.org/10.1007/s11280-018-0581-2.

32

Cai, Rui, Zhiyu Dong, Jianfeng Dong, and Xun Wang. "Dynamic Adapter with Semantics Disentangling for Cross-lingual Cross-modal Retrieval." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 2 (2025): 1907–16. https://doi.org/10.1609/aaai.v39i2.32186.

Abstract:
Existing cross-modal retrieval methods typically rely on large-scale vision-language pair data. This makes it challenging to efficiently develop a cross-modal retrieval model for under-resourced languages of interest. Therefore, Cross-lingual Cross-modal Retrieval (CCR), which aims to align vision and the low-resource language (the target language) without using any human-labeled target-language data, has gained increasing attention. As a general parameter-efficient way, a common solution is to utilize adapter modules to transfer the vision-language alignment ability of Vision-Language Pretrai
33

Bhatt, Nikita, and Amit Ganatra. "Improvement of deep cross-modal retrieval by generating real-valued representation." PeerJ Computer Science 7 (April 27, 2021): e491. http://dx.doi.org/10.7717/peerj-cs.491.

Abstract:
The cross-modal retrieval (CMR) has attracted much attention in the research community due to flexible and comprehensive retrieval. The core challenge in CMR is the heterogeneity gap, which is generated due to different statistical properties of multi-modal data. The most common solution to bridge the heterogeneity gap is representation learning, which generates a common sub-space. In this work, we propose a framework called “Improvement of Deep Cross-Modal Retrieval (IDCMR)”, which generates real-valued representation. The IDCMR preserves both intra-modal and inter-modal similarity. The intra
34

Zheng, Qibin, Xiaoguang Ren, Yi Liu, and Wei Qin. "Abstraction and Association: Cross-Modal Retrieval Based on Consistency between Semantic Structures." Mathematical Problems in Engineering 2020 (May 7, 2020): 1–17. http://dx.doi.org/10.1155/2020/2503137.

Abstract:
Cross-modal retrieval aims to find relevant data of different modalities, such as images and text. In order to bridge the modality gap, most existing methods require a lot of coupled sample pairs as training data. To reduce the demands for training data, we propose a cross-modal retrieval framework that utilizes both coupled and uncoupled samples. The framework consists of two parts: Abstraction that aims to provide high-level single-modal representations with uncoupled samples; then, Association links different modalities through a few coupled training samples. Moreover, under this framework,
35

Geigle, Gregor, Jonas Pfeiffer, Nils Reimers, Ivan Vulić, and Iryna Gurevych. "Retrieve Fast, Rerank Smart: Cooperative and Joint Approaches for Improved Cross-Modal Retrieval." Transactions of the Association for Computational Linguistics 10 (2022): 503–21. http://dx.doi.org/10.1162/tacl_a_00473.

Abstract:
Current state-of-the-art approaches to cross-modal retrieval process text and visual input jointly, relying on Transformer-based architectures with cross-attention mechanisms that attend over all words and objects in an image. While offering unmatched retrieval performance, such models: 1) are typically pretrained from scratch and thus less scalable, 2) suffer from huge retrieval latency and inefficiency issues, which makes them impractical in realistic applications. To address these crucial gaps towards both improved and efficient cross-modal retrieval, we propose a novel fine-tuni
36

Su, Chao, Huiming Zheng, Dezhong Peng, and Xu Wang. "DiCA: Disambiguated Contrastive Alignment for Cross-Modal Retrieval with Partial Labels." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 19 (2025): 20610–18. https://doi.org/10.1609/aaai.v39i19.34271.

Abstract:
Cross-modal retrieval aims to retrieve relevant data across different modalities. Driven by costly massive labeled data, existing cross-modal retrieval methods achieve encouraging results. To reduce annotation costs while maintaining performance, this paper focuses on an untouched but challenging problem, i.e., cross-modal retrieval with partial labels (PLCMR). PLCMR faces the dual challenges of annotation ambiguity and modality gap. To address these challenges, we propose a novel method termed disambiguated contrastive alignment (DiCA) for cross-modal retrieval with partial labels. Specifical
37

Yang, Fan, Zheng Wang, Jing Xiao, and Shin'ichi Satoh. "Mining on Heterogeneous Manifolds for Zero-Shot Cross-Modal Image Retrieval." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (2020): 12589–96. http://dx.doi.org/10.1609/aaai.v34i07.6949.

Abstract:
Most recent approaches for the zero-shot cross-modal image retrieval map images from different modalities into a uniform feature space to exploit their relevance by using a pre-trained model. Based on the observation that manifolds of zero-shot images are usually deformed and incomplete, we argue that the manifolds of unseen classes are inevitably distorted during the training of a two-stream model that simply maps images from different modalities into a uniform space. This issue directly leads to poor cross-modal retrieval performance. We propose a bi-directional random walk scheme to mining
38

Liu, Li, Xiao Dong, and Tianshi Wang. "Semi-Supervised Cross-Modal Retrieval Based on Discriminative Comapping." Complexity 2020 (July 18, 2020): 1–13. http://dx.doi.org/10.1155/2020/1462429.

Abstract:
Most cross-modal retrieval methods based on subspace learning just focus on learning the projection matrices that map different modalities to a common subspace and pay less attention to the retrieval task specificity and class information. To address the two limitations and make full use of unlabelled data, we propose a novel semi-supervised method for cross-modal retrieval named modal-related retrieval based on discriminative comapping (MRRDC). The projection matrices are obtained to map multimodal data into a common subspace for different tasks. In the process of projection matrix learning,
39

Choo, Yeon-Seung, Boeun Kim, Hyun-Sik Kim, and Yong-Suk Park. "Supervised Contrastive Learning for 3D Cross-Modal Retrieval." Applied Sciences 14, no. 22 (2024): 10322. http://dx.doi.org/10.3390/app142210322.

Abstract:
Interoperability between different virtual platforms requires the ability to search and transfer digital assets across platforms. Digital assets in virtual platforms are represented in different forms or modalities, such as images, meshes, and point clouds. The cross-modal retrieval of three-dimensional (3D) object representations is challenging due to data representation diversity, making common feature space discovery difficult. Recent studies have been focused on obtaining feature consistency within the same classes and modalities using cross-modal center loss. However, center features are
Style APA, Harvard, Vancouver, ISO itp.
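The cross-modal center loss mentioned in this abstract can be sketched generically (an illustrative formulation, not necessarily the authors' exact loss): embeddings from every modality are pulled toward a single per-class center shared across modalities.

```python
import numpy as np

# Generic sketch of a cross-modal center loss (illustrative only):
# embeddings from any modality are pulled toward a per-class center
# shared across all modalities, encouraging class-wise feature
# consistency in the common space.
def cross_modal_center_loss(embeddings, labels, centers):
    """Mean squared distance of each embedding to its class center.

    embeddings: (n, d) features from one modality (image, mesh, ...)
    labels:     (n,)   integer class ids
    centers:    (c, d) class centers shared by all modalities
    """
    diffs = embeddings - centers[labels]
    return float(np.mean(np.sum(diffs ** 2, axis=1)))

rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 8))            # 3 classes, 8-dim space
labels = np.array([0, 1, 2, 0])
perfect = centers[labels]                    # embeddings exactly at centers
noisy = perfect + 0.1 * rng.normal(size=perfect.shape)
loss_perfect = cross_modal_center_loss(perfect, labels, centers)  # 0.0
loss_noisy = cross_modal_center_loss(noisy, labels, centers)      # > 0
```

In practice the centers are learnable parameters updated jointly with the encoders; here they are fixed only to keep the sketch self-contained.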
40

Wang, Yabing, Fan Wang, Jianfeng Dong, and Hao Luo. "CL2CM: Improving Cross-Lingual Cross-Modal Retrieval via Cross-Lingual Knowledge Transfer." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 6 (2024): 5651–59. http://dx.doi.org/10.1609/aaai.v38i6.28376.

Abstract:
Cross-lingual cross-modal retrieval has recently garnered increasing attention; it aims to align vision and a target language (V-T) without using any annotated V-T data pairs. Current methods employ machine translation (MT) to construct pseudo-parallel data pairs, which are then used to learn a multilingual, multi-modal embedding space that aligns visual and target-language representations. However, the large heterogeneous gap between vision and text, along with the noise present in target-language translations, poses significant challenges for effectively aligning […]
41

Zhang, Guihao, and Jiangzhong Cao. "Feature Fusion Based on Transformer for Cross-modal Retrieval." Journal of Physics: Conference Series 2558, no. 1 (2023): 012012. http://dx.doi.org/10.1088/1742-6596/2558/1/012012.

Abstract:
With the popularity of the Internet and the rapid growth of multimodal data, multimodal retrieval has gradually become a hot research area. As an important branch of multimodal retrieval, image-text retrieval aims to design a model that learns and aligns two modalities, image and text, building a bridge of semantic association between these two kinds of heterogeneous data so as to achieve unified alignment and retrieval. Current mainstream image-text cross-modal retrieval approaches have made good progress by designing deep-learning-based models to find potential associations […]
42

Guo, Jiaen, Haibin Wang, Bo Dan, and Yu Lu. "Deep Supervised Cross-modal Hashing for Ship Image Retrieval." Journal of Physics: Conference Series 2320, no. 1 (2022): 012023. http://dx.doi.org/10.1088/1742-6596/2320/1/012023.

Abstract:
The retrieval of multimodal ship images obtained by remote-sensing satellites is an important part of remote-sensing data analysis and is of great significance for improving marine monitoring capabilities. In this paper, we propose a novel cross-modal ship-image retrieval method called Deep Supervised Cross-modal Hashing (DSCMH). It consists of a feature-learning part and a hash-learning part, used for feature extraction and hash-code generation respectively; both parts have modality-invariant constraints to preserve cross-modal invariance, and the label information is also […]
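The hash-learning step this abstract refers to can be illustrated with a minimal generic sketch (not DSCMH itself): real-valued features from each modality are projected and binarized with sign() so that retrieval reduces to Hamming-distance comparison of compact codes. The shared projection matrix here is random purely for illustration; in a real method it is learned.

```python
import numpy as np

# Minimal sketch of the hashing step in cross-modal hashing:
# project features, binarize with sign(), retrieve by Hamming distance.
rng = np.random.default_rng(1)
n, d, bits = 50, 128, 16

feats_img = rng.normal(size=(n, d))   # stand-ins for learned image features
feats_txt = rng.normal(size=(n, d))   # stand-ins for learned text features
P = rng.normal(size=(d, bits))        # shared projection (learned in practice)

def hash_codes(features, projection):
    """Project then binarize into +1/-1 codes of length `bits`."""
    return np.sign(features @ projection).astype(np.int8)

H_img = hash_codes(feats_img, P)
H_txt = hash_codes(feats_txt, P)

def hamming(a, b):
    """Hamming distance between two +/-1 code vectors."""
    return int(np.sum(a != b))

# Retrieve images for a text query by smallest Hamming distance.
query = H_txt[0]
dists = [hamming(query, h) for h in H_img]
best = int(np.argmin(dists))
```

The appeal of hashing for large-scale retrieval is exactly this last step: comparing short binary codes is far cheaper than comparing dense float vectors.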
43

He, Yanzhong, Yanjiao Zhang, and Lin Zhu. "Improving Chinese cross-modal retrieval with multi-modal transportation data." Journal of Physics: Conference Series 2813, no. 1 (2024): 012014. http://dx.doi.org/10.1088/1742-6596/2813/1/012014.

Abstract:
As societal development progresses and individual travel needs evolve, the demand for multi-modal transportation-status information has steadily increased. The transportation-status domain encompasses a wealth of multi-modal information, including vehicle trajectory data, visual imagery of traffic conditions, and textual information. Acquiring multi-modal transportation-status information facilitates a rapid understanding of the prevailing traffic conditions at a given location. In this study, we investigate multi-modal transportation-status data encompassing trajectory […]
44

Huang, Xin, Yuxin Peng, and Mingkuan Yuan. "MHTN: Modal-Adversarial Hybrid Transfer Network for Cross-Modal Retrieval." IEEE Transactions on Cybernetics 50, no. 3 (2020): 1047–59. http://dx.doi.org/10.1109/tcyb.2018.2879846.

45

Zheng, Fuzhong, Weipeng Li, Xu Wang, Luyao Wang, Xiong Zhang, and Haisu Zhang. "A Cross-Attention Mechanism Based on Regional-Level Semantic Features of Images for Cross-Modal Text-Image Retrieval in Remote Sensing." Applied Sciences 12, no. 23 (2022): 12221. http://dx.doi.org/10.3390/app122312221.

Abstract:
With the rapid development of remote-sensing (RS) observation technology in recent years, cross-modal retrieval of RS images based on high-level semantic associations has drawn some attention. However, few existing studies on cross-modal retrieval of RS images have addressed the mutual interference between semantic features of images caused by “multi-scene semantics”. We therefore propose a novel cross-attention (CA) model, called CABIR, based on regional-level semantic features of RS images for cross-modal text-image retrieval. This technique utilizes the CA mechanism to implement […]
46

刘, 志虎. "Label Consistency Hashing for Cross-Modal Retrieval." Computer Science and Application 11, no. 04 (2021): 1104–12. http://dx.doi.org/10.12677/csa.2021.114114.

47

Jiang, Zining, Zhenyu Weng, Runhao Li, Huiping Zhuang, and Zhiping Lin. "Online weighted hashing for cross-modal retrieval." Pattern Recognition 161 (May 2025): 111232. https://doi.org/10.1016/j.patcog.2024.111232.

48

Gou, Tingting, Libo Liu, Qian Liu, and Zhen Deng. "A New Approach to Cross-Modal Retrieval." Journal of Physics: Conference Series 1288 (August 2019): 012044. http://dx.doi.org/10.1088/1742-6596/1288/1/012044.

49

Hu, Peng, Dezhong Peng, Xu Wang, and Yong Xiang. "Multimodal adversarial network for cross-modal retrieval." Knowledge-Based Systems 180 (September 2019): 38–50. http://dx.doi.org/10.1016/j.knosys.2019.05.017.

50

Cao, Wenming, Qiubin Lin, Zhihai He, and Zhiquan He. "Hybrid representation learning for cross-modal retrieval." Neurocomputing 345 (June 2019): 45–57. http://dx.doi.org/10.1016/j.neucom.2018.10.082.
