Academic literature on the topic 'Multi-modal image translation'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Multi-modal image translation.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Journal articles on the topic "Multi-modal image translation"

1

Yang, Pengcheng, Boxing Chen, Pei Zhang, and Xu Sun. "Visual Agreement Regularized Training for Multi-Modal Machine Translation." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 5 (2020): 9418–25. http://dx.doi.org/10.1609/aaai.v34i05.6484.

Abstract:
Multi-modal machine translation aims at translating the source sentence into a different language in the presence of the paired image. Previous work suggests that additional visual information only provides dispensable help to translation, which is needed in several very special cases such as translating ambiguous words. To make better use of visual information, this work presents visual agreement regularized training. The proposed approach jointly trains the source-to-target and target-to-source translation models and encourages them to share the same focus on the visual information when generating …
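The agreement idea in this abstract lends itself to a compact illustration. Below is a minimal, generic sketch (our own, not the authors' code) of a symmetric-KL regularizer that pulls the image-attention distributions of a source-to-target and a target-to-source model toward each other; all tensor shapes and names are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def attention_agreement_loss(attn_fwd: torch.Tensor,
                             attn_bwd: torch.Tensor) -> torch.Tensor:
    """Symmetric KL divergence between two attention distributions over
    the same image regions; both tensors are (batch, num_regions) and
    already sum to 1 along the last dimension."""
    kl_fb = F.kl_div(attn_bwd.log(), attn_fwd, reduction="batchmean")
    kl_bf = F.kl_div(attn_fwd.log(), attn_bwd, reduction="batchmean")
    return 0.5 * (kl_fb + kl_bf)

# Toy usage: both directions attend over 49 regions of the paired image.
a_src2tgt = torch.softmax(torch.randn(8, 49), dim=-1)
a_tgt2src = torch.softmax(torch.randn(8, 49), dim=-1)
print(attention_agreement_loss(a_src2tgt, a_tgt2src).item())
# During training, such a term would be added to both translation losses.
```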
2

Kaur, Jagroop, and Gurpreet Singh Josan. "English to Hindi Multi Modal Image Caption Translation." Journal of Scientific Research 64, no. 2 (2020): 274–81. http://dx.doi.org/10.37398/jsr.2020.640238.

3

Guo, Xiaobin. "Image Visual Attention Mechanism-based Global and Local Semantic Information Fusion for Multi-modal English Machine Translation." 電腦學刊 (Journal of Computers) 33, no. 2 (2022): 37–50. http://dx.doi.org/10.53106/199115992022043302004.

Abstract:
Machine translation is a hot research topic at present. Traditional machine translation methods are not effective because they require a large number of training samples. Image visual semantic information can improve the effect of the text machine translation model. Most of the existing works fuse the whole image's visual semantic information into the translation model, but the image may contain different semantic objects. These different local semantic objects have different effects on the word prediction of the decoder. Therefore, this paper proposes a multi-modal machine translation …
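To make the global/local fusion idea concrete, here is a small illustrative PyTorch module (not the paper's implementation) that attends over local region features with the decoder state and fuses the result with a global image vector; dimensions and layer choices are assumptions.

```python
import torch
import torch.nn as nn

class GlobalLocalFusion(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Linear(dim, dim)     # project decoder state to a query
        self.fuse = nn.Linear(3 * dim, dim)  # combine state, global, local

    def forward(self, dec_state, global_feat, region_feats):
        # dec_state: (B, D), global_feat: (B, D), region_feats: (B, R, D)
        q = self.query(dec_state).unsqueeze(1)               # (B, 1, D)
        scores = (q * region_feats).sum(-1)                  # (B, R)
        attn = torch.softmax(scores, dim=-1).unsqueeze(-1)   # (B, R, 1)
        local = (attn * region_feats).sum(1)                 # (B, D)
        # Fused vector would feed the decoder's word-prediction layer.
        return torch.tanh(self.fuse(
            torch.cat([dec_state, global_feat, local], dim=-1)))

fusion = GlobalLocalFusion(dim=512)
out = fusion(torch.randn(4, 512), torch.randn(4, 512), torch.randn(4, 49, 512))
print(out.shape)  # torch.Size([4, 512])
```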
4

Shi, Xiayang, Jiaqi Yuan, Yuanyuan Huang, Zhenqiang Yu, Pei Cheng, and Xinyi Liu. "Reference Context Guided Vector to Achieve Multimodal Machine Translation." Journal of Physics: Conference Series 2171, no. 1 (2022): 012076. http://dx.doi.org/10.1088/1742-6596/2171/1/012076.

Abstract:
Traditional machine translation mainly realizes the introduction of static images from other modal information to improve translation quality. In processing, a variety of methods are combined to improve the data and features, so that the translation result is close to the upper limit, and some even need to rely on the sensitivity of the sample distance algorithm to the data. At the same time, multi-modal MT will cause problems such as lack of semantic interaction in the attention mechanism in the same corpus, or excessive encoding of the same text image information and corpus irrelevance …
5

Calixto, Iacer, and Qun Liu. "An Error Analysis for Image-Based Multi-Modal Neural Machine Translation." Machine Translation 33, nos. 1–2 (2019): 155–77. http://dx.doi.org/10.1007/s10590-019-09226-9.

6

Gómez, Jose L., Gabriel Villalonga, and Antonio M. López. "Co-Training for Deep Object Detection: Comparing Single-Modal and Multi-Modal Approaches." Sensors 21, no. 9 (2021): 3185. http://dx.doi.org/10.3390/s21093185.

Abstract:
Top-performing computer vision models are powered by convolutional neural networks (CNNs). Training an accurate CNN highly depends on both the raw sensor data and their associated ground truth (GT). Collecting such GT is usually done through human labeling, which is time-consuming and does not scale as we wish. This data-labeling bottleneck may be intensified due to domain shifts among image sensors, which could force per-sensor data labeling. In this paper, we focus on the use of co-training, a semi-supervised learning (SSL) method, for obtaining self-labeled object bounding boxes (BBs), i.e. …
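As a rough picture of co-training in general (not the authors' detection pipeline), the following stub-based loop shows two models exchanging confident pseudo-labels on an unlabeled pool; the detector class, threshold, and round count are placeholders.

```python
import random

class StubDetector:
    """Stand-in for a real object detector; returns (label, confidence)."""
    def fit(self, data):
        pass  # a real detector would train on (sample, label) pairs here
    def predict(self, x):
        return ("object", random.random())

def co_train(model_a, model_b, labeled, unlabeled, rounds=3, thresh=0.8):
    extra_a, extra_b = [], []   # pseudo-labels each model receives
    pool = list(unlabeled)
    for _ in range(rounds):
        model_a.fit(labeled + extra_a)
        model_b.fit(labeled + extra_b)
        remaining = []
        for x in pool:
            lab_a, conf_a = model_a.predict(x)
            lab_b, conf_b = model_b.predict(x)
            if conf_a >= thresh:
                extra_b.append((x, lab_a))   # A labels confidently for B
            elif conf_b >= thresh:
                extra_a.append((x, lab_b))   # B labels confidently for A
            else:
                remaining.append(x)          # stays unlabeled for now
        pool = remaining
    return model_a, model_b

co_train(StubDetector(), StubDetector(),
         labeled=[("img0", "object")], unlabeled=["img1", "img2", "img3"])
```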
7

Rodrigues, Ana, Bruna Sousa, Amílcar Cardoso, and Penousal Machado. "“Found in Translation”: An Evolutionary Framework for Auditory–Visual Relationships." Entropy 24, no. 12 (2022): 1706. http://dx.doi.org/10.3390/e24121706.

Abstract:
The development of computational artifacts to study cross-modal associations has been a growing research topic, as they allow new degrees of abstraction. In this context, we propose a novel approach to the computational exploration of relationships between music and abstract images, grounded by findings from cognitive sciences (emotion and perception). Due to the problem's high-level nature, we rely on evolutionary programming techniques to evolve this audio–visual dialogue. To articulate the complexity of the problem, we develop a framework with four modules: (i) vocabulary set, (ii) music generation …
8

Lu, Chien-Yu, Min-Xin Xue, Chia-Che Chang, Che-Rung Lee, and Li Su. "Play as You Like: Timbre-Enhanced Multi-Modal Music Style Transfer." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 1061–68. http://dx.doi.org/10.1609/aaai.v33i01.33011061.

Abstract:
Style transfer of polyphonic music recordings is a challenging task when considering the modeling of diverse, imaginative, and reasonable music pieces in the style different from their original one. To achieve this, learning stable multi-modal representations for both domain-variant (i.e., style) and domain-invariant (i.e., content) information of music in an unsupervised manner is critical. In this paper, we propose an unsupervised music style transfer method without the need for parallel data. Besides, to characterize the multi-modal distribution of music pieces, we employ the Multi-modal Unsupervised Image-to-Image Translation (MUNIT) …
9

Islam, Kh Tohidul, Sudanthi Wijewickrema, and Stephen O’Leary. "A rotation and translation invariant method for 3D organ image classification using deep convolutional neural networks." PeerJ Computer Science 5 (March 4, 2019): e181. http://dx.doi.org/10.7717/peerj-cs.181.

Abstract:
Three-dimensional (3D) medical image classification is useful in applications such as disease diagnosis and content-based medical image retrieval. It is a challenging task due to several reasons. First, image intensity values are vastly different depending on the image modality. Second, intensity values within the same image modality may vary depending on the imaging machine and artifacts may also be introduced in the imaging process. Third, processing 3D data requires high computational power. In recent years, significant research has been conducted in the field of 3D medical image classification …

Dissertations / Theses on the topic "Multi-modal image translation"

1

Liu, Yahui. "Exploring Multi-Domain and Multi-Modal Representations for Unsupervised Image-to-Image Translation." Doctoral thesis, Università degli studi di Trento, 2022. http://hdl.handle.net/11572/342634.

Abstract:
Unsupervised image-to-image translation (UNIT) is a challenging task in the image manipulation field, where input images in a visual domain are mapped into another domain with desired visual patterns (also called styles). An ideal direction in this field is to build a model that can map an input image in a domain to multiple target domains and generate diverse outputs in each target domain, which is termed as multi-domain and multi-modal unsupervised image-to-image translation (MMUIT). Recent studies have shown remarkable results in UNIT but they suffer from four main limitations: (1) State-of-the-art …
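The MMUIT setting described here is easy to caricature in code: diversity comes from recombining one content code with freshly sampled style codes. The toy model below is only a shape-level sketch under that assumption, not the thesis architecture; the layers and style dimension are illustrative.

```python
import torch
import torch.nn as nn

class TinyTranslator(nn.Module):
    def __init__(self, style_dim: int = 8):
        super().__init__()
        self.style_dim = style_dim
        self.content_enc = nn.Conv2d(3, 16, 3, padding=1)        # content code
        self.decoder = nn.Conv2d(16 + style_dim, 3, 3, padding=1)

    def forward(self, image: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
        c = self.content_enc(image)                               # (B, 16, H, W)
        # Broadcast the style code over the spatial grid and decode.
        s = style.view(-1, self.style_dim, 1, 1).expand(-1, -1, *c.shape[2:])
        return torch.tanh(self.decoder(torch.cat([c, s], dim=1)))

model = TinyTranslator()
x = torch.randn(1, 3, 64, 64)
# Sampling different style codes yields different translations of one input.
outputs = [model(x, torch.randn(1, 8)) for _ in range(3)]
print([o.shape for o in outputs])
```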

Book chapters on the topic "Multi-modal image translation"

1

Gobeill, Julien, Henning Müller, and Patrick Ruch. "Translation by Text Categorisation: Medical Image Retrieval in ImageCLEFmed 2006." In Evaluation of Multilingual and Multi-modal Information Retrieval. Springer Berlin Heidelberg, 2007. http://dx.doi.org/10.1007/978-3-540-74999-8_88.

2

Ren, Mengwei, Heejong Kim, Neel Dey, and Guido Gerig. "Q-space Conditioned Translation Networks for Directional Synthesis of Diffusion Weighted Images from Multi-modal Structural MRI." In Medical Image Computing and Computer Assisted Intervention – MICCAI 2021. Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-87234-2_50.


Conference papers on the topic "Multi-modal image translation"

1

Chen, Zekang, Jia Wei, and Rui Li. "Unsupervised Multi-Modal Medical Image Registration via Discriminator-Free Image-to-Image Translation." In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22). International Joint Conferences on Artificial Intelligence Organization, 2022. http://dx.doi.org/10.24963/ijcai.2022/117.

Abstract:
In clinical practice, well-aligned multi-modal images, such as Magnetic Resonance (MR) and Computed Tomography (CT), together can provide complementary information for image-guided therapies. Multi-modal image registration is essential for the accurate alignment of these multi-modal images. However, it remains a very challenging task due to complicated and unknown spatial correspondence between different modalities. In this paper, we propose a novel translation-based unsupervised deformable image registration approach to convert the multi-modal registration problem to a mono-modal one. Specifically …
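The translate-then-register recipe this abstract outlines can be shown in a few lines. The snippet below uses toy stand-in networks (our assumption, not the paper's models) to illustrate the control flow: map CT into an MR-like image, then run mono-modal registration with an ordinary intensity loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

translator = nn.Conv2d(1, 1, 3, padding=1)  # stand-in CT -> MR-like mapper
reg_net = nn.Conv2d(2, 2, 3, padding=1)     # stand-in 2D displacement predictor

ct = torch.randn(1, 1, 64, 64)  # moving image (CT)
mr = torch.randn(1, 1, 64, 64)  # fixed image (MR)

fake_mr = translator(ct)                          # step 1: modality translation
flow = reg_net(torch.cat([fake_mr, mr], dim=1))   # step 2: mono-modal registration
# Warp the moving image with the predicted displacement field.
grid = F.affine_grid(torch.eye(2, 3).unsqueeze(0), ct.shape, align_corners=False)
warped = F.grid_sample(ct, grid + flow.permute(0, 2, 3, 1), align_corners=False)
loss = F.l1_loss(warped, mr)  # mono-modal similarity is now meaningful
print(loss.item())
```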
2

Vishnu Kumar, V. H., and N. Lalithamani. "English to Tamil Multi-Modal Image Captioning Translation." In 2022 IEEE World Conference on Applied Intelligence and Computing (AIC). IEEE, 2022. http://dx.doi.org/10.1109/aic55036.2022.9848810.

3

Huang, Ping, Shiliang Sun, and Hao Yang. "Image-Assisted Transformer in Zero-Resource Multi-Modal Translation." In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021. http://dx.doi.org/10.1109/icassp39728.2021.9413389.

4

Arar, Moab, Yiftach Ginger, Dov Danon, Amit H. Bermano, and Daniel Cohen-Or. "Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation." In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2020. http://dx.doi.org/10.1109/cvpr42600.2020.01342.

5

Laskar, Sahinur Rahman, Rohit Pratap Singh, Partha Pakray, and Sivaji Bandyopadhyay. "English to Hindi Multi-modal Neural Machine Translation and Hindi Image Captioning." In Proceedings of the 6th Workshop on Asian Translation. Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/d19-5205.

6

Cortinhal, Tiago, Fatih Kurnaz, and Eren Erdal Aksoy. "Semantics-aware Multi-modal Domain Translation: From LiDAR Point Clouds to Panoramic Color Images." In 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). IEEE, 2021. http://dx.doi.org/10.1109/iccvw54120.2021.00338.
