Journal articles on the topic 'Pre-training corpora'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 journal articles for your research on the topic 'Pre-training corpora.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.
Browse journal articles in a wide variety of disciplines and organise your bibliography correctly.
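The style differences the button handles for you are mechanical: the same metadata is rearranged and re-punctuated according to each style guide. The short Python sketch below illustrates this for APA and for the Chicago style used in the list that follows; the record dict and both helper functions are hypothetical examples, not the code behind this site.

# Minimal illustrative sketch: one metadata record, two citation styles.
# The record and helpers are hypothetical, not this site's actual code.

record = {
    "authors": [("Sun", "Yu"), ("Wang", "Shuohuan"), ("Li", "Yukun")],
    "year": 2020,
    "title": "ERNIE 2.0: A Continual Pre-Training Framework for Language Understanding",
    "container": "Proceedings of the AAAI Conference on Artificial Intelligence",
    "volume": 34, "issue": "05", "pages": "8968–75",
    "doi": "10.1609/aaai.v34i05.6428",
}

def apa(r):
    # APA 7: every name inverted to initials, ampersand before the last author.
    inv = [f"{fam}, {giv[0]}." for fam, giv in r["authors"]]
    names = inv[0] if len(inv) == 1 else ", ".join(inv[:-1]) + f", & {inv[-1]}"
    return (f"{names} ({r['year']}). {r['title']}. {r['container']}, "
            f"{r['volume']}({r['issue']}), {r['pages']}. https://doi.org/{r['doi']}")

def chicago(r):
    # Chicago bibliography style: only the first author's name is inverted.
    (fam0, giv0), *rest = r["authors"]
    others = [f"{giv} {fam}" for fam, giv in rest]
    if not others:
        names = f"{fam0}, {giv0}"
    elif len(others) == 1:
        names = f"{fam0}, {giv0}, and {others[0]}"
    else:
        names = f"{fam0}, {giv0}, " + ", ".join(others[:-1]) + f", and {others[-1]}"
    return (f'{names}. "{r["title"]}." {r["container"]} {r["volume"]}, '
            f'no. {r["issue"]} ({r["year"]}): {r["pages"]}. '
            f'https://doi.org/{r["doi"]}.')

print(apa(record))
print(chicago(record))

Running the sketch prints the same article once per style; real citation software (for example, a CSL processor) additionally handles edge cases such as 'et al.' truncation and name particles.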
Sun, Yu, Shuohuan Wang, Yukun Li, et al. "ERNIE 2.0: A Continual Pre-Training Framework for Language Understanding." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (2020): 8968–75. http://dx.doi.org/10.1609/aaai.v34i05.6428.
Moodaley, Wayne, and Arnesh Telukdarie. "A Conceptual Framework for Subdomain Specific Pre-Training of Large Language Models for Green Claim Detection." European Journal of Sustainable Development 12, no. 4 (2023): 319. http://dx.doi.org/10.14207/ejsd.2023.v12n4p319.
Hussain, Rida Ghafoor. "RiskBERT: A Pre-Trained Insurance-Based Language Model for Text Classification." International Journal of Innovative Technology and Exploring Engineering 14, no. 7 (2025): 12–18. https://doi.org/10.35940/ijitee.f1097.14070625.
Liu, Yinhan, Jiatao Gu, Naman Goyal, et al. "Multilingual Denoising Pre-training for Neural Machine Translation." Transactions of the Association for Computational Linguistics 8 (November 2020): 726–42. http://dx.doi.org/10.1162/tacl_a_00343.
Dean, Roger Thornton, and Marcus Thomas Pearce. "Algorithmically-generated Corpora that use Serial Compositional Principles Can Contribute to the Modeling of Sequential Pitch Structure in Non-tonal Music." Empirical Musicology Review 11, no. 1 (2016): 27. http://dx.doi.org/10.18061/emr.v11i1.4900.
Kreutzer, Julia, Isaac Caswell, Lisa Wang, et al. "Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets." Transactions of the Association for Computational Linguistics 10 (2022): 50–72. http://dx.doi.org/10.1162/tacl_a_00447.
Yuan, Sha, Hanyu Zhao, Zhengxiao Du, et al. "WuDaoCorpora: A super large-scale Chinese corpora for pre-training language models." AI Open 2 (2021): 65–68. http://dx.doi.org/10.1016/j.aiopen.2021.06.001.
Qian, Jing, Yong Yue, Katie Atkinson, and Gangmin Li. "Understanding Chinese Moral Stories with Further Pre-Training." International Journal on Natural Language Computing 12, no. 2 (2023): 01–12. http://dx.doi.org/10.5121/ijnlc.2023.12201.
Chukhno, Olena, and Nataliia Tuchyna. "OVERCOMING DIFFICULTIES IN USING LINGUISTIC CORPORA FOR TEACHING ENGLISH TO PRE-SERVICE TEACHERS." Education. Innovation. Practice 12, no. 7 (2024): 91–105. http://dx.doi.org/10.31110/2616-650x-vol12i7-014.
Jiang, Xiaoze, Yaobo Liang, Weizhu Chen, and Nan Duan. "XLM-K: Improving Cross-Lingual Language Model Pre-training with Multilingual Knowledge." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 10 (2022): 10840–48. http://dx.doi.org/10.1609/aaai.v36i10.21330.
Kajiwara, Tomoyuki, Biwa Miura, and Yuki Arase. "Monolingual Transfer Learning via Bilingual Translators for Style-Sensitive Paraphrase Generation." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (2020): 8042–49. http://dx.doi.org/10.1609/aaai.v34i05.6314.
Shen, Huawen, Gengluo Li, Jinwen Zhong, and Yu Zhou. "LDP: Generalizing to Multilingual Visual Information Extraction by Language Decoupled Pretraining." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 7 (2025): 6805–13. https://doi.org/10.1609/aaai.v39i7.32730.
Alruwaili, Awatif. "An online training course on the use of corpora for teachers in public schools." JALT CALL Journal 19, no. 1 (2023): 53–70. http://dx.doi.org/10.29140/jaltcall.v19n1.675.
Shi, Peng, Patrick Ng, Zhiguo Wang, et al. "Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 15 (2021): 13806–14. http://dx.doi.org/10.1609/aaai.v35i15.17627.
Kryeziu, Labehat, and Visar Shehu. "Pre-Training MLM Using Bert for the Albanian Language." SEEU Review 18, no. 1 (2023): 52–62. http://dx.doi.org/10.2478/seeur-2023-0035.
Li, Zhen, Dan Qu, Chaojie Xie, Wenlin Zhang, and Yanxia Li. "Language Model Pre-training Method in Machine Translation Based on Named Entity Recognition." International Journal on Artificial Intelligence Tools 29, nos. 7–8 (2020): 2040021. http://dx.doi.org/10.1142/s0218213020400217.
Liu, Peng, Lemei Zhang, and Jon Atle Gulla. "Pre-train, Prompt, and Recommendation: A Comprehensive Survey of Language Modeling Paradigm Adaptations in Recommender Systems." Transactions of the Association for Computational Linguistics 11 (2023): 1553–71. http://dx.doi.org/10.1162/tacl_a_00619.
Luo, Da, Yanglei Gan, Rui Hou, et al. "Synergistic Anchored Contrastive Pre-training for Few-Shot Relation Extraction." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (2024): 18742–50. http://dx.doi.org/10.1609/aaai.v38i17.29838.
Sprugnoli, Rachele, Giovanni Moretti, and Marco Passarotti. "Building and Comparing Lemma Embeddings for Latin. Classical Latin versus Thomas Aquinas." Italian Journal of Computational Linguistics 6, no. 1 (2021): 29–45. https://doi.org/10.5281/zenodo.4618000.
Maruyama, Takumi, and Kazuhide Yamamoto. "Extremely Low-Resource Text Simplification with Pre-trained Transformer Language Model." International Journal of Asian Language Processing 30, no. 01 (2020): 2050001. http://dx.doi.org/10.1142/s2717554520500010.
Zheng, Yinhe, Rongsheng Zhang, Minlie Huang, and Xiaoxi Mao. "A Pre-Training Based Personalized Dialogue Generation Model with Persona-Sparse Data." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (2020): 9693–700. http://dx.doi.org/10.1609/aaai.v34i05.6518.
Mao, Zhuoyuan, Chenhui Chu, and Sadao Kurohashi. "Linguistically Driven Multi-Task Pre-Training for Low-Resource Neural Machine Translation." ACM Transactions on Asian and Low-Resource Language Information Processing 21, no. 4 (2022): 1–29. http://dx.doi.org/10.1145/3491065.
Kim, Svetlana, and Yuchae Jung. "Elevating Clinical Semantics: Contrastive Pre-Training Beyond Cross-Entropy in Discharge Summaries." Applied Sciences 15, no. 12 (2025): 6541. https://doi.org/10.3390/app15126541.
Ahn, Youngdo, Sangwook Han, Seonggyu Lee, and Jong Won Shin. "Speech Emotion Recognition Incorporating Relative Difficulty and Labeling Reliability." Sensors 24, no. 13 (2024): 4111. http://dx.doi.org/10.3390/s24134111.
Ai, Xi, and Bin Fang. "Empirical Regularization for Synthetic Sentence Pairs in Unsupervised Neural Machine Translation." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 14 (2021): 12471–79. http://dx.doi.org/10.1609/aaai.v35i14.17479.
Zhu, Quan, Xiaoyin Wang, Xuan Liu, Wanru Du, and Xingxing Ding. "Multi-task learning for aspect level semantic classification combining complex aspect target semantic enhancement and adaptive local focus." Mathematical Biosciences and Engineering 20, no. 10 (2023): 18566–91. http://dx.doi.org/10.3934/mbe.2023824.
Siddhant, Aditya, Anuj Goyal, and Angeliki Metallinou. "Unsupervised Transfer Learning for Spoken Language Understanding in Intelligent Agents." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 4959–66. http://dx.doi.org/10.1609/aaai.v33i01.33014959.
Fromont, Robert, and Kevin Watson. "Factors influencing automatic segmental alignment of sociophonetic corpora." Corpora 11, no. 3 (2016): 401–31. http://dx.doi.org/10.3366/cor.2016.0101.
Gao, Yunfan, Yun Xiong, Siqi Wang, and Haofen Wang. "GeoBERT: Pre-Training Geospatial Representation Learning on Point-of-Interest." Applied Sciences 12, no. 24 (2022): 12942. http://dx.doi.org/10.3390/app122412942.
Chiang, Cheng-Han, and Hung-yi Lee. "On the Transferability of Pre-trained Language Models: A Study from Artificial Datasets." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 10 (2022): 10518–25. http://dx.doi.org/10.1609/aaai.v36i10.21295.
Karimzadeh, Morteza, and Alan MacEachren. "GeoAnnotator: A Collaborative Semi-Automatic Platform for Constructing Geo-Annotated Text Corpora." ISPRS International Journal of Geo-Information 8, no. 4 (2019): 161. http://dx.doi.org/10.3390/ijgi8040161.
Li, Yucheng, Frank Guerin, and Chenghua Lin. "LatestEval: Addressing Data Contamination in Language Model Evaluation through Dynamic and Time-Sensitive Test Construction." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (2024): 18600–18607. http://dx.doi.org/10.1609/aaai.v38i17.29822.
Bae, Jae Kwon. "A Study on Application of the Artificial Intelligence-Based Pre-trained Language Model." Academic Society of Global Business Administration 21, no. 2 (2024): 64–83. http://dx.doi.org/10.38115/asgba.2024.21.2.64.
Fang, Liuqin, Qing Ma, and Jiahao Yan. "The effectiveness of corpus-based training on collocation use in L2 writing for Chinese senior secondary school students." Journal of China Computer-Assisted Language Learning 1, no. 1 (2021): 80–109. http://dx.doi.org/10.1515/jccall-2021-2004.
Yuan, Ling, Chenglong Zeng, and Peng Pan. "Research of Chinese Entity Recognition Model Based on Multi-Feature Semantic Enhancement." Electronics 13, no. 24 (2024): 4895. https://doi.org/10.3390/electronics13244895.
Kang, Yu, Tianqiao Liu, Hang Li, Yang Hao, and Wenbiao Ding. "Self-Supervised Audio-and-Text Pre-training with Extremely Low-Resource Parallel Data." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 10 (2022): 10875–83. http://dx.doi.org/10.1609/aaai.v36i10.21334.
He, Wanwei, Yinpei Dai, Yinhe Zheng, et al. "GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-supervised Learning and Explicit Policy Injection." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 10 (2022): 10749–57. http://dx.doi.org/10.1609/aaai.v36i10.21320.
Garrido-Muñoz, Ismael, Arturo Montejo-Ráez, Fernando Martínez-Santiago, and L. Alfonso Ureña-López. "A Survey on Bias in Deep NLP." Applied Sciences 11, no. 7 (2021): 3184. http://dx.doi.org/10.3390/app11073184.
Perkowski, Ernest, Rui Pan, Tuan Dung Nguyen, et al. "AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets." Research Notes of the AAS 8, no. 1 (2024): 7. http://dx.doi.org/10.3847/2515-5172/ad1abe.
Pota, Marco, Mirko Ventura, Rosario Catelli, and Massimo Esposito. "An Effective BERT-Based Pipeline for Twitter Sentiment Analysis: A Case Study in Italian." Sensors 21, no. 1 (2020): 133. http://dx.doi.org/10.3390/s21010133.
Wang, Ke, Xiutian Zhao, and Wei Peng. "Learning from Failure: Improving Meeting Summarization without Good Samples." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (2024): 19153–61. http://dx.doi.org/10.1609/aaai.v38i17.29883.
González-Docasal, Ander, and Aitor Álvarez. "Enhancing Voice Cloning Quality through Data Selection and Alignment-Based Metrics." Applied Sciences 13, no. 14 (2023): 8049. http://dx.doi.org/10.3390/app13148049.
Vu, Dang Thanh, Gwanghyun Yu, Chilwoo Lee, and Jinyoung Kim. "Text Data Augmentation for the Korean Language." Applied Sciences 12, no. 7 (2022): 3425. http://dx.doi.org/10.3390/app12073425.
Qi, Kunxun, and Jianfeng Du. "Translation-Based Matching Adversarial Network for Cross-Lingual Natural Language Inference." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (2020): 8632–39. http://dx.doi.org/10.1609/aaai.v34i05.6387.
Li, Lei, Yongfeng Zhang, and Li Chen. "Personalized Prompt Learning for Explainable Recommendation." ACM Transactions on Information Systems 41, no. 4 (2023): 1–26. http://dx.doi.org/10.1145/3580488.
Yang, Tiancheng, Ilia Sucholutsky, Kuang-Yu Jen, and Matthias Schonlau. "exKidneyBERT: a language model for kidney transplant pathology reports and the crucial role of extended vocabularies." PeerJ Computer Science 10 (February 28, 2024): e1888. http://dx.doi.org/10.7717/peerj-cs.1888.
Brenes, Jose A., Javier Ferrández-Pastor, José M. Cámara-Zapata, and Gabriela Marín-Raventós. "Use of Hough Transform and Homography for the Creation of Image Corpora for Smart Agriculture." International Journal on Cybernetics & Informatics 12, no. 6 (2023): 09–19. http://dx.doi.org/10.5121/ijci.2023.120602.
Panboonyuen, Teerapong, Kulsawasd Jitkajornwanich, Siam Lawawirojwong, Panu Srestasathiern, and Peerapon Vateekul. "Semantic Segmentation on Remotely Sensed Images Using an Enhanced Global Convolutional Network with Channel Attention and Domain Specific Transfer Learning." Remote Sensing 11, no. 1 (2019): 83. http://dx.doi.org/10.3390/rs11010083.
Liu, Rui, and Barzan Mozafari. "Transformer with Memory Replay." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 7 (2022): 7567–75. http://dx.doi.org/10.1609/aaai.v36i7.20722.