To view the other types of publications on this topic, follow this link: Multimodal.

Dissertations on the topic "Multimodal"


Consult the top 50 dissertations for research on the topic "Multimodal".

Next to every entry in the bibliography you will find the option "Add to bibliography". Use it, and the bibliographic reference for the selected work will be formatted automatically in the required citation style (APA, MLA, Harvard, Chicago, Vancouver, etc.).

You can also download the full text of the scientific publication in PDF format and read its online annotation, provided the relevant parameters are available in the metadata.

Browse dissertations from a wide variety of disciplines and compile a correctly formatted bibliography.

1

Yovera, Solano Luis Ángel, and Cárdenas Julio César Luna. „Multimodal interaction“. Bachelor's thesis, Universidad Peruana de Ciencias Aplicadas (UPC), 2017. http://hdl.handle.net/10757/621880.

Annotation:
This research aims to identify the advances, research and proposals related to this technology, covering both developing trends and the bolder but more innovative proposed solutions. Likewise, in order to understand the mechanisms that enable this interaction, it is necessary to know the best practices and standards such as those stipulated by the W3C (World Wide Web Consortium) and the ACM (Association for Computing Machinery). Once the advances and proposals have been identified, the mechanisms involved (NLP (Natural Language Processing), Facial Recognition, Touch Recognition and Speech Recognition) and their respective requirements are examined, enabling a more natural interaction between the user and the system. Having identified the existing developments in this technology as well as the mechanisms and requirements that allow their use, an architecture proposal for the implementation of Multimodal Interaction systems is defined, together with a portfolio of three projects that forms part of the continuity plan for this work and of the validation of the model.
2

Hoffmann, Grasiele Fernandes. „Retextualização multimodal“. reponame:Repositório Institucional da UFSC, 2015. https://repositorio.ufsc.br/xmlui/handle/123456789/158434.

Annotation:
Master's dissertation - Universidade Federal de Santa Catarina, Centro de Comunicação e Expressão, Programa de Pós-Graduação em Estudos da Tradução, Florianópolis, 2015.
Abstract: Instructional designers (ID) act on courses mediated by Information and Communication Technologies (ICTs), performing actions such as the retextualization (adaptation and adequacy) of educational and instructional content for other textual genres and semiotic modalities. Within this context of relations between designer and translator, we became interested in verifying whether the designer's move of transforming the base text into another, new text happens through a process of multimodal translation/retextualization. In order to perform this investigation, we have based our study on the theoretical principles of Functionalist Translation (REISS, [1984]1996; VERMEER, [1978]1986; [1984]1996; and NORD, [1988]1991; [1997]2014; 2006), on Retextualization perspectives (TRAVAGLIA, 2003; MARCUSCHI, 2001; MATÊNCIO, 2002; 2003; DELL'ISOLA, 2007), and on the textual multimodality approach (HODGE and KRESS, 1988; KRESS and van LEEUWEN, 2001; 2006; JEWITT, 2009; KRESS, 2010). For this study, we analyzed a printed textbook (base text) and its e-book (target text) produced for the distance education course Problem Prevention in Drug Use - A Course for Counselors and Community Leadership (6th edition), promoted by the Brazilian office of policies on drugs (Secretaria Nacional de Políticas sobre Drogas - SENAD) with the Ministry of Justice and developed by Universidade Federal de Santa Catarina (UFSC) through the multi-project center for educational technology (Núcleo Multiprojetos de Tecnologia Educacional - NUTE). In this e-book, the most important concepts from the printed textbook are presented, as well as some information available in the VLE. In order to compare and analyze the translational and retextualization moves performed by the designer, the textual analysis model for translation proposed by Nord ([1988]1991) was utilized. Results demonstrate that: 1) retextualization performed by the designer includes, during the translation process, other modes and semiotic resources composing the multimodal text; 2) the intratextual factors listed by Nord focused basically on linguistic elements and did not cover, on an equal footing, all the multiple semiotic modalities that comprise the multimodal text - hence the need to add other semiotic modalities to Nord's proposed model; and 3) instructional design is equivalent to the work of a translator, as there is an intent to produce a multimodal text from an informative source. In this context, it is possible to note the need to: 1) broaden the concept of retextualization, extending the process to study and analyze the other semiotic modalities that compose multimodal texts; 2) add other factors of analysis to Nord's framework, broadening her model into a textual analysis applied to multimodal retextualization; and 3) recognize that the designer does indeed perform translation in transforming a text into another, new multimodal text. In this sense, the main objective of this study was achieved, proving within the Functionalist theory of Translation that the designer transforms the source text into another, new text through a process of multimodal translation/retextualization, and that designers thus become translators/retextualizers.
3

Contreras, Lizarraga Adrián Arturo. „Multimodal microwave filters“. Doctoral thesis, Universitat Politècnica de Catalunya, 2013. http://hdl.handle.net/10803/134931.

Annotation:
This thesis presents the conception, design and implementation of new topologies of multimodal microwave resonators and filters, using a combination of uniplanar technologies such as coplanar waveguide (CPW), coplanar strips (CPS) and slotlines. The term "multimodal" refers to uniplanar circuits in which the two fundamental modes of the CPW propagate (the even and the odd mode). By using both modes of the CPW, it is possible to achieve added functions, such as additional transmission zeros to increase the rejection, or to attenuate harmonic frequencies to improve the out-of-band rejection. Moreover, it is demonstrated that by using multimodal circuits, it is possible to reduce the length of a conventional filter by up to 80%. In addition to bandpass filters, new topologies of compact band-stop filters are developed. The proposed band-stop filters make use of slow-wave resonators to decrease the total area of the filters and achieve compact topologies. This work also addresses the development of synthesis techniques for each multimodal filter. The design equations were obtained from generalized multimodal circuits available in the literature, which have been adapted for each particular case and modeled as basic filter components, such as immittance inverters or lumped elements. By using the proposed synthesis equations, it is possible to design filters with a desired response and relative bandwidth. The use of the proposed synthesis enables fast analysis and design of multimodal filters using circuit simulators. As an added feature, several reconfigurable and tunable filter topologies were demonstrated, using active devices (PIN diodes and varactors) or RF-MEMS. These new topologies demonstrate the flexibility of multimodal circuits. For the RF-MEMS-based tunable filters, different capacitive and ohmic switches were designed, fabricated and measured. As an example of the additional degrees of freedom offered by the use of RF-MEMS together with multimodal CPW circuits, a reconfigurable filter using RF-MEMS switchable air-bridges as the reconfiguration device has been demonstrated in this work for the first time.
4

Guilbeault, Douglas Richard. „Multimodal rhetorical figures“. Thesis, University of British Columbia, 2015. http://hdl.handle.net/2429/53978.

Annotation:
Rhetoricians have, for millennia, catalogued a set of persuasive techniques called rhetorical figures, but so far, they have examined them almost exclusively in the verbal modalities – i.e. in the written and spoken word. This paper shows how, in embodied contexts, figures also draw from bodily modalities to enhance their argumentative effects. Focusing on political speeches, I show how hand gestures are systematically incorporated into antithesis, a figure wherein contrastive phrases are framed in parallel form: the stronger lead, the weaker follow. Cognitive approaches to gesture provide my analysis with the tools to use gesture as a window into the embodied foundations of figures and their persuasiveness. I show how various features of gesture, including hand dominance, distance, and shape, allow speakers to channel the uptake of figures in terms of viewpoint and metaphor. With evidence that gestures are produced and perceived implicitly, my study suggests that persuasive aims are implemented by the subconscious mechanisms of multimodal cognition. I further show how multimodality participates in other figures, even in multimodal environments outside of gesture and speech, with each environment giving rise to a novel set of rhetorical affordances. These findings provide the initial steps toward a broader theory of multimodal rhetoric that examines how figures and other forms of persuasion originate from the body and evolve through the cultural and technological engineering of multimodal experience.
Faculty of Arts, Department of English.
5

Bazo, Rodríquez Alfredo, and Rosado Vitaliano Delgado. „Eje multimodal Amazonas“. Bachelor's thesis, Universidad Peruana de Ciencias Aplicadas (UPC), 2013. http://hdl.handle.net/10757/273520.

6

Kim, Hana 1980. „Multimodal animation control“. Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/29661.

Annotation:
Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003.
Includes bibliographical references (leaf 44).
In this thesis, we present a multimodal animation control system. Our approach is based on a human-centric computing model proposed by Project Oxygen at MIT Laboratory for Computer Science. Our system allows the user to create and control animation in real time using the speech interface developed using SpeechBuilder. The user can also fall back to traditional input modes should the speech interface fail. We assume that the user has no prior knowledge and experience in animation and yet enable him to create interesting and meaningful animation naturally and fluently. We argue that our system can be used in a number of applications ranging from PowerPoint presentations to simulations to children's storytelling tools.
by Hana Kim.
M.Eng.
7

Caglayan, Ozan. „Multimodal Machine Translation“. Thesis, Le Mans, 2019. http://www.theses.fr/2019LEMA1016/document.

Annotation:
Machine translation aims at automatically translating documents from one language to another without human intervention. With the advent of deep neural networks (DNN), neural approaches to machine translation started to dominate the field, reaching state-of-the-art performance in many languages. Neural machine translation (NMT) also revived the interest in interlingual machine translation due to how naturally it fits the task into an encoder-decoder framework which produces a translation by decoding a latent source representation. Combined with the architectural flexibility of DNNs, this framework paved the way for further research in multimodality with the objective of augmenting the latent representations with other modalities such as vision or speech, for example. This thesis focuses on a multimodal machine translation (MMT) framework that integrates a secondary visual modality to achieve better and visually grounded language understanding. I specifically worked with a dataset containing images and their translated descriptions, where visual context can be useful for word sense disambiguation, missing word imputation, or gender marking when translating from a language with gender-neutral nouns into one with a grammatical gender system, as is the case with English to French. I propose two main approaches to integrate the visual modality: (i) a multimodal attention mechanism that learns to take into account both sentence and convolutional visual representations, (ii) a method that uses global visual feature vectors to prime the sentence encoders and the decoders. Through automatic and human evaluation conducted on multiple language pairs, the proposed approaches were demonstrated to be beneficial. Finally, I further show that by systematically removing certain linguistic information from the input sentences, the true strength of both methods emerges as they successfully impute missing nouns and colors, and can even translate when parts of the source sentences are completely removed.
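For readers unfamiliar with the first of the two approaches, a multimodal attention mechanism can be sketched in a few lines: the decoder attends separately to the source-token encoder states and to the locations of a convolutional feature map, and the two context vectors are then fused. The numpy sketch below only illustrates that general idea under assumed toy dimensions and concatenation-based fusion; it is not the exact architecture used in the thesis.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multimodal_attention(dec_state, txt_states, img_feats, W_t, W_v):
    """Illustrative multimodal attention: one context vector per modality,
    obtained by dot-product attention, then concatenated."""
    # Attention over source-token encoder states (T x d_t)
    txt_scores = softmax(txt_states @ (W_t @ dec_state))
    txt_ctx = txt_scores @ txt_states
    # Attention over convolutional feature-map locations (L x d_v)
    img_scores = softmax(img_feats @ (W_v @ dec_state))
    img_ctx = img_scores @ img_feats
    # Fused context fed to the decoder at this time step
    return np.concatenate([txt_ctx, img_ctx])

rng = np.random.default_rng(0)
d, d_t, d_v, T, L = 8, 8, 16, 5, 49          # toy dimensions
ctx = multimodal_attention(rng.normal(size=d),
                           rng.normal(size=(T, d_t)),
                           rng.normal(size=(L, d_v)),
                           rng.normal(size=(d_t, d)),
                           rng.normal(size=(d_v, d)))
print(ctx.shape)  # (d_t + d_v,) = (24,)
```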
8

Hewa, Thondilege Akila Sachinthani Pemasiri. „Multimodal Image Correspondence“. Thesis, Queensland University of Technology, 2022. https://eprints.qut.edu.au/235433/1/Akila%2BHewa%2BThondilege%2BThesis%281%29.pdf.

Annotation:
Multimodal images are used across many application areas, including medicine and surveillance. Due to the different characteristics of different imaging modalities, developing image processing algorithms for multimodal images is challenging. This thesis proposes effective solutions for the challenging problem of multimodal semantic correspondence, where connections between similar components are established across images from different modalities. The proposed methods, which are based on deep learning techniques, have been applied to several applications including epilepsy type classification and 3D reconstruction of the human hand from visible and X-ray images. The proposed algorithms can be adapted to many other imaging modalities.
9

Bruni, Elia. „Multimodal Distributional Semantics“. Doctoral thesis, University of Trento, 2013. http://eprints-phd.biblio.unitn.it/1075/1/EliaBruniThesis.pdf.

Annotation:
Although it is a very simple statement, the distributional hypothesis - namely, that words which occur in similar contexts are semantically similar - has been granted the role of main assumption in many computational linguistic techniques. This is mostly due to the fact that it makes it possible to easily and automatically construct a representation of word meaning from a large textual input. Among the corpus-based computational linguistic techniques that adopt the distributional hypothesis, distributional semantic models (DSMs) have been shown to be a very effective method in many semantics-related tasks. DSMs approximate word meaning by vectors that keep track of the patterns of co-occurrence of words in the processed corpora. In addition, DSMs have been shown to be a very plausible computational model of human concept cognition, since they are able to simulate several psychological phenomena. Despite their success, one of their strongest limitations is that they represent word meaning entirely in terms of connections with other words. Cognitive scientists have argued that, in this way, DSMs neglect the fact that humans also rely on non-verbal experiences and have access to rich sources of perceptual knowledge when they learn the meaning of words. In this work, the lack of perceptual grounding of distributional models is addressed by exploiting computer vision techniques that automatically identify discrete "visual words" in images, so that the distributional representation of a word can be extended to also encompass its co-occurrence with the visual words of the images it is associated with. A flexible architecture to integrate text- and image-based distributional information is introduced and tested in a set of empirical evaluations, showing that an integrated model is superior to a purely text-based approach and provides somewhat complementary semantic information with respect to the latter.
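The text-and-vision integration described above can be illustrated with a minimal sketch: a word vector is formed by concatenating a normalized textual co-occurrence vector with a normalized bag-of-visual-words vector, and similarity is computed on the fused vector. The vocabulary, counts and the mixing weight alpha below are invented for illustration and do not reproduce the thesis's actual pipeline.

```python
import numpy as np

def l2norm(v):
    n = np.linalg.norm(v)
    return v / n if n else v

def multimodal_vector(text_vec, visual_vec, alpha=0.5):
    """Concatenate normalized textual and visual distributional vectors,
    weighting the two channels with a mixing parameter alpha."""
    return np.concatenate([alpha * l2norm(text_vec),
                           (1 - alpha) * l2norm(visual_vec)])

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy co-occurrence counts with context words / "visual words"
text = {"moon": np.array([3., 0., 5., 1.]), "sun": np.array([2., 1., 4., 0.])}
vision = {"moon": np.array([10., 2., 0.]), "sun": np.array([8., 1., 1.])}

m = multimodal_vector(text["moon"], vision["moon"])
s = multimodal_vector(text["sun"], vision["sun"])
print(round(cosine(m, s), 3))
```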
10

Campagnaro, Filippo. „Multimodal underwater networks“. Doctoral thesis, Università degli studi di Padova, 2019. http://hdl.handle.net/11577/3422716.

Annotation:
The recent level of maturity reached by broadband underwater non-acoustic communication technologies paves the way to the development of new applications, such as wireless remote control of underwater vehicles and the possibility of retrieving massive quantities of data from underwater sensor networks. Indeed, an optical link can support the transmission of high-traffic data (e.g., video streams) in real time, but its reach can hardly exceed 100 meters. Radio frequency electromagnetic communications can also provide a high transmission rate; however, in salty sea water their maximum range is less than 7 meters. Therefore, when considering either optical or radio frequency communications, a low-rate long-range acoustic link still has to be employed. Although this backup link cannot be used to transmit high data traffic, it can still maintain the minimal quality of service needed to monitor the status of the underwater network. This thesis presents how optical and acoustic communications can be combined in so-called multimodal networks, testing such solutions with DESERT Underwater, a simulation and experimentation framework for underwater networks. Optimal routing and data-link layers for multimodal networks are analyzed, as well as a switching algorithm that decides which technology to employ for transmitting the data packets, depending on the type of data and the channel quality. These protocols are evaluated via both simulation and field experiments.
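The switching algorithm mentioned above can be sketched as a simple per-packet decision rule: high-rate traffic uses the optical link when the receiver is within optical range and the channel quality is sufficient, and everything else falls back to the long-range acoustic link. The thresholds and field names below are illustrative assumptions, not the policy implemented in DESERT Underwater.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    kind: str        # "video", "bulk" or "control"
    size_bytes: int

def pick_link(pkt, optical_range_m, distance_m, optical_snr_db,
              min_snr_db=10.0):
    """Return which physical link a packet should use.

    High-rate traffic goes on the optical link when the receiver is in
    range and the optical channel quality is good enough; everything
    else falls back to the long-range, low-rate acoustic link."""
    optical_ok = distance_m <= optical_range_m and optical_snr_db >= min_snr_db
    if pkt.kind in ("video", "bulk") and optical_ok:
        return "optical"
    return "acoustic"

print(pick_link(Packet("video", 1400), 100, 60, 15))   # optical
print(pick_link(Packet("video", 1400), 100, 250, 15))  # acoustic (out of range)
print(pick_link(Packet("control", 64), 100, 60, 15))   # acoustic (monitoring)
```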
11

Servajean, Philippe. „Approche "système unique" de la (méta)cognition“. Thesis, Montpellier 3, 2018. http://www.theses.fr/2018MON30062.

Annotation:
There is today a broad consensus that the cognitive system is capable of having activities on itself; we are talking about metacognition. Although several studies have focused on the mechanisms underlying this metacognition, to our knowledge, none has done so in a "sensorimotor and integrative" perspective of cognitive functioning such as the one we propose. Thus, the thesis we defend in this work is the following: metacognitive information, especially fluency, has strictly the same status as any cognitive information (i.e., sensory and motor). In a first chapter, we propose a model of cognition respecting this principle. Then, in the next two chapters, we test our hypothesis through experiments and simulations using the mathematical model we have developed. This work focused more specifically on phenomena related to three original possibilities predicted by our hypothesis: the possibility of meta-metacognition, the possibility of integration between sensory information and metacognitive information, and the possibility of metacognitive abstraction.
12

Orta, de la Garza María Rebeca. „Nodo multimodal de transferencias“. Thesis, Universidad de las Américas Puebla, 2003. http://catarina.udlap.mx/u_dl_a/tales/documentos/lar/orta_d_mr/.

13

Aas, Asbjørn. „Brukerforsøk med multimodal demonstrator“. Thesis, Norwegian University of Science and Technology, Department of Electronics and Telecommunications, 2006. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-10283.

14

Angelica, Lim. „MEI: Multimodal Emotional Intelligence“. 京都大学 (Kyoto University), 2014. http://hdl.handle.net/2433/188869.

15

Marcollo, Hayden 1972. „Multimodal vortex-induced vibration“. Monash University, Dept. of Mechanical Engineering, 2002. http://arrow.monash.edu.au/hdl/1959.1/7674.

16

Qvarfordt, Pernilla. „Eyes on multimodal interaction /“. Linköping : Univ, 2004. http://www.bibl.liu.se/liupubl/disp/disp2004/tek893s.pdf.

17

Kernchen, Jochen Ralf. „Mobile multimodal user interfaces“. Thesis, University of Surrey, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.531385.

18

Nyamapfene, Abel. „Unsupervised multimodal neural networks“. Thesis, University of Surrey, 2006. http://epubs.surrey.ac.uk/844064/.

Annotation:
We extend the in-situ Hebbian-linked SOMs network by Miikkulainen to come up with two unsupervised neural networks that learn the mapping between the individual modes of a multimodal dataset. The first network, the single-pass Hebbian-linked SOMs network, extends the in-situ Hebbian-linked SOMs network by enabling the Hebbian link weights to be computed through one-shot learning. The second network, a modified counterpropagation network, extends the unsupervised learning of crossmodal mappings by making it possible for only one self-organising map to implement the crossmodal mapping. The two proposed networks each have a smaller computation time and achieve lower crossmodal mean squared errors than the in-situ Hebbian-linked SOMs network when assessed on two bimodal datasets, an audio-acoustic speech utterance dataset and a phonological-semantics child utterance dataset. Of the three network architectures, the modified counterpropagation network achieves the highest percentage of correct classifications, comparable to that of the LVQ-2 algorithm by Kohonen and the neural network for category learning by de Sa and Ballard, in classification tasks using the audio-acoustic speech utterance dataset. To facilitate multimodal processing of temporal data, we propose a Temporal Hypermap neural network architecture that learns and recalls multiple temporal patterns in an unsupervised manner. The Temporal Hypermap introduces flexibility in the recall of temporal patterns: a stored temporal pattern can be retrieved by prompting the network with the temporal pattern's identity vector, whilst the incorporation of short-term memory allows the recall of a temporal pattern starting from the pattern item specified by contextual information up to the last item in the pattern sequence. Finally, we extend the connectionist modelling of child language acquisition in two important respects. First, we introduce the concept of multimodal representation of speech utterances at the one-word and two-word stage. This allows us to model child language at the one-word utterance stage with a single modified counterpropagation network, which is an improvement on previous models in which multiple networks are required to simulate the different aspects of speech at the one-word utterance stage. Secondly, we present, for the first time, a connectionist model of the transition of child language from the one-word utterance stage to the two-word utterance stage. We achieve this using a gated multi-net comprising a modified counterpropagation network and a Temporal Hypermap.
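The single-pass Hebbian link between two SOMs can be sketched on toy data as follows: for each pair of co-occurring inputs, the link weight between the best-matching units of the two maps is strengthened, and crossmodal recall follows the strongest link. Map sizes, random codebooks and the recall rule are assumptions made for illustration, not the author's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(1)

def bmu(som, x):
    """Index of the best-matching unit (closest weight vector) for input x."""
    return int(np.argmin(((som - x) ** 2).sum(axis=1)))

# Two pre-trained SOMs, one per modality (here: random toy codebooks)
som_a = rng.normal(size=(16, 4))   # e.g. acoustic feature map
som_b = rng.normal(size=(16, 3))   # e.g. phonological/semantic map

# Single-pass Hebbian association: co-activate the BMUs of paired samples
hebb = np.zeros((16, 16))
for xa, xb in [(rng.normal(size=4), rng.normal(size=3)) for _ in range(50)]:
    hebb[bmu(som_a, xa), bmu(som_b, xb)] += 1.0

def crossmodal_recall(xa):
    """Map a modality-A input to the modality-B prototype with the
    strongest Hebbian link from A's best-matching unit."""
    j = int(np.argmax(hebb[bmu(som_a, xa)]))
    return som_b[j]

print(crossmodal_recall(rng.normal(size=4)))
```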
19

Danielsson, Oscar. „Multimodal Brain Age Estimation“. Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-281834.

Annotation:
Machine learning models trained on MRI brain scans of healthy subjects can be used to predict age. Accurate estimation of brain age is important for reliably detecting abnormal aging in the brain. One way to increase the accuracy of predicted brain age is to use multimodal data. Previous research using multimodal data has largely been non-deep-learning-based; in this thesis, we examine a deep learning model that can effectively utilize several modalities. Three baseline models were trained. Two used T1-weighted and T2-weighted data, respectively. The third model was trained on both T1- and T2-weighted data using high-level fusion. We found that using multimodal data reduced the mean absolute error of the predicted ages. A fourth model utilized disentanglement to create a representation robust to missing T1- or T2-weighted data. Our results showed that this model performed similarly to the baselines, meaning that it is robust to missing data at no significant cost in prediction accuracy.
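High-level (late) fusion of the two MRI modalities can be illustrated with a simple stand-in experiment: one regressor per modality and a fused model trained on the concatenated features, compared by mean absolute error. The synthetic arrays below merely take the place of features derived from T1- and T2-weighted scans; the actual thesis uses deep models rather than ridge regression.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(42)
n = 200
age = rng.uniform(20, 80, n)
# Synthetic stand-ins for features extracted from T1- and T2-weighted scans
t1 = np.column_stack([age + rng.normal(0, 8, n) for _ in range(5)])
t2 = np.column_stack([age + rng.normal(0, 10, n) for _ in range(5)])

train, test = np.arange(0, 150), np.arange(150, n)

def mae(X):
    """Mean absolute error of a ridge regressor trained on feature matrix X."""
    model = Ridge().fit(X[train], age[train])
    return mean_absolute_error(age[test], model.predict(X[test]))

print("T1 only :", round(mae(t1), 2))
print("T2 only :", round(mae(t2), 2))
print("Fusion  :", round(mae(np.hstack([t1, t2])), 2))  # multimodal, typically lowest
```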
20

Sioson, Allan A. „Multimodal Networks in Biology“. Diss., Virginia Tech, 2005. http://hdl.handle.net/10919/29995.

Annotation:
A multimodal network (MMN) is a novel mathematical construct that captures the structure of biological networks, computational network models, and relationships from biological databases. An MMN subsumes the structure of graphs and hypergraphs, either undirected or directed. Formally, an MMN is a triple (V,E,M) where V is a set of vertices, E is a set of modal hyperedges, and M is a set of modes. A modal hyperedge e=(T,H,A,m) in E is an ordered 4-tuple, in which T,H,A are subsets of V and m is an element of M. The sets T, H, and A are the tail, head, and associate of e, while m is its mode. In the context of biology, each vertex is a biological entity, each hyperedge is a relationship, and each mode is a type of relationship (e.g., 'forms complex' and 'is a'). Within the space of multimodal networks, structural operations such as union, intersection, hyperedge contraction, subnetwork selection, and graph or hypergraph projections can be performed. A denotational semantics approach is used to specify the semantics of each hyperedge in MMN in terms of interaction among its vertices. This is done by mapping each hyperedge e to a hyperedge code algo:V(e), an algorithm that details how the vertices in V(e) get used and updated. A semantic MMN-based model is a function of a given schedule of evaluation of hyperedge codes and the current state of the model, a set of vertex-value pairs. An MMN-based computational system is implemented as a proof of concept to determine empirically the benefits of having it. This system consists of an MMN database populated by data from various biological databases, MMN operators implemented as database functions, graph operations implemented in C++ using LEDA, and mmnsh, a shell scripting language that provides a consistent interface to both data and operators. It is demonstrated that computational network models may enrich the MMN database and MMN data may be used as input to other computational tools and environments. A simulator is developed to compute from an initial state and a schedule of hyperedge codes the resulting state of a semantic MMN model.
Ph. D.
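The formal definition quoted above maps directly onto a small data structure. The sketch below encodes an MMN as the triple (V, E, M) with modal hyperedges e = (T, H, A, m) and shows one of the structural operations mentioned (selecting the subnetwork of a given mode); it is only an illustration of the definition, not the mmnsh implementation.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Hyperedge:
    """Modal hyperedge e = (T, H, A, m): tail, head and associate vertex sets, plus a mode."""
    tail: frozenset
    head: frozenset
    associate: frozenset
    mode: str

@dataclass
class MMN:
    """Multimodal network (V, E, M)."""
    vertices: set = field(default_factory=set)
    edges: set = field(default_factory=set)
    modes: set = field(default_factory=set)

    def add_edge(self, tail, head, associate, mode):
        e = Hyperedge(frozenset(tail), frozenset(head), frozenset(associate), mode)
        self.vertices |= e.tail | e.head | e.associate
        self.modes.add(mode)
        self.edges.add(e)
        return e

    def subnetwork(self, mode):
        """Structural operation: keep only the hyperedges of the given mode."""
        sub = MMN()
        for e in self.edges:
            if e.mode == mode:
                sub.add_edge(e.tail, e.head, e.associate, e.mode)
        return sub

net = MMN()
net.add_edge({"subunit_a", "subunit_b"}, {"complex_ab"}, set(), "forms complex")
net.add_edge({"complex_ab"}, {"protein complex"}, set(), "is a")
print(len(net.subnetwork("is a").edges))  # 1
```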
21

Fernández, Carbonell Marcos. „Automated Multimodal Emotion Recognition“. Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-282534.

Annotation:
Being able to read and interpret affective states plays a significant role in human society. However, this is difficult in some situations, especially when information is limited to either vocal or visual cues. Many researchers have investigated the so-called basic emotions in a supervised way. This thesis holds the results of a multimodal supervised and unsupervised study of a more realistic number of emotions. To that end, audio and video features are extracted from the GEMEP dataset employing openSMILE and OpenFace, respectively. The supervised approach includes the comparison of multiple solutions and proves that multimodal pipelines can outperform unimodal ones, even with a higher number of affective states. The unsupervised approach embraces a traditional and an exploratory method to find meaningful patterns in the multimodal dataset. It also contains an innovative procedure to better understand the output of clustering techniques.
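Assuming the openSMILE audio descriptors and OpenFace video descriptors have already been exported to CSV files (the file names and the label column below are hypothetical), the supervised and unsupervised parts of such a pipeline could be sketched as follows: feature-level fusion feeding an SVM, plus a clustering pass over the same multimodal space.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical exports: one row per clip, same row order in both files
audio = pd.read_csv("gemep_opensmile.csv")   # openSMILE functionals
video = pd.read_csv("gemep_openface.csv")    # OpenFace action units / gaze
labels = audio.pop("emotion")                # assumed label column
video = video.drop(columns=["emotion"], errors="ignore")

# Feature-level fusion: concatenate audio and video descriptors
fused = pd.concat([audio, video], axis=1)

# Supervised: multimodal classifier evaluated by cross-validation
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
print("multimodal accuracy:", cross_val_score(clf, fused, labels, cv=5).mean())

# Unsupervised: clustering of the same multimodal feature space
clusters = KMeans(n_clusters=6, n_init=10).fit_predict(
    StandardScaler().fit_transform(fused))
print(pd.crosstab(labels, clusters))
```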
22

Alabau, Gonzalvo Vicente. „Multimodal interactive structured prediction“. Doctoral thesis, Universitat Politècnica de València, 2014. http://hdl.handle.net/10251/35135.

Annotation:
This thesis presents scientific contributions to the field of multimodal interactive structured prediction (MISP). The aim of MISP is to reduce the human effort required to supervise an automatic output, in an efficient and ergonomic way. Hence, this thesis focuses on the two aspects of MISP systems. The first aspect, which refers to the interactive part of MISP, is the study of strategies for efficient human-computer collaboration to produce error-free outputs. Multimodality, the second aspect, deals with other, more ergonomic modalities of communication with the computer than keyboard and mouse. To begin with, in sequential interaction the user is assumed to supervise the output from left to right, so that errors are corrected in sequential order. We study the problem under the decision theory framework and define an optimum decoding algorithm. The optimum algorithm is compared to the usually applied, standard approach. Experimental results on several tasks suggest that the optimum algorithm is slightly better than the standard algorithm. In contrast to sequential interaction, in active interaction it is the system that decides what should be given to the user for supervision. On the one hand, user supervision can be reduced if the user is required to supervise only the outputs that the system expects to be erroneous. In this respect, we define a strategy that retrieves the outputs with the highest expected error first. Moreover, we prove that this strategy is optimum under certain conditions, which is validated by experimental results. On the other hand, if the goal is to reduce the number of corrections, active interaction works by selecting elements, one by one, e.g., words of a given output to be supervised by the user. For this case, several strategies are compared. Unlike the previous case, the strategy that performs better is to choose the element with the highest confidence, which coincides with the findings of the optimum algorithm for sequential interaction. However, this also suggests that minimizing effort and supervision are contradictory goals. With respect to the multimodality aspect, this thesis delves into techniques to make multimodal systems more robust. To achieve that, multimodal systems are improved by providing contextual information of the application at hand. First, we study how to integrate e-pen interaction in a machine translation task. We contribute to the state of the art by leveraging the information from the source sentence. Several strategies are compared, basically grouped into two approaches: one inspired by word-based translation models and one using n-grams generated from a phrase-based system. The experiments show that the former outperforms the latter for this task. Furthermore, the results present remarkable improvements with respect to not using contextual information. Second, similar experiments are conducted on a speech-enabled interface for interactive machine translation. The improvements over the baseline are also noticeable. However, in this case, phrase-based models perform much better than word-based models. We attribute that to the fact that acoustic models are poorer estimations than morphologic models and, thus, they benefit more from the language model. Finally, similar techniques are proposed for dictation of handwritten documents. The results show that speech and handwriting recognition can be combined in an effective way.
Finally, an evaluation with real users is carried out to compare an interactive machine translation prototype with a post-editing prototype. The results of the study reveal that users are very sensitive to the usability aspects of the user interface. Therefore, usability is a crucial aspect to consider in a human evaluation, as it can hinder the real benefits of the technology being evaluated. Once usability problems are fixed, the evaluation indicates that users are more favorable to working with the interactive machine translation system than with the post-editing system.
Alabau Gonzalvo, V. (2014). Multimodal interactive structured prediction [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/35135
23

Theissing, Simon. „Supervision en transport multimodal“. Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLN076/document.

Annotation:
Without any doubt, modern multimodal transportation systems are vital to the ecological sustainability and the economic prosperity of urban agglomerations, and in doing so to the quality of life of their many inhabitants. Moreover it is known that a well-functioning interoperability of the different modes and lines in such networked systems is key to their acceptance given the fact that (i) many if not most trips between different origin/destination pairs require transfers, and (ii) costly infrastructure investments targeting the creation of more direct links through the construction of new or the extension of existing lines are not open to debate. Thus, a better understanding of how the different modes and lines in these systems interact through passenger transfers is of utmost importance. However, acquiring this understanding is particularly tricky in degraded situations where some or all transportation services cannot be provided as planned due to e.g. some passenger incident, and/or where the demand for these scheduled services deviates from any statistical long term-plannings. Here, the development for and integration of sophisticated mathematical models into the operation of such systems may provide remedy, where model-predictive supervision seems to be one very promising area of application which we consider here. Model-predictive supervision can take several forms. In this work, we focus on the model-based impact analysis of different actions, such as the delayed departure of some vehicle from a stop, applied to the operation of the considered transportation system upon some downgrading situation occurs which lacks statistical data. For this purpose, we introduce a new stochastic hybrid automaton model, and show how this mathematically profound model can be used to forecast the passenger numbers in and the vehicle operational state of this transportation system starting from estimations of all passenger numbers and an exact knowledge of the vehicle operational state at the time of the incident occurrence. Our new automaton model brings under the same roof, all passengers who demand fixed-route transportation services, and all vehicles which provide them. It explicitly accounts for all capacity-limits and the fact that passengers do not necessarily follow efficient paths which must be mapped to some simple to understand cost function. Instead, every passenger has a trip profile which defines a fixed route in the infrastructure of the transportation system, and a preference for the different transportation services along this route. Moreover, our model does not abstract away from all vehicle movements but explicitly includes them in its dynamics, which latter property is crucial to the impact analysis of any vehicle movement-related action. In addition our model accounts for uncertainty; resulting from unknown initial passenger numbers and unknown passenger arrival flows. Compared to classical modelling approaches for hybrid automata, our Petri net-styled approach does not require the end user to specify our model's many differential equations systems by hand. Instead, all these systems can be derived from the model's predominantly graphical specification in a fully automated manner for the discrete time computation of any forecast. This latter property of our model in turn reduces the risk of man-made specification and thus forecasting errors. 
Besides introducing our new model, we also develop in this report some algorithmic building blocks which target two major bottlenecks that are likely to occur during its forecast-producing simulation, namely the numerical integration of the many high-dimensional systems of stochastic differential equations and the combinatorial explosion of its discrete state. Moreover, we prove the computational feasibility and show the prospective benefits of our approach in the form of some simplistic test cases and some more realistic use cases.
24

Perez, Lloret Marta. „Photoactivable Multimodal Antimicrobial Nanoconstructs“. Doctoral thesis, Università di Catania, 2017. http://hdl.handle.net/10761/3999.

Annotation:
The search for novel antibacterial treatment modalities designed to face the problems of antibiotic Multi Drug Resistance (MDR), associated with the alarmingly low turnover of new clinically approved antibiotic drugs, is one of the main challenges in biomedicine. In this frame, the achievement of tailored systems able to release therapeutic agents in a controlled fashion is one of the growing areas in the burgeoning field of nanomedicine. Light represents the most elegant and non-invasive trigger to deliver bio-active compounds on demand at the target site, with superb control of three main factors determining the therapeutic outcome: site, timing and dosage. In addition, light triggering is biofriendly, provides fast reaction rates and offers the great benefit of not affecting physiological parameters such as temperature, pH and ionic strength, a fundamental requisite for biomedical applications. Recent breakthroughs in nanotechnology offer the opportunity to characterize, manipulate and organize matter at the nanometer scale, controlling the size and shape of the resulting nanomaterials and greatly improving their biocompatibility and cellular uptake efficiency. This thesis focuses on the design and fabrication of light-activated nanoconstructs for the controlled delivery of unconventional therapeutics such as reactive oxygen and nitrogen species, and heat, which, in contrast to conventional drugs, do not suffer from MDR problems and display reduced systemic effects. A range of nanosystems able to generate the above cytotoxic agents individually, sequentially or simultaneously is reported and, in some cases, their antibacterial activity is also investigated. This dissertation is divided into two sections: the first one regards nanomaterials, while the second focuses on molecular hybrid systems, both preceded by a brief introduction.
25

Adebayo, Kolawole John <1986>. „Multimodal Legal Information Retrieval“. Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amsdottorato.unibo.it/8634/1/ADEBAYO-JOHN-tesi.pdf.

Annotation:
The goal of this thesis is to present a multifaceted way of inducing semantic representations from legal documents as well as accessing information in a precise and timely manner. The thesis explores approaches for semantic information retrieval (IR) in the legal context with a technique that maps specific parts of a text to the relevant concept. This technique relies on text segments, using Latent Dirichlet Allocation (LDA), a topic modeling algorithm, for performing text segmentation, expanding the concept using some Natural Language Processing techniques, and then associating the text segments to the concepts using a semi-supervised text similarity technique. This addresses two problems, namely user specificity in formulating queries and information overload, since querying a large document collection with a set of concepts is more fine-grained: specific information, rather than full documents, is retrieved. The second part of the thesis describes our Neural Network Relevance Model for E-Discovery Information Retrieval. Our algorithm is essentially a feature-rich ensemble system with different component neural networks extracting different relevance signals. This model has been trained and evaluated on the TREC Legal track 2010 data. The performance of our models across the board proves that they capture the semantics and relatedness between query and document, which is important for the Legal Information Retrieval domain.
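A rough sketch of the concept-mapping component described above: a topic model is fitted over text segments, and each segment is then associated with its closest concept in a shared vector space. The toy corpus, the concept labels and the use of TF-IDF cosine similarity as the matching step are simplifications assumed for illustration; they are not the semi-supervised technique of the thesis.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

segments = [
    "the lessee shall pay rent monthly and maintain the premises",
    "either party may terminate this agreement with thirty days notice",
    "the contractor shall deliver all documents in electronic format",
]
concepts = ["rent payment", "terminate with notice", "deliver documents"]

# Topic model over segments (stand-in for the LDA-based text segmentation step)
counts = CountVectorizer(stop_words="english")
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(counts.fit_transform(segments))

# Associate each segment with the closest legal concept (lexical TF-IDF match)
tfidf = TfidfVectorizer(stop_words="english").fit(segments + concepts)
sims = cosine_similarity(tfidf.transform(segments), tfidf.transform(concepts))
for seg, row in zip(segments, sims):
    print(concepts[row.argmax()], "<-", seg)
```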
26

Medjahed, Hamid. „Distress situation identification by multimodal data fusion for home healthcare telemonitoring“. Thesis, Evry, Institut national des télécommunications, 2010. http://www.theses.fr/2010TELE0002/document.

Annotation:
Population age is increasing in societies throughout the world. In Europe, for example, life expectancy is about 71 years for men and about 79 years for women; in North America it is currently about 75 for men and 81 for women. Moreover, the elderly prefer to preserve their independence, autonomy and way of life by living at home for as long as possible. The current healthcare infrastructures in these countries are widely considered inadequate to meet the needs of an increasingly older population. Home healthcare monitoring is a way to address this problem and to ensure that elderly people can live safely and independently in their own homes for as long as possible. Automatic in-home healthcare monitoring is a technological approach that helps people age in place through continuous telemonitoring. In this thesis, we explore automatic in-home healthcare monitoring by studying the professionals who currently perform it and by combining and synchronizing various telemonitoring modalities within a data synchronization and multimodal data fusion platform, FL-EMUTEM (Fuzzy Logic Multimodal Environment for Medical Remote Monitoring). The platform incorporates algorithms that process each modality and a fuzzy-logic-based multimodal data fusion technique that ensures pervasive in-home health monitoring for elderly people. The originality of this thesis, the combination of various modalities in the home describing its inhabitant and their surroundings, constitutes a real benefit for elderly people living alone. The work complements the stationary smart home environment by bringing to bear its capability for integrative continuous observation and detection of critical situations.
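The fuzzy-logic fusion idea can be caricatured in a few lines; the sketch below is purely illustrative, with invented membership functions and thresholds, and is not the FL-EMUTEM algorithm.

```python
# Toy illustration of fuzzy-logic multimodal fusion (not the FL-EMUTEM code):
# two invented sensor readings are fuzzified and combined by simple min/max rules
# into a "distress" score between 0 and 1.

def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def distress_score(heart_rate, movement):
    hr_high = tri(heart_rate, 90, 130, 170)      # membership of "abnormally high HR"
    hr_low = tri(heart_rate, 20, 40, 55)         # membership of "abnormally low HR"
    inactive = max(0.0, 1.0 - movement)          # movement normalised to [0, 1]

    # Rules: distress if (HR abnormal AND inactive); aggregate with max.
    rule1 = min(hr_high, inactive)
    rule2 = min(hr_low, inactive)
    return max(rule1, rule2)

print(distress_score(heart_rate=140, movement=0.05))  # high distress
print(distress_score(heart_rate=75, movement=0.6))    # low distress
```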
APA, Harvard, Vancouver, ISO und andere Zitierweisen
27

Sundström, Jessica. „Multimodalt skrivande - förutsättningar och lärandemöjligheter : Litteraturstudie om mellanstadieelevers lärandemöjligheter vid multimodalt skrivande inom svenskämnet och förutsättningar för en multimodal skrivundervisning“. Thesis, Högskolan Dalarna, Pedagogiskt arbete, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:du-18956.

Der volle Inhalt der Quelle
Annotation:
The Swedish-subject curriculum states that pupils should develop multimodal writing within the subject of Swedish. Multimodal writing means that words, images and sound are combined and interact. The main purpose of this literature study has been to investigate what multimodal writing in the Swedish subject for grades 4-6 can look like, which competences and resources are required to conduct multimodal writing instruction, and what kind of learning multimodal writing can give rise to in pupils. The literature study shows that multimodal writing can take place both in analogue and in digital form. It further shows that Swedish research in this area is very limited. The articles and dissertations included in the study show that researchers agree that teachers need to develop their knowledge of different sign systems, such as auditory and visual ones, in order to make pupils aware of the meaning potential of these sign systems and of their interplay. Multimodal writing gives rise to a form of coordinated learning, since multimodal writing is a complex process that requires pupils to receive explicit instruction about the relevant digital software and about the meaning-making of the sign systems. Multimodal writing is a common element outside school, but should be given access to the school world. This presupposes that digital resources are available and that teachers have a positive attitude towards multimodal writing development.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
28

Malmberg, Lovisa, und Sara Stensils. „Multimodala texter i skolan : En multimodal läromedelsanalys av läseböcker i svenskämnet för årskurs F-3“. Thesis, Uppsala universitet, Institutionen för pedagogik, didaktik och utbildningsstudier, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-434737.

Der volle Inhalt der Quelle
Annotation:
The purpose of this study is to contribute more knowledge about how modalities interact in printed readers for grade two in the Swedish subject, and how these books can serve as didactic resources in multimodal work. The study takes its theoretical starting point in social semiotics and in a design-theoretical multimodal perspective on learning. The research questions underlying the study are: Which modalities do the readers contain, and what does the interplay between these modalities look like? How can these readers be used as didactic resources in a multimodal way of working? To collect data, a qualitative text analysis of three selected readers was carried out. The analysis was performed with an analytical tool developed from Danielsson and Selander's (2014) model for subject-didactic work with multimodal texts. The tool contains five categories of analysis concerning the interplay between the different parts of the text and how the books serve as didactic resources for multimodal learning. Regarding the interplay between modalities, the study has focused on the interaction between the following categories: verbal text and heading, verbal text and image, and the visual proximity between modalities. The results show that the books contain many different modalities, for example pictures, photographs, headings, body text, captions, maps, symbols and arrows. These modalities build on a clear interplay with one another. The headings usually give the reader an indication of what the coming chapter will be about and are written in pupil-friendly language. The images interact with the body text by illustrating its content, adding information to it, or being decorative. The didactic resources the study points to are image analysis, interplay between modalities, transformation between modalities, and learning in interaction with others.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
29

Chen, Jianan. „Deep Learning Based Multimodal Retrieval“. Electronic Thesis or Diss., Rennes, INSA, 2023. http://www.theses.fr/2023ISAR0019.

Der volle Inhalt der Quelle
Annotation:
Les tâches multimodales jouent un rôle crucial dans la progression vers l'atteinte de l'intelligence artificielle (IA) générale. L'objectif principal de la recherche multimodale est d'exploiter des algorithmes d'apprentissage automatique pour extraire des informations sémantiques pertinentes, en comblant le fossé entre différentes modalités telles que les images visuelles, le texte linguistique et d'autres sources de données. Il convient de noter que l'entropie de l'information associée à des données hétérogènes pour des sémantiques de haut niveau identiques varie considérablement, ce qui pose un défi important pour les modèles multimodaux. Les modèles de réseau multimodal basés sur l'apprentissage profond offrent une solution efficace pour relever les difficultés découlant des différences substantielles d'entropie de l’information. Ces modèles présentent une précision et une stabilité impressionnantes dans les tâches d'appariement d'informations multimodales à grande échelle, comme la recherche d'images et de textes. De plus, ils démontrent de solides capacités d'apprentissage par transfert, permettant à un modèle bien entraîné sur une tâche multimodale d'être affiné et appliqué à une nouvelle tâche multimodale. Dans nos recherches, nous développons une nouvelle base de données multimodale et multi-vues générative spécifiquement conçue pour la tâche de segmentation référentielle multimodale. De plus, nous établissons une référence de pointe (SOTA) pour les modèles de segmentation d'expressions référentielles dans le domaine multimodal. Les résultats de nos expériences comparatives sont présentés de manière visuelle, offrant des informations claires et complètes
Multimodal tasks play a crucial role in the progression towards achieving general artificial intelligence (AI). The primary goal of multimodal retrieval is to employ machine learning algorithms to extract relevant semantic information, bridging the gap between different modalities such as visual images, linguistic text, and other data sources. It is worth noting that the information entropy associated with heterogeneous data for the same high-level semantics varies significantly, posing a significant challenge for multimodal models. Deep learning-based multimodal network models provide an effective solution to tackle the difficulties arising from substantial differences in information entropy. These models exhibit impressive accuracy and stability in large-scale cross-modal information matching tasks, such as image-text retrieval. Furthermore, they demonstrate strong transfer learning capabilities, enabling a well-trained model from one multimodal task to be fine-tuned and applied to a new multimodal task, even in scenarios involving few-shot or zero-shot learning. In our research, we develop a novel generative multimodal multi-view database specifically designed for the multimodal referential segmentation task. Additionally, we establish a state-of-the-art (SOTA) benchmark and multi-view metric for referring expression segmentation models in the multimodal domain. The results of our comparative experiments are presented visually, providing clear and comprehensive insights
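The image-text retrieval setting the abstract refers to reduces, at inference time, to ranking items by similarity in a shared embedding space; the following sketch uses random stand-in embeddings and is not the thesis model.

```python
# Schematic image-text retrieval (illustrative only, not the thesis model):
# given pre-computed image and caption embeddings in a shared space, rank the
# images for each caption by cosine similarity. Embeddings here are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
image_emb = rng.normal(size=(5, 128))    # 5 images, 128-d joint embedding
text_emb = rng.normal(size=(2, 128))     # 2 query captions

def l2norm(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

scores = l2norm(text_emb) @ l2norm(image_emb).T   # cosine similarity matrix
ranking = np.argsort(-scores, axis=1)             # best-matching images per caption
print(ranking)
```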
APA, Harvard, Vancouver, ISO und andere Zitierweisen
30

Gutiérrez, Aldrete Mariana. „El tratamiento del feminicidio en medios de comunicación en México“. Doctoral thesis, Universitat Autònoma de Barcelona, 2020. http://hdl.handle.net/10803/670554.

Der volle Inhalt der Quelle
Annotation:
Aquesta investigació analitza el tractament del tema Feminicidi a la premsa mexicana. L’objectiu és estudiar quins són els aspectes de l’conflicte que es destaquen en la informació de premsa. Mesurem l’atenció mediàtica i la comparem amb un lapse de temps a el principi de la investigació i a al final. Es van extreure els marcs periodístics multimodals, les falles de context i els discursos ideològics disseminats. A més analitzem la representació dels moviments socials contra el feminicidi com a actor i les oportunitats discursives assolides, en comparació amb la representació de les autoritats. Vam escollir tres diaris de circulació nacional tenint en compte les preferències de l’audiència per triar els més llegits en la seva versió impresa i per mitjans electrònics. Durant un període de 41 mesos es van demanar tots els articles que informen feminicidi com a tema principal o secundari, i els articles sobre assassinats de dones en els quals no s’ha comprovat si tenien motius de gènere o no. Vam obtenir 2,527 articles i es van codificar tots manualment. Es va utilitzar la metodologia d’anàlisi de contingut quantitatiu textual i anàlisi de les imatges per extreure els elements dels marcs periodístics multimodals, d’acord amb la teoria de Entman: la denominació de el problema, actors principals, l’avaluació moral, l’atribució de responsabilitat i el tractament recomanat. Cada element conté diverses variables que es van agrupar en conglomerats per ordre d’incidència. La representació dels moviments socials es va mesurar amb les característiques del ‘paradigma de la protesta’ i analitzem el grau en que s’adhereix a aquesta teoria. Utilitzem les mateixes variables per mesurar les oportunitats discursives de el moviment. Trobem que l’atenció mediàtica a l’conflicte ha augmentat considerablement en els 3 diaris disseminant la idea que la severitat de el problema també augmenta; però, la representació de les víctimes tendeix a ser negativa i es reprodueixen discursos discriminatoris.
Esta investigación analiza el tratamiento del tema Feminicidio en la prensa mexicana. El objetivo es estudiar cuales son los aspectos del conflicto que se destacan en la información de prensa. Medimos la atención mediática y la comparamos con un lapso de tiempo al principio de la investigación y al final. Se extrajeron los marcos periodísticos multimodales, las fallas de contexto y los discursos ideológicos diseminados. Además analizamos la representación de los movimientos sociales contra el feminicidio como actor y las oportunidades discursivas alcanzadas, en comparación con la representación de las autoridades. Escogimos tres periódicos de circulación nacional tomando en cuenta las preferencias de la audiencia para elegir los más leídos en su versión impresa y por medios electrónicos. Durante un periodo de 41 meses se recabaron todos los artículos que informan feminicidio como tema principal o secundario, y los artículos sobre asesinatos de mujeres en los que no se ha comprobado si tenían motivos de género o no. Obtuvimos 2,527 artículos y se codificaron todos manualmente. Se utilizó la metodología de análisis de contenido cuantitativo textual y análisis de las imágenes para extraer los elementos de los marcos periodísticos multimodales, de acuerdo con la teoría de Entman: la denominación del problema, actores principales, la evaluación moral, la atribución de responsabilidad y el tratamiento recomendado. Cada elemento contiene diversas variables que se agruparon en conglomerados por orden de incidencia. La representación de los movimientos sociales se midió con las características del ‘paradigma de la protesta’ y analizamos el grado en que se adhiere a esta teoría. Utilizamos las mismas variables para medir las oportunidades discursivas del movimiento. Encontramos que la atención mediática al conflicto ha aumentado considerablemente en los 3 diarios diseminando la idea de que la severidad del problema también aumenta; sin embargo, la representación de las víctimas tiende a ser negativa y se reproducen discursos discriminatorios.
This research analyzes the treatment of the femicide issue in the Mexican press. The objective is to study which aspects of the conflict are highlighted in press coverage. We measure media attention and compare it between a period at the beginning of the investigation and one at the end. Multimodal journalistic frames, context failures and disseminated ideological discourses were extracted. We also analyze the representation of social movements against femicide as an actor, and the discursive opportunities they achieved, compared with the representation of the authorities. We chose three newspapers with national circulation, taking audience preferences into account to select the most widely read in print and online. Over a period of 41 months we collected all articles reporting femicide as a main or secondary topic, as well as articles on murders of women in which a gender motive had not been established. We obtained 2,527 articles, all of which were coded manually. Quantitative textual content analysis and image analysis were used to extract the elements of the multimodal journalistic frames, following Entman's theory: the definition of the problem, the main actors, the moral evaluation, the attribution of responsibility and the recommended treatment. Each element contains several variables that were grouped into clusters by order of incidence. The representation of the social movements was measured against the characteristics of the 'protest paradigm', and we analyzed the degree to which it adheres to this theory. We used the same variables to measure the discursive opportunities of the movement. We found that media attention to the conflict has increased considerably in the three newspapers, spreading the idea that the severity of the problem is also increasing; however, the representation of the victims tends to be negative and discriminatory discourses are reproduced.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
31

Specker, Elizabeth. „L1/L2 Eye Movement Reading of Closed Captioning: A Multimodal Analysis of Multimodal Use“. Diss., The University of Arizona, 2008. http://hdl.handle.net/10150/194820.

Der volle Inhalt der Quelle
Annotation:
Learning in a multimodal environment entails the presentation of information in a combination of more than one mode (i.e. written words, illustrations, and sound). Past research regarding the benefits of multimodal presentation of information includes both school age children and adult learners (e.g. Koolstra, van der Voort & d'Ydewalle, 1999; Neumen & Koskinen, 1992), as well as both native and non-native language learners (e.g. d'Ydewalle & Gielen, 1992; Kothari et al, 2002). This dissertation focuses on how the combination of various modalities is used by learners of differing proficiencies in English to gain better comprehension (cf. Mayer, 1997, 2005; Graber, 1990; Slykhuis et al, 2005). The addition of the written mode (closed captioning) to the already multimodal environment that exists in film and video presentations is analyzed. A Multimodal Multimedia Communicative Event is used to situate the language learner. Research questions focus on the eye movements of the participants as they read moving text both with and without the audio and video modes of information. Small case studies also give a context to four participants by bringing their individual backgrounds and observations to bear on the use of multimodal texts as language learning tools in a second or foreign language learning environment. It was found that Non-Native English Speakers (NNS) (L1 Arabic) show longer eye movement patterns in reading dynamic text (closed captioning), echoing past research with static texts, while Native Speakers of English (NS) tend to have quicker eye movements. In a multimodal environment the two groups also differed: NNS looked longer at the closed captioning and NS were able to navigate the text presentation quickly. While associative activation (Paivio, 2007) between the audio and print modalities was not found to alter the eye movement patterns of the NNS, participants did alternate between the modalities in search of supplementary information. Other research using additional closed captioning and subtitling has shown that viewing a video program with written text added turns the activity into a reading activity (Jensema, 2000; d'Ydewalle, 1987). The current study found this to be the case, but the results differed in regard to proficiency and strategy.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
32

García, Guerra Carlos Enrique. „Multimodal eye's optical quality (MEOQ)“. Doctoral thesis, Universitat Politècnica de Catalunya, 2016. http://hdl.handle.net/10803/397198.

Der volle Inhalt der Quelle
Annotation:
Within the visual system, the optics of the eye is responsible for forming images of external objects on the ocular fundus for its photo-reception and neural interpretation. However, the eye is not perfect and its capabilities may be limited by aberrations and scattering. Therefore, the quantification of optical factors affecting the eye is important for diagnosis and monitoring purposes. In this context, this document summarizes the work done during the implementation of the Multimodal Eye's Optical Quality (MEOQ) system, a measurement device that integrates a double-pass (DP) instrument and a Hartmann-Shack (HS) sensor to provide not only information on aberrations, but also on scattering that occurs in the human eye. A binocular open-view design permits evaluation in natural viewing conditions. Furthermore, the system is able to compensate for both spherical and astigmatic refractive errors by using devices of configurable optical power. The MEOQ system has been used to quantify scattering in the human eye based on differences between DP and HS estimations. Moreover, DP information has been employed to measure intraocular scattering using a novel method of quantification. Finally, the configurable properties of the spherical refractive error corrector have been used to explore a method for reducing speckle in systems that rely on reflections of light in the ocular fundus.
Dentro del sistema visual, la óptica del ojo es responsable de la formación de imágenes de objetos externos en el fondo de ojo para su fotorrecepción e interpretación neuronal. Sin embargo, el ojo no es perfecto y sus capacidades pueden verse limitadas por la presencia de aberraciones o de luz dispersa. De esta manera, la cuantificación de los factores ópticos que afectan al ojo resulta importante para fines de diagnóstico y de monitoreo. En este contexto, el presente documento resume el trabajo realizado durante la implementación del sistema Multimodal Eye’s Optical Quality (MEOQ), un dispositivo de medición que integra un instrumento de doble paso (DP) y un sensor de Hartmann-Shack (HS) para proporcionar no sólo información sobre aberraciones, sino también en la dispersión que se produce en el ojo humano. Un diseño binocular de campo abierto permite evaluaciones en condiciones visuales naturales. Además, el sistema es capaz de compensar tanto errores refractivos esféricos como astigmáticos mediante el uso de dispositivos de potencia óptica configurable. El sistema MEOQ se ha utilizado para cuantificar la dispersión en el ojo humano basándose en las diferencias entre estimaciones de DP y HS. Además, la información de DP se ha empleado para medir la dispersión intraocular utilizando un nuevo método de cuantificación. Por último, las propiedades configurables del corrector de refracción esférica se han utilizado para explorar un método para la reducción de ruido speckle en sistemas basados en reflexiones de luz en el fondo ocular.
Innerhalb des visuellen Systems ist die Optik des Auges verantwortlich für die Abbildung externer Objekte auf dem Fundus des Auges, damit Licht umgewandelt und neural interpretiert wird. Dennoch ist das Auge nicht perfekt und seine Möglichkeiten sind durch Abbildungsfehler und Streuung begrenzt. Daraus ergibt sich, dass die Quantifizierung der optischen Faktoren, welche das Auge betreffen, wichtig für die Diagnose und Überwachung sind. Innerhalb dieses Rahmens fasst dieses Dokument die Arbeit zusammen, welche die Implementierung eines System zur multimodalen Bestimmung der optischen Qualität des Auges (MEOQ), bestehend aus einem Doppelpass-Instument (DP) und einem Hartmann-Shack-Sensor (HS), beschreibt, um nicht nur Informationen über Abbildungsfehler, sondern auch über Streuung im menschlichen Auge zu erhalten. Ein biokulares Freisicht-Design ermöglicht natürliche Sehverhältnisse. Darüberhinaus ist das System in der Lage sphärische und astigmatische Brechungsfehler mit einem Gerät einstellbarer optischer Leistung zu korrigieren. Das MEOQ System wurde genutzt um Streuung im menschlichen Auge mit Hilfe der Unterschiede der bschätzungen des DP und des HS zu quantifizieren. Darüberhinaus wurden die DP Informationen angewandt um intraokulare Streuung durch eine neue Methode der Quantifizierung zu messen. Schließlich wurden die konfigurierbaren Einstellungen des sphärischen Brechungsfehlerskorrektor genutzt um eine Methode zur Reduzierung von Speckle in Systemen, welche auf Reflektionen von Licht vom Fundus des Auges basieren, zu untersuchen.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
33

Sheikhi, Shoshtari Ava. „Multimodal assessment of neurodegenerative diseases“. Thesis, University of British Columbia, 2016. http://hdl.handle.net/2429/58324.

Der volle Inhalt der Quelle
Annotation:
There is growing recognition that accurate assessment of brain function includes activity at multiple temporal and spatial scales. In this thesis, we explored ways to combine clinically-relevant imaging information derived from subjects with neurodegenerative disease. In the first work, we investigated a two-step framework to determine both joint and unique biomarkers from structural and functional MRI in 18 healthy control (HC) and 12 Parkinson’s disease (PD) subjects. Three matrices (structural, functional, and structural/functional interactions) were derived from a subset of features in both modalities that were likely candidates for discrimination between PD and HC subjects. Finally, Least Absolute Shrinkage and Selection Operator (LASSO) regression was performed to determine if subjects’ clinical characteristics such as gender, smoking history, smell performance, Hoehn and Yahr Scale (H&Y Stage), and Unified Parkinson’s Disease Rating Scale (UPDRS) values, could be accurately predicted based on the imaging features. The results revealed that complementary biomarkers were most informative in predicting clinical scores in both groups. In the second work, for analyzing imaging data from subjects with Multiple Sclerosis (MS), we employed a joint Multimodal Statistical Analysis Framework, a data fusion approach that used Latent Variables (LV). We studied fusion of information from seven different imaging modalities: Myelin Water Imaging (MWI), Diffusion Tensor Imaging (DTI), resting state functional MRI (rsfMRI), cortical thickness of the right and left hemisphere, MS lesion load, and normalized brain volume from 47 subjects with MS. Decomposed common and unique information in each modality were acquired and their relationships with disease duration (DD), the Expanded Disability Status Scale (EDSS), and age, were analyzed through LASSO regression. We noted that common components of the seven modalities were the most accurate in predicting clinical indices. Results further revealed the regional importance of each modality by indicating a unique pattern of degeneration in MS and an asymmetry between the cortical thickness components in the two hemispheres. Our results demonstrate the power of utilizing multimodal imaging biomarkers in neurodegenerative diseases. Since structural imaging data is acquired along with functional data, we propose that fusion of information from both types of data should become part of routine analysis.
Applied Science, Faculty of
Graduate
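The LASSO-based prediction of clinical scores from imaging features described in this abstract can be sketched as follows; the feature matrix and score are synthetic and the pipeline is only indicative of the general technique, not the thesis code.

```python
# Illustrative LASSO regression on synthetic "imaging features" predicting a
# clinical score; not the thesis pipeline, just the general technique it describes.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
X = rng.normal(size=(30, 50))            # 30 subjects x 50 imaging-derived features
true_w = np.zeros(50)
true_w[:5] = 1.0                         # only a few features truly informative
y = X @ true_w + rng.normal(scale=0.5, size=30)   # synthetic UPDRS-like score

model = Lasso(alpha=0.1)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print("cross-validated R^2:", scores.mean())

model.fit(X, y)
print("selected features:", np.flatnonzero(model.coef_))  # sparse selection
```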
APA, Harvard, Vancouver, ISO und andere Zitierweisen
34

Black, Cory A. „Supramolecular complexes of multimodal ligands“. University of Otago. Department of Chemistry, 2007. http://adt.otago.ac.nz./public/adt-NZDU20070518.091104.

Der volle Inhalt der Quelle
Annotation:
This thesis describes the synthesis and X-ray crystallographic analysis of a series of supramolecular architectures prepared using seven flexible multimodal ligands with Ag(I), Cu(I), Cd(II), Co(II), Ni(II) and Pd(II) metal salts. Chapter one introduces some examples of fundamental supramolecular systems with particular focus on metallo-supramolecular motifs, specifically coordination polymers. Topological analysis is discussed as a method for the simplified description and comparison of network structures. Chapter two describes the design, synthesis and characterisation of the symmetrical ligands bis(2-pyrazylmethyl)sulfide (psp), bis(4-pyrimidylmethyl)sulfide (msm) and 5,5′-(thiodimethylene)di-pyrazine-2-carboxylic acid methyl ester (csc) as well as the asymmetrical ligands 2-benzylsulfanylmethyl-pyrazine (psb), 2-pyridylsulfanylmethyl-pyrazine (psd), 3-pyridylsulfanylmethyl-pyrazine (psn) and 4-pyridylsulfanylmethyl-pyrazine (psy). Chapter three presents a literature review of ligands related to psp, msm and csc, followed by the synthesis and characterization of thirteen Ag(I), Cd(II), Co(II), Ni(II) and Pd(II) complexes. The X-ray crystal structures of nine of these complexes are reported and compared. The structures were present as either one- or two-dimensional coordination polymers. The {[Ag(psp)](PF₆)}∞ and {[Ag₂(psp)(C₆H₆)(CH₃CN)₂](PF₆)₂·CH₃CN}∞ structures demonstrated a solvent dependence by forming a 1-D twisted ladder with an η-bound benzene and a 2-D undulating sheet with a 4.8² topology respectively. Six of the structures, {[Cd₂(psp)(CH₃CN)(H₂O)(NO₃)₄]·H₂O}∞, {[Co(psp)(CH₃CN)₂](ClO₄)₂}∞, {[Ni(psp)(NO₃)₂]}∞ and {[Ag(msm)](X)}∞ (X = BF₄⁻, ClO₄⁻, PF₆⁻), displayed anion-π interactions between multi-atomic anions and π-acidic ring centres. A novel N(pz)···cent(pz) T-shaped π-π interaction was also identified in the {[Ni(psp)(NO₃)₂]}∞ structure. A 2-D sheet with 6³ topology was observed in the X-ray structure of {[Ag₂(csc)](NO₃)₂}∞. Following a review of related ligands, chapter four focuses on seven Ag(I), Cd(II), Co(II) and Cu(I) complexes formed using the asymmetric pyrazine-benzene ligand psb. In total six 1-D coordination polymer chains are reported. Two structurally disparate supramolecular isomers were formed in [Ag(psb)NO₃]∞ and {[Ag₂(psb)₂NO₃]NO₃·H₂O}∞. The compound {[Ag(psb)](BF₄)}∞ was similar to the former isomer [Ag(psb)NO₃]∞. The structurally similar coordination polymers {[Cd(psb)(H₂O)(NO₃)₂]}∞ and {[Co(psb)(H₂O)₃](ClO₄)₂·H₂O}∞ formed structures that showed anion-π interactions using coordinated and non-coordinated anions respectively. The [Cu₂(psb)I₂]∞ chain consisted of ligands linked together by a Cu₄I₄ stepped cubane tetramer. Chapter five presents seventeen Ag(I) and Cu(I) complexes prepared using the three asymmetric pyrazine-pyridine ligands psd, psn and psy. A review of asymmetric pyrazine-pyridine ligands is provided. Seventeen X-ray crystal structures are described. Four psd complexes using AgBF₄, AgClO₄, AgNO₃ and AgPF₆ crystallised as discrete dimers with three types of crystal packing and ligand-supported Ag···Ag interactions. The complexes {[Ag₂(psd)₂CF₃SO₃]CF₃SO₃}∞ and {Cu₂(psd)I₂}∞ were a 1-D X-shaped chain and a 2-D 6³ net respectively.
The isostructural 2-D sheets in {[Ag(psn)]ClO₄}∞ and {[Ag(psn)]PF₆·CH₃CN}∞ had 4.8² topologies, whereas a thicker sheet was formed in {[Ag₂(psn)₂](BF₄)₂}∞ with a more complicated topology. The {[Ag₃(psn)₂](CF₃SO₃)₃·CH₃CN}∞ chain polymer displayed three different coordination geometries around the three Ag(I) centres with two ligand-unsupported Ag···Ag interactions. The complex [Cu₂(psn)₂I₂] crystallised as a discrete dimer with a different ligand arrangement than those found in the psd dimers. Six Ag(I) 3-D networks were formed using psy. The complexes {[Ag(psy)]X}∞ (X = BF₄, ClO₄, PF₆) formed as isostructural non-interpenetrated (10,3)-d networks. An unprecedented tri-nodal topology was observed in the {[Ag₂(psy)₂](CF₃SO₃)₂}∞ structure. The supramolecular isomers {[Ag₃(psy)₂(NO₃)₂]NO₃}∞ and {[Ag₃(psy)₂(NO₃)₃]·H₂O}∞ formed inclined interpenetrated 6³ sheets and a 3-D network respectively. The structures in this chapter showed a general trend of increasing dimensionality when progressing from psd to psn to psy. Chapter six presents a summary of the more significant results and concluding remarks.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
35

Treviranus, Jutta. „Multimodal access to written communication“. Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ28724.pdf.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
36

Ropinski, Timo, Ivan Viola, Martin Biermann, Helwig Hauser und Klaus Hinrichs. „Multimodal Visualization with Interactive Closeups“. University of Münster, Germany, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-93205.

Der volle Inhalt der Quelle
Annotation:
Closeups are used in illustrations to provide detailed views of regions of interest. They are integrated into the rendering of the whole structure in order to reveal their spatial context. In this paper we present the concept of interactive closeups for medical reporting. Each closeup is associated with a region of interest and may show a single modality or a desired combination of the available modalities using different visualization styles. Thus it becomes possible to visualize multiple modalities simultaneously and to support doctor-to-doctor communication on the basis of interactive multimodal closeup visualizations. We discuss how to compute a layout for 2D and 3D closeups, and how to edit a closeup configuration to prepare a presentation or a subsequent doctor-to-doctor communication. Furthermore, we introduce a GPU-based rendering algorithm which renders multiple closeups at interactive frame rates. We demonstrate the application of the introduced concepts to multimodal PET/CT data sets additionally co-registered with MRI.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
37

Correia, Rose Mary. „Legal aspects of multimodal telecommunications“. Thesis, McGill University, 1995. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=23309.

Der volle Inhalt der Quelle
Annotation:
The telecommunications industry is being shaped by technological and market developments, and is moving into the 21st Century. The telecommunications technology of the future is integrated services digital networks. ISDN, which is the concept for a future digital telecommunications network for delivery of a wide range of innovative voice, data and video services through satellite systems and the national information highways being developed in several countries, will lead to a Global Information Infrastructure. ISDN development will pose challenges to traditional telecommunications regulation, lead to increased multimodal competition between ground and space-based transmission systems, and erode INTELSAT's market base since future digital ISDN systems will be interchangeable with satellite systems.
This study begins in Chapter I with an examination of the emerging technologies and recent market trends which challenge traditional regulation, as well as the importance of upholding regulation in the emerging ISDN telecommunications environment. Chapter II discusses the recent market developments in Canada, the legal implications of emerging technologies for the current regulatory regime, and the need for comprehensive policy and regulation. Chapter III discusses the role of satellites in the emerging global ISDN environment, the mandate of INTELSAT in terms of spectrum/orbit resource management, the regulation of multimodal telecommunications under the INTELSAT Agreement, the challenges to INTELSAT represented by ISDN development, the role of the ITU in the regulation of the emerging global ISDN environment, and the future of INTELSAT in light of competition, technological progress, and regulatory trends. This is followed by a conclusion in Chapter IV.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
38

Fatukasi, Omolara O. „Multimodal fusion of biometric experts“. Thesis, University of Surrey, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.493242.

Der volle Inhalt der Quelle
Annotation:
Person authentication is the process of confirming or determining a person's identity. Its purpose is to ensure that a system can only be accessed by authorised users. The biometric method uses a person's physical or behavioural characteristics. The use of biometric characteristics is increasingly popular as it makes unauthorised access more difficult.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
39

Lingam, Sumanth (Sumanth Kumar) 1978. „User interfaces for multimodal systems“. Thesis, Massachusetts Institute of Technology, 2001. http://hdl.handle.net/1721.1/8614.

Der volle Inhalt der Quelle
Annotation:
Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Civil and Environmental Engineering, 2001.
Includes bibliographical references (leaves 68-69).
As computer systems become more powerful and complex, efforts to make computer interfaces simpler and more natural become increasingly important. Natural interfaces should be designed to facilitate communication in ways people are already accustomed to using. Such interfaces allow users to concentrate on the tasks they are trying to accomplish rather than on what they must do to control the interface. Multimodal systems process combined natural input modes – such as speech, pen, touch, manual gestures, gaze, and head and body movements – in a coordinated manner with multimedia system output. The W3C initiative aims to make interface development simple and to make it easy to distribute applications across the Internet in an XML development environment. The languages designed at the W3C so far, such as HTML, target a particular platform and are not portable to other platforms. The User Interface Markup Language (UIML) has been designed to develop cross-platform interfaces. It is shown in this thesis that UIML can be used not only to develop multi-platform interfaces but also to create multimodal interfaces. A survey of existing multimodal applications is performed and an efficient, easy-to-develop methodology is proposed. It is also shown that the proposed methodology satisfies a major set of requirements laid down by the W3C for multimodal dialogs.
by Sumanth Lingam.
M.Eng.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
40

Adler, Aaron D. (Aaron Daniel) 1979. „MIDOS : Multimodal Interactive DialOgue System“. Thesis, Massachusetts Institute of Technology, 2009. http://hdl.handle.net/1721.1/52776.

Der volle Inhalt der Quelle
Annotation:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (p. 239-243).
Interactions between people are typically conversational, multimodal, and symmetric. In conversational interactions, information flows in both directions. In multimodal interactions, people use multiple channels. In symmetric interactions, both participants communicate multimodally, with the integration of and switching between modalities basically effortless. In contrast, consider typical human-computer interaction. It is almost always unidirectional – we're telling the machine what to do; it's almost always unimodal (can you type and use the mouse simultaneously?); and it's symmetric only in the disappointing sense that when you type, it types back at you. There are a variety of things wrong with this picture. Perhaps chief among them is that if communication is unidirectional, it must be complete and unambiguous, exhaustively anticipating every detail and every misinterpretation. In brief, it's exhausting. This thesis examines the benefits of creating multimodal human-computer dialogues that employ sketching and speech, aimed initially at the task of describing early stage designs of simple mechanical devices. The goal of the system is to be a collaborative partner, facilitating design conversations. Two initial user studies provided key insights into multimodal communication: simple questions are powerful, color choices are deliberate, and modalities are closely coordinated. These observations formed the basis for our multimodal interactive dialogue system, or Midos. Midos makes possible a dynamic dialogue, i.e., one in which it asks questions to resolve uncertainties or ambiguities.
(cont.) The benefits of a dialogue in reducing the cognitive overhead of communication have long been known. We show here that having the system able to ask questions is good, but for an unstructured task like describing a design, knowing what questions to ask is crucial. We describe an architecture that enables the system to accept partial information from the user, then request details it considers relevant, noticeably lowering the cognitive overhead of communicating. The multimodal questions Midos asks are in addition purposefully designed to use the same multimodal integration pattern that people exhibited in our study. Our evaluation of the system showed that Midos successfully engages the user in a dialogue and produces the same conversational features as our initial human-human conversation studies.
by Aaron Daniel Adler.
Ph.D.
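The system-initiative behaviour described in this abstract, accepting partial information and asking only about the details that matter, can be illustrated with a minimal slot-filling loop; this is a hypothetical sketch, not the Midos architecture.

```python
# Hypothetical sketch of a dialogue manager that accepts partial information and
# asks targeted questions about missing, relevant details (not the Midos code).
REQUIRED_SLOTS = ["component", "motion", "connection"]

def ask_missing(description: dict) -> list[str]:
    questions = {
        "component": "Which mechanical component are you describing?",
        "motion": "How should it move (rotate, slide, oscillate)?",
        "connection": "What is it attached to?",
    }
    return [questions[s] for s in REQUIRED_SLOTS if s not in description]

partial = {"component": "pendulum"}          # user sketched/spoke only this much
for q in ask_missing(partial):
    print("SYSTEM:", q)                      # system takes initiative in the dialogue
```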
APA, Harvard, Vancouver, ISO und andere Zitierweisen
41

Zhao, Yang. „Multimodal transport and competing regimes“. Thesis, University of Cambridge, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.612135.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
42

Mukherjee, Sankha Subhra. „Multimodal headpose estimation and applications“. Thesis, Heriot-Watt University, 2017. http://hdl.handle.net/10399/3338.

Der volle Inhalt der Quelle
Annotation:
This thesis presents new research into human headpose estimation and its applications in multi-modal data. We develop new methods for head pose estimation spanning RGB-D Human Computer Interaction (HCI) to far away "in the wild" surveillance quality data. We present the state-of-the-art solution in both head detection and head pose estimation through a new end-to-end Convolutional Neural Network architecture that reuses all of the computation for detection and pose estimation. In contrast to prior work, our method successfully spans close up HCI to low-resolution surveillance data and is cross modality: operating on both RGB and RGB-D data. We further address the problem of limited amounts of standard data and differing annotation quality through semi-supervised learning and novel data augmentation. (This latter contribution also finds application in the domain of life sciences.) We report the highest accuracy by a large margin: a 60% improvement; and demonstrate leading performance on multiple standardized datasets. In HCI we reduce the angular error by 40% relative to the previously reported literature. Furthermore, by defining a probabilistic spatial gaze model from the head pose we show application in human-human and human-scene interaction understanding. We present state-of-the-art results on the standard interaction datasets. A new metric to model "social mimicry" through the temporal correlation of the headpose signal is contributed and shown to be valid qualitatively and intuitively. As an application in surveillance, it is shown that with the robust headpose signal as a prior, state-of-the-art results in tracking under occlusion using a Kalman filter can be achieved. This model is named the Intentional Tracker and it improves visual tracking metrics by up to 15%. We also apply the ALICE loss that was developed for the end-to-end detection and classification to dense classification of underwater coral reef imagery. The objective of this work is to solve the challenging task of recognizing and segmenting underwater coral imagery in the wild with sparse point-based ground truth labelling. To achieve this, we propose an integrated Fully Convolutional Neural Network (FCNN) and Fully-Connected Conditional Random Field (CRF) based classification and segmentation algorithm. Our major contributions lie in four areas. First, we show that multi-scale crop based training is useful in learning the initial weights in the canonical one-class classification problem. Second, we propose a modified ALICE loss for training the FCNN on sparse labels with class imbalance and establish its significance empirically. Third, we show that by artificially enhancing the point labels to small regions based on a class distance transform, we can improve the classification accuracy further. Fourth, we improve the segmentation results using fully connected CRFs with a bilateral message passing prior. We improve upon state-of-the-art results on all publicly available datasets by a significant margin.
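As a toy version of the "social mimicry" measure mentioned above, the temporal coupling between two head-yaw signals can be probed with a lagged correlation; the signals below are synthetic and the code is not the thesis implementation.

```python
# Toy version of a mimicry-style measure: lagged correlation between two
# synthetic head-yaw time series (illustrative only, not the thesis metric).
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(300)
yaw_a = np.sin(t / 20.0) + 0.1 * rng.normal(size=t.size)
yaw_b = np.sin((t - 15) / 20.0) + 0.1 * rng.normal(size=t.size)  # follows A with a delay

def lagged_corr(a, b, lag):
    if lag > 0:
        a, b = a[:-lag], b[lag:]
    return np.corrcoef(a, b)[0, 1]

lags = range(0, 40)
best = max(lags, key=lambda L: lagged_corr(yaw_a, yaw_b, L))
print("strongest coupling at lag", best, "frames")
```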
APA, Harvard, Vancouver, ISO und andere Zitierweisen
43

Upadhaya, Taman. „Multimodal radiomics in neuro-oncology“. Thesis, Brest, 2017. http://www.theses.fr/2017BRES0036/document.

Der volle Inhalt der Quelle
Annotation:
Le glioblastome multiforme (GBM) est une tumeur de grade IV représentant 49% de toutes les tumeurs cérébrales. Malgré des modalités de traitement agressives (radiothérapie, chimiothérapie et résection chirurgicale), le pronostic est mauvais avec une survie globale médiane de 12 à 14 mois. Les aractéristiques issues de la neuro imagerie des GBM peuvent fournir de nouvelles opportunités pour la classification, le pronostic et le développement de nouvelles thérapies ciblées pour faire progresser la pratique clinique. Cette thèse se concentre sur le développement de modèles pronostiques exploitant des caractéristiques de radiomique extraites des images multimodales IRM (T1 pré- et post-contraste, T2 et FLAIR). Le contexte méthodologique proposé consiste à i) recaler tous les volumes multimodaux IRM disponibles et en segmenter un volume tumoral unique, ii) extraire des caractéristiques radiomiques et iii) construire et valider les modèles pronostiques par l’utilisation d’algorithmes d’apprentissage automatique exploitant des cohortes cliniques multicentriques de patients. Le coeur des méthodes développées est fondé sur l’extraction de radiomiques (incluant des paramètres d’intensité, de forme et de textures) pour construire des modèles pronostiques à l’aide de deux algorithmes d’apprentissage, les machines à vecteurs de support (support vector machines, SVM) et les forêts aléatoires (random forest, RF), comparées dans leur capacité à sélectionner et combiner les caractéristiques optimales. Les bénéfices et l’impact de plusieurs étapes de pré-traitement des images IRM (re-échantillonnage spatial des voxels, normalisation, segmentation et discrétisation des intensités) pour une extraction de métriques fiables ont été évalués. De plus les caractéristiques radiomiques ont été standardisées en participant à l’initiative internationale de standardisation multicentrique des radiomiques. La précision obtenue sur le jeu de test indépendant avec les deux algorithmes d’apprentissage SVM et RF, en fonction des modalités utilisées et du nombre de caractéristiques combinées atteignait 77 à 83% en exploitant toutes les radiomiques disponibles sans prendre en compte leur fiabilité intrinsèque, et 77 à 87% en n’utilisant que les métriques identifiées comme fiables.Dans cette thèse, un contexte méthodologique a été proposé, développé et validé, qui permet la construction de modèles pronostiques dans le cadre des GBM et de l’imagerie multimodale IRM exploitée par des algorithmes d’apprentissage automatique. Les travaux futurs pourront s’intéresser à l’ajout à ces modèles des informations contextuelles et génétiques. D’un point de vue algorithmique, l’exploitation de nouvelles techniques d’apprentissage profond est aussi prometteuse
Glioblastoma multiforme (GBM) is a WHO grade IV tumor that represents 49% of all brain tumours. Despite aggressive treatment modalities (radiotherapy, chemotherapy and surgical resection) the prognosis is poor, with a median overall survival (OS) of 12-14 months. GBM's non-invasive neuroimaging features can provide opportunities for subclassification, prognostication, and the development of targeted therapies that could advance clinical practice. This thesis focuses on developing a prognostic model based on multimodal MRI-derived (T1 pre- and post-contrast, T2 and FLAIR) radiomics in GBM. The proposed methodological framework consists in i) registering the available 3D multimodal MR images and segmenting the tumor volume, ii) extracting radiomics features, and iii) building and validating a prognostic model using machine learning algorithms applied to multicentric clinical cohorts of patients. The core of the framework relies on extracting radiomics (including intensity, shape and textural metrics) and building prognostic models with two machine learning algorithms, Support Vector Machines (SVM) and Random Forests (RF), which were compared in their ability to select, rank and combine optimal features. The potential benefits and respective impact of several MRI pre-processing steps (spatial resampling of the voxels, intensity quantization and normalization, segmentation) on reliable extraction of radiomics were thoroughly assessed. Moreover, the radiomics features were standardized across methodological teams by contributing to the Multicentre Initiative for Standardisation of Radiomics. The accuracy obtained on the independent test dataset using SVM and RF reached 77-83% when combining all available features and 77-87% when using only features previously identified as reliable, depending on the number of features and the modality. In this thesis, I developed a framework for building a comprehensive prognostic model for patients with GBM from multimodal MRI-derived radiomics and machine learning. Future work will consist in building a unified prognostic model exploiting other contextual data such as genomics, as well as ensemble models and deep learning-based techniques.
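The model-building step, radiomics features fed to SVM and Random Forest classifiers after feature selection, might look roughly like the following sketch on synthetic data; it is indicative only and not the framework developed in the thesis.

```python
# Rough sketch of radiomics-style model building with SVM and Random Forest on
# synthetic features (illustrative only; not the framework developed in the thesis).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 100))        # 60 patients x 100 radiomics features
y = rng.integers(0, 2, size=60)       # synthetic binary survival label

svm = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=10), SVC(kernel="rbf"))
rf = make_pipeline(SelectKBest(f_classif, k=10), RandomForestClassifier(n_estimators=200))

for name, clf in [("SVM", svm), ("Random Forest", rf)]:
    acc = cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
    print(name, "accuracy:", round(acc, 2))
```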
APA, Harvard, Vancouver, ISO und andere Zitierweisen
44

Valero-Mas, Jose J. „Towards Interactive Multimodal Music Transcription“. Doctoral thesis, Universidad de Alicante, 2017. http://hdl.handle.net/10045/71275.

Der volle Inhalt der Quelle
Annotation:
Computer-based music transcription is of vital importance for tasks in the so-called field of Music Information Retrieval, given its usefulness as a process for obtaining a symbolic abstraction that encodes the musical content of an audio file. This dissertation studies the problem from a perspective different from the one typically considered for such problems: the interactive and multimodal perspective. In this paradigm the user takes on special importance, being an active participant in solving the problem (interactivity); multimodality, in turn, means that different sources of information extracted from the same signal are combined to help solve the task better.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
45

Rezaei, Masoud. „Multimodal implantable neural interfacing microsystem“. Doctoral thesis, Université Laval, 2019. http://hdl.handle.net/20.500.11794/36437.

Der volle Inhalt der Quelle
Annotation:
Afin d’étudier le cerveau humain dans le but d’aider les patients souffrant de maladies neurologiques, on a besoin d’une interface cérébrale entièrement implantable pour permettre l’accès direct aux neurones et enregistrer et analyser l’activité neuronale. Dans cette thèse, des interfaces cerveau-machine implantables (IMC) à très faible puissance basées sur plusieurs circuits et innovations de systèmes ont été étudiées pour être utilisées comme analyseur neuronal. Un tel système est destiné à recueillir l’activité neuronale émise par centaines de neurones tout en les activant à la demande en utilisant des moyens d’actionnement tels que l’électro- et / ou la photo-stimulation. Un tel système doit fournir plusieurs canaux d’enregistrement, tout en consommant très peu d’énergie, et présente une taille extrêmement réduite pour la sécurité et la biocompatibilité. Typiquement, un microsystème d’interfaçage avec le cerveau comprend plusieurs blocs, tels qu’un bloc analogique d’acquisition (AFE), un convertisseur analogique-numérique (ADC), des modules de traitement de signal numérique et un émetteur-récepteur de données sans fil. Un IMC extrait les signaux neuronaux du bruit, les numérise et les transmet à une station de base sans interférer avec le comportement naturel du sujet. Cette thèse se concentre sur les blocs analogiques d’acquisition à très faible consommation à utiliser dans l’IMC. Cette thèse présente des frontaux avec plusieurs stratégies innovantes pour consommer moins d’énergie tout en permettant des données de haute résolution et de haute qualité. Premièrement, nous présentons une nouvelle structure frontale utilisant un schéma de réutilisation du courant. Cette structure est extensible à un très grand nombre de canaux d’enregistrement, grâce à sa petite taille de silicium et à sa faible consommation d’énergie. L’AFE à réutilisation de courant proposée, qui comprend un amplificateur à faible bruit (LNA) et un amplificateur à gain programmable (PGA), utilise une nouvelle topologie de miroir de courant entièrement différentielle utilisant moins de transistors et améliorant plusieurs paramètres de conception, tels que la consommation d’énergie et du bruit, par rapport aux mises en oeuvre de circuit d’amplificateur de réutilisation de courant précédentes. Ensuite, dans la deuxième partie de cette thèse, nous proposons un nouveau convertisseur sigmadelta multicanal qui convertit plusieurs canaux indépendamment en utilisant un seul amplificateur et plusieurs condensateurs de stockage de charge. Par rapport aux techniques conventionnelles, cette méthode applique un nouveau schéma de multiplexage entrelacé, qui ne nécessite aucune phase de réinitialisation pour l’intégrateur lors du passage à un nouveau canal, ce qui améliore sa résolution. Lorsque la taille des puces n’est pas une priorité, d’autres approches peuvent être plus attrayantes, et nous proposons une nouvelle stratégie d’économie d’énergie basée sur un nouveau convertisseur sigma-delta à très basse consommation conçu pour réduire la consommation d’énergie. Ce nouveau convertisseur utilise une architecture basse tension basée sur une topologie prédictive innovante qui minimise la non-linéarité associée à l’alimentation basse tension.
Studying brain functionality to help patients suffering from neurological diseases needs fully implantable brain interface to enable access to neural activities as well as read and analyze them. In this thesis, ultra-low power implantable brain-machine-interfaces (BMIs) that are based on several innovations on circuits and systems are studied for use in neural recording applications. Such a system is intended to collect information on neural activity emitted by several hundreds of neurons, while activating them on demand using actuating means like electro- and/or photo-stimulation. Such a system must provide several recording channels, while consuming very low energy, and have an extremely small size for safety and biocompatibility. Typically, a brain interfacing microsystem includes several building blocks, such as an analog front-end (AFE), an analog-to-digital converter (ADC), digital signal processing modules, and a wireless data transceiver. A BMI extracts neural signals from noise, digitizes them, and transmits them to a base station without interfering with the natural behavior of the subject. This thesis focuses on ultra-low power front-ends to be utilized in a BMI, and presents front-ends with several innovative strategies to consume less power, while enabling high-resolution and high-quality of data. First, we present a new front-end structure using a current-reuse scheme. This structure is scalable to huge numbers of recording channels, owing to its small implementation silicon area and its low power consumption. The proposed current-reuse AFE, which includes a low-noise amplifier (LNA) and a programmable gain amplifier (PGA), employs a new fully differential current-mirror topology using fewer transistors. This is an improvement over several design parameters, in terms of power consumption and noise, over previous current-reuse amplifier circuit implementations. In the second part of this thesis, we propose a new multi-channel sigma-delta converter that converts several channels independently using a single op-amp and several charge storage capacitors. Compared to conventional techniques, this method applies a new interleaved multiplexing scheme, which does not need any reset phase for the integrator while it switches to a new channel; this enhances its resolution. When the chip area is not a priority, other approaches can be more attractive, and we propose a new power-efficient strategy based on a new in-channel ultra-low power sigma-delta converter designed to decrease further power consumption. This new converter uses a low-voltage architecture based on an innovative feed-forward topology that minimizes the nonlinearity associated with low-voltage supply.
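The sigma-delta principle behind the converters discussed above can be simulated with the textbook first-order model below; this is an idealised digital illustration, not the analog, multi-channel circuits designed in the thesis.

```python
# Textbook first-order sigma-delta modulator: integrate the error between the
# input and the 1-bit feedback, then quantise. Illustrative only; the thesis
# circuits are analog, multi-channel and far more elaborate.
import numpy as np

fs, f_in, n = 100_000, 440.0, 4096
t = np.arange(n) / fs
x = 0.6 * np.sin(2 * np.pi * f_in * t)     # analog input, amplitude below full scale

v, y_prev = 0.0, 0.0
bits = np.empty(n)
for i, sample in enumerate(x):
    v += sample - y_prev                   # integrate the error signal
    y_prev = 1.0 if v >= 0 else -1.0       # 1-bit quantiser with feedback
    bits[i] = y_prev

# A moving average acts as a crude decimation filter to recover the input.
recovered = np.convolve(bits, np.ones(64) / 64, mode="same")
print("correlation with input:", round(np.corrcoef(x, recovered)[0, 1], 3))
```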
APA, Harvard, Vancouver, ISO und andere Zitierweisen
46

Bensaid, Eden. „Multimodal generative models for storytelling“. Thesis, Massachusetts Institute of Technology, 2021. https://hdl.handle.net/1721.1/130680.

Der volle Inhalt der Quelle
Annotation:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2021
Cataloged from the official PDF of thesis.
Includes bibliographical references (pages 41-45).
Storytelling is an open-ended task that entails creative thinking and requires a constant flow of ideas. Generative models have recently gained momentum thanks to their ability to identify complex data's inner structure and learn efficiently from unlabeled data [34]. Natural language generation (NLG) for storytelling is especially challenging because it requires the generated text to follow an overall theme while remaining creative and diverse to engage the reader [26]. Competitive story generation models still suffer from repetition [19], are unable to consistently condition on a theme [51] and struggle to produce a grounded, evolving storyboard [43]. Published story visualization architectures that generate images require a descriptive text to depict the scene to illustrate [30]. Therefore, it seems promising to evaluate an interactive multimodal generative platform that collaborates with writers to face the complex story-generation task. With co-creation, writers contribute their creative thinking, while generative models contribute to their constant workflow. In this work, we introduce a system and a web-based demo, FairyTailor¹, for machine-in-the-loop visual story co-creation. Users can create a cohesive children's story by weaving generated texts and retrieved images with their input. FairyTailor adds another modality and modifies the text generation process to produce a coherent and creative sequence of text and images. To our knowledge, this is the first dynamic tool for multimodal story generation that allows interactive co-creation of both texts and images. It allows users to give feedback on co-created stories and share their results. We release the demo source code² for other researchers' use.
by Eden Bensaid.
M. Eng.
M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
APA, Harvard, Vancouver, ISO und andere Zitierweisen
47

Klimas, Matthew L. „Argent Sound Recordings: Multimodal Storytelling“. VCU Scholars Compass, 2008. http://scholarscompass.vcu.edu/etd/795.

Der volle Inhalt der Quelle
Annotation:
ARGENT SOUND RECORDINGS explores the integration of visual, written and sonic elements to tell a story. "The Silver Bell," a fairy tale, is delivered through the internet – providing users an opportunity to experience and interpret a constructed narrative under the guise of an independent record label website.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
48

Rönnqvist, Kim. „Multimodal characterisation of sensorimotor oscillations“. Thesis, Aston University, 2013. http://publications.aston.ac.uk/19564/.

Der volle Inhalt der Quelle
Annotation:
The studies in this project investigated the ongoing neuronal network oscillatory activity found in the sensorimotor cortex using two modalities: magnetoencephalography (MEG) and in vitro slice recordings. The results establish that ongoing sensorimotor oscillations span the mu and beta frequency range both in vitro and in MEG recordings, with a distinct frequency profile for each recorded lamina in vitro, while MI and SI show less difference in humans. In addition, these studies show that connections between MI and SI modulate the ongoing neuronal network activity in these areas. The stimulation studies indicate that specific stimulation frequencies affect the ongoing activity in the sensorimotor cortex. The continuous theta-burst stimulation (cTBS) study demonstrates that cTBS predominantly enhances the power of the local ongoing activity. The stimulation studies in this project allow only limited comparison between modalities, which is informative about the role of connectivity in these effects. Independently, however, these studies provide novel information on the mechanisms of sensorimotor oscillatory interaction. The pharmacological studies reveal that GABAergic modulation with zolpidem changes the neuronal oscillatory network activity in both healthy and pathological MI. Zolpidem enhances the power of ongoing oscillatory activity both in sensorimotor laminae and in healthy subjects. In contrast, zolpidem attenuates the "abnormal" beta oscillatory activity in the affected hemisphere of Parkinsonian patients, restoring the hemispheric beta power ratio and frequency variability and thereby improving motor symptomatology. Finally, we show that independent signals from MI laminae can be integrated in silico to resemble the aggregate MEG MI oscillatory signal. This highlights the usefulness of combining the two methods when elucidating neuronal network oscillations in the sensorimotor cortex and the effects of any interventions.
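As a loose illustration of the kind of band-power analysis this abstract refers to (mu/beta power and the hemispheric beta power ratio), the following Python sketch uses a Welch spectral estimate. It is not the thesis pipeline; the band edges, sampling rate, and channel names are assumptions.

import numpy as np
from scipy.signal import welch

MU_BAND = (8.0, 13.0)     # Hz, assumed band edges
BETA_BAND = (13.0, 30.0)  # Hz, assumed band edges

def band_power(signal, fs, band, nperseg=1024):
    # Integrate the Welch power spectral density over the requested band.
    freqs, psd = welch(signal, fs=fs, nperseg=nperseg)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return np.trapz(psd[mask], freqs[mask])

def hemispheric_beta_ratio(mi_left, mi_right, fs):
    # Ratio of beta-band power between the two motor-cortex (MI) recordings.
    return band_power(mi_left, fs, BETA_BAND) / band_power(mi_right, fs, BETA_BAND)

# Illustrative usage with synthetic signals containing a 20 Hz (beta) component.
fs = 600.0
t = np.arange(int(60 * fs)) / fs
rng = np.random.default_rng(0)
mi_left = np.sin(2 * np.pi * 20 * t) + rng.standard_normal(t.size)
mi_right = 0.5 * np.sin(2 * np.pi * 20 * t) + rng.standard_normal(t.size)
print(hemispheric_beta_ratio(mi_left, mi_right, fs))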
APA, Harvard, Vancouver, ISO und andere Zitierweisen
49

Dickson-LaPrade, Daniel. „Charles Darwin’s Multimodal Scientific Invention“. Research Showcase @ CMU, 2016. http://repository.cmu.edu/dissertations/882.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
50

McGee, David R. „Augmenting environments with multimodal interaction“. Full text open access, 2003. http://content.ohsu.edu/u?/etd,222.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen