
Dissertations / Theses on the topic 'Datamining'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Datamining.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Collazos, Linares Kathya Silvia. "Aspectos teóricos do Datamining." Florianópolis, SC, 2003. http://repositorio.ufsc.br/xmlui/handle/123456789/84645.

Full text
Abstract:
Doctoral thesis, Universidade Federal de Santa Catarina, Centro Tecnológico, Graduate Program in Electrical Engineering. This work presents a study of the Knowledge Discovery in Databases (KDD) method applied to medicine, which allowed theoretical aspects of the method to be defined. KDD is a research field of …
2

Popelka, Aleš. "Datamining - theory and its application." Master's thesis, Vysoká škola ekonomická v Praze, 2012. http://www.nusl.cz/ntk/nusl-164981.

Full text
Abstract:
This thesis deals with the technology called data mining. First, the thesis describes data mining as an independent discipline, then its processing methods and most common uses. The term is then explained with the help of methodologies describing all parts of the knowledge discovery in databases process: CRISP-DM and SEMMA. The study's purpose is to present data mining methods and particular algorithms: decision trees, neural networks and genetic algorithms. This theoretical introduction is followed by a practical application searching for the causes of meningoencephalitis development in a sample of patients. Decision trees in the Clementine system, one of the top data mining tools, were used for the analysis.
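For readers unfamiliar with the technique, a minimal decision-tree classification in the spirit of this thesis can be sketched with scikit-learn instead of Clementine; the patient features and data below are entirely synthetic, invented for illustration.

```python
# Illustrative only: a decision-tree classifier as used in the thesis,
# with scikit-learn standing in for Clementine. Data and features are synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.normal(38.5, 1.0, n),   # body temperature (invented feature)
    rng.integers(0, 2, n),      # headache yes/no (invented feature)
    rng.normal(12, 4, n),       # white blood cell count (invented feature)
])
# Synthetic label loosely tied to the features, just to make the tree non-trivial.
y = ((X[:, 0] > 38.5) & (X[:, 2] > 12)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
print(export_text(clf, feature_names=["temp", "headache", "wbc"]))
```

The printed tree is the kind of readable rule structure that makes decision trees attractive for medical analyses like the one described.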
3

Doversten, Martin. "Log Search : En form av datamining." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-159179.

Full text
Abstract:
This report examines the possibility of optimizing the troubleshooting of log files generated during the construction of new Volvo trucks. Errors occur when CAD models are stored and managed by the versioning system PDMLink. A new diagnostic tool, Log Search, automates the troubleshooting process and thereby streamlines the current manual search.
4

Clerc, Frédéric. "Optimization and datamining for catalysts library design." Lyon 1, 2006. http://www.theses.fr/2006LYO10157.

Full text
Abstract:
For designing and screening virtual libraries of catalysts, computer techniques are used. The most common difficulties encountered can be overcome with the help of meta-modeling algorithms. In this thesis, we describe precisely these methods, which hybridize optimization with data mining. Computer experiments demonstrate the superiority of meta-modeling compared to classic methods. Moreover, on the basis of several case studies, we explain how to tune optimization and learning parameters efficiently. The conclusions show that this technique is very efficient for virtual library design: important guidelines are found and costs are minimized. To obtain these results, we developed the OptiCat software. Thanks to an intuitive graphical user interface, the user can easily tune and run the most complex optimization algorithms within seconds. OptiCat and its source code are downloadable free of charge at http://chirouble.univ-lyon2.fr/~fclerc/
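The general idea behind meta-modeling (a cheap surrogate model guides where an expensive objective is evaluated next) can be sketched generically; this is not OptiCat's algorithm, and the objective function below is a stand-in for a real catalyst evaluation.

```python
# A generic surrogate-assisted optimization loop (not OptiCat's algorithm):
# fit a cheap model on evaluated points, then evaluate the most promising
# candidate from a random pool on the expensive objective.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def expensive_objective(x):            # stand-in for a catalyst evaluation
    return -np.sum((x - 0.3) ** 2)

rng = np.random.default_rng(0)
X = rng.random((10, 4))                # initial design, 4 composition variables
y = np.array([expensive_objective(x) for x in X])

for _ in range(20):
    surrogate = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    pool = rng.random((500, 4))        # cheap candidate compositions
    best = pool[np.argmax(surrogate.predict(pool))]
    X = np.vstack([X, best])
    y = np.append(y, expensive_objective(best))

print("best value found:", y.max(), "at", X[np.argmax(y)].round(3))
```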
5

Luca, Venegas Mauricio Pascual de. "Plan para enfocar las campañas bancarias utilizando datamining." Tesis, Universidad de Chile, 2006. http://www.repositorio.uchile.cl/handle/2250/102838.

Full text
6

Straková, Kristýna. "Datamining a využití rozhodovacích stromů při tvorbě Scorecards." Master's thesis, Vysoká škola ekonomická v Praze, 2014. http://www.nusl.cz/ntk/nusl-201627.

Full text
Abstract:
The thesis presents a comparison of several modeling methods used by financial institutions for (not exclusively) decision-making processes. The first, theoretical part describes well-known modeling methods such as logistic regression, decision trees, neural networks, alternating decision trees and the relatively new method called "random forest". The practical part outlines some processes within financial institutions in which the selected modeling methods are used. On real data from two financial institutions, logistic regression, decision trees and random forests are compared with each other. Neural networks are not included because of how difficult they are to interpret. In conclusion, based on the resulting models, the thesis tries to answer whether logistic regression (the method most widely used by financial institutions) remains the most suitable.
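A hedged sketch of the kind of comparison the thesis performs: fit logistic regression, a decision tree and a random forest on the same imbalanced binary target and compare AUC. The data here is synthetic; the study itself used real data from financial institutions.

```python
# Sketch of the model comparison on synthetic, imbalanced data
# (the thesis used real financial institutions' data).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=5000, n_features=12, weights=[0.9], random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(max_depth=5, random_state=1),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=1),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```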
7

Espinoza, Torres José Angel. "Planeamiento estratégico del proyecto Pukaqaqa mediante el método del tajo abierto utilizando soluciones datamine." Bachelor's thesis, Pontificia Universidad Católica del Perú, 2017. http://tesis.pucp.edu.pe/repositorio/handle/123456789/9032.

Full text
Abstract:
The central objective of this thesis is to obtain the highest possible Net Present Value (NPV) for the "Pukaqaqa" copper project, as a result of designing and optimizing the pit in question under given conditions, through the application of geological and metallurgical data, economic parameters, and more. The analysis of the tonnage-grade curves is then compared with the reports obtained from runs of the Net Present Value Scheduler 4 software (hereafter NPVS 4), the main tool used to meet the general objective of the thesis. The first chapter describes some general concepts of pit optimization, as well as various criteria for carrying it out, with emphasis on the Lerchs-Grossmann algorithm and basic considerations for pit design, and briefly describes the methodology followed to obtain the desired results. The second chapter describes the general characteristics of the "Pukaqaqa" project, from its geology and location to a description of the block model to be used. It then details the steps for optimizing the project, starting with data import and followed by pit optimization, in which the optimal pit is analyzed and selected from the nested pits obtained from the run in the NPVS 4 planning software. The tonnage-grade curve of the results is produced for analysis and, subsequently, the mining phases of the project are generated, again using NPVS 4 together with restriction and control parameters such as the minimum mining width and the minimum tonnage per phase. Finally, taking into account the central objective of the plant and restrictions such as the strip ratio (waste/ore ratio), the mine production schedule is generated. The third chapter focuses on the design of the final or optimal pit, detailing all the design parameters taken into account for the final result; in addition, the phase designs are created, showing how the pit is mined until it reaches its final walls, also called the Operational Pit. The fourth chapter develops the final optimization of the pit using the Operational Pit; after this optimization, a new tonnage-grade curve is generated, analyzed, and compared with the one detailed in the second chapter. This chapter also details the generation of the new mine production schedule, optimized with the Operational Pit design and the phase designs. The last chapter analyzes the project as a whole in light of the results obtained, followed by the relevant observations, conclusions and bibliography. In summary, this work focuses on evaluating the optimization and design of the "Pukaqaqa" copper project, based on the analysis of the runs performed in DATAMINE's strategic planning software NPVS 4, following specific criteria and parameters.
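The tonnage-grade curves the thesis compares are simple to compute: for each cut-off grade, sum the tonnage of blocks at or above the cut-off and take their tonnage-weighted average grade. The block-model columns below are assumptions, not the project's actual model.

```python
# Tonnage-grade curve from a block model (illustrative; columns and values assumed).
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
blocks = pd.DataFrame({
    "tonnes": rng.uniform(8000, 12000, 5000),
    "cu_grade": rng.lognormal(mean=-1.2, sigma=0.5, size=5000),  # % Cu
})

rows = []
for cutoff in np.arange(0.0, 1.01, 0.05):
    above = blocks[blocks["cu_grade"] >= cutoff]
    rows.append({
        "cutoff_%Cu": round(cutoff, 2),
        "tonnage_Mt": above["tonnes"].sum() / 1e6,
        "avg_grade_%Cu": (above["cu_grade"] * above["tonnes"]).sum()
                         / above["tonnes"].sum() if len(above) else float("nan"),
    })
print(pd.DataFrame(rows).head(10))
```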
8

Poncelet, Jocelyn. "Gestion des assortiments de produits dans la grande distribution." Thesis, IMT Mines Alès, 2020. http://www.theses.fr/2020EMAL0004.

Full text
Abstract:
The main objective of this thesis is to propose a recommendation system allowing retailers to improve their assortments of products distributed through numerous stores. In this context, the problem addressed is assortment planning, which consists in eliciting the best products, e.g., those with the highest turnover. To this end, we first compare the pragmatic methods commonly used in industry with the state of the art in assortment planning. This comparison highlights the problem of cross-functionality of the knowledge used today to improve the assortment. To overcome this problem, we propose knowledge structures specific to mass distribution. Thanks to these structures, an Agile assortment optimisation method that can be integrated into a continuous improvement process is formalised. This method makes it possible to integrate human expertise, which we deem essential, into the various levers currently adopted. To underline the modularity of our approach, we then propose a semantic analysis of the stores which, in addition to improving the accuracy of our simulations, allows us to define a new axis of assortment improvement. This analysis is based on our proposals both for domain ontologies specific to each brand and for semantic similarity measures. Finally, to perfect our method and go further in the exploitation of those structures, we propose a semantic analysis of the consumers, who are the final targets of the assortment. This second semantic analysis brings new knowledge to retailers and new constraints on assortments. In parallel with these scientific contributions, different applications have been developed to highlight the interoperability of our contributions with concepts specific to different types of retailers (e.g. food, DIY, etc.). These applications are presented in the manuscript within the limits of confidentiality and intellectual property.
9

Song, Xuhang. "Exam-based Education System." University of Dayton / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1399044144.

Full text
10

Silva, Fábio Corrêa da. "Utilização de datamining no comércio eletrônico como forma de criação de valor." reponame:Repositório Institucional do FGV, 2000. http://hdl.handle.net/10438/5708.

Full text
Abstract:
Companies increasingly face higher levels of competition, lower sales margins and a loss of differentiation of their products and services. At the same time, technological development, and the Internet in particular, has allowed companies to stay in close contact with their customers. The same development also allows all company data, operations and transactions to be stored, so that most companies hold large databases. This work aims to: • understand the trends in marketing and how the emergence of new channels such as the Internet can leverage a company's competitive position; • establish the large quantity of data that companies hold and the potential of the information contained in it; • show how data mining can help discover the information hidden in this tangle of data; • develop a methodology in a practical case study that seeks to determine which customers are most likely to buy and what their consumption patterns are, and how the company can make use of this information; • show how companies can benefit from the discovered information and what its economic and strategic value is.
11

Rajshiva, Anshumaan. "Mining Structured Sets of Subspaces from High Dimensional Data." University of Cincinnati / OhioLINK, 2004. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1085667702.

Full text
12

Bonino, Quintana Joaquin, and Sebastian Fagerlind. "Datamining GitHub : Examining Time-to-solve for issues in relation to group-size." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280303.

Full text
Abstract:
There exists a mountain of data on GitHub on how software development is done in reality. The data is accessible through an API but is rarely used outside of the projects it belongs to. This data could be an untapped resource and potentially give us an insight about software development in general and how to organize projects more efficiently. This report takes a sample of 676 800 repositories from GitHub and tries to find a relation or underlying function between the number of members and the time it takes them to solve issues. This report also looks at GitHub API and its viability as a tool for research. The data collection was done with two programs written in Golang and took 4 days of execution to run. Because GitHub is used for many different things by many people, some filtering of the data had to be done to remove repositories that were irrelevant or not usable for this study. The requirements included having more than 3 members, having made at least two pull requests and having actually resolved issues. After filtering out projects that didn’t fit our requirements, 1517 repositories remained. It was found that no clear relationship of underlying function could be established due to the variance being too high. However, GitHub API proved to be a valuable tool for this research and holds great potential, even though there are several limitations that have to be considered for further studies.<br>GitHub innehåller en stor mängd data om hur programutveckling går till i verkligheten. Denna data är åtkomlig med hjälp av ett API men används dock oftast inte utanför de projekt som de tillhör. Informationen kan vara en värdefull resurs som potentiellt kan säga något om hur programutveckling går till och hur projekt bör organiseras. I denna rapport studeras ett urval på 676 800 GitHub repositories i ett försök att hitta en relation mellan antalet medlemmar och den tid det tar att lösa issues. Rapporten undersöker även användbarheten av GitHub API som ett forskningsverktyg. All data samlades in med hjälp av två program som skrevs i Golang. Eftersom att GitHub används för många olika syften av en stor mängd personer behövde datan filtreras. För att ta bort de repositories som var relevanta för studien eller som inte kunde användas formulerades ett antal krav baserat på tidigare studier. Kraven som användes var att projekten behövde inkludera tre eller fler medlemmar, de behövdes ha gjorts minst två pull requests samt innehålla lösta issues över huvud taget. Insamlingen av datan tog totalt 4 dagar och efter att filtrerat ut de repositories som inte var användbara för studien kvarstod 1517 som kunde analyseras. Analysen resulterade i att ingen relation eller underliggande funktion kunde hittas eftersom att datamängden hade för hög varians. Däremot visade GitHub API stor potential och var ett värdefullt verktyg för denna studie. Dock har den vissa begränsningar som måste tas hänsyn till för vidare studier.
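The core measurement, time-to-solve per closed issue, can be sketched in a few lines against the GitHub REST API (the thesis used two Golang programs; this Python version is a minimal equivalent, and the example repository at the bottom is arbitrary).

```python
# Sketch of measuring time-to-solve for closed issues via the GitHub REST API.
from datetime import datetime

import requests

def time_to_solve_days(owner, repo, token=None):
    headers = {"Authorization": f"token {token}"} if token else {}
    url = f"https://api.github.com/repos/{owner}/{repo}/issues"
    resp = requests.get(url, params={"state": "closed", "per_page": 100},
                        headers=headers, timeout=30)
    resp.raise_for_status()
    days = []
    for issue in resp.json():
        if "pull_request" in issue:      # the issues endpoint also returns PRs
            continue
        opened = datetime.fromisoformat(issue["created_at"].rstrip("Z"))
        closed = datetime.fromisoformat(issue["closed_at"].rstrip("Z"))
        days.append((closed - opened).total_seconds() / 86400)
    return days

# durations = time_to_solve_days("golang", "go")  # rate limits apply without a token
```

At the scale of 676 800 repositories, pagination and the API's rate limits become the dominant engineering concern, which matches the report's remarks on the API's limitations.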
13

Melgueira, Pedro Miguel Lúcio. "Educational data mining applied to Moodle data from the University of Évora." Master's thesis, Universidade de Évora, 2017. http://hdl.handle.net/10174/21346.

Full text
Abstract:
E-Learning has been rising in popularity as a way to deliver training thanks to advances in technologies such as the Internet. Institutions such as universities and companies have been using E-Learning to deliver training to remote locations, extending their reach to students and employees who are physically distant. Systems called Learning Management Systems, like Moodle, exist to organize E-Learning. They provide online platforms where professors and educators can publish content, organize activities, perform evaluations, and so on, so that students can learn and be evaluated. These systems generate and store a lot of data regarding not only their usage but also the grades of students. This kind of data is often referred to as Educational Data, and Data Mining techniques are applied to it in order to make non-trivial inferences. The techniques applied take inspiration from similar projects within the field of Educational Data Mining, which consists in applying Data Mining techniques to Educational Data. In this project, a data repository from the Moodle of the University of Évora is explored. Supervised learning techniques are applied to this data to show how it is possible to predict the success of students from their usage of Moodle. Unsupervised learning techniques are also applied to show how the data divides into groups.
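As a hedged sketch of the supervised part, student success can be predicted from usage counts with a simple classifier; the feature names and data below are invented, not the actual Évora Moodle variables.

```python
# Hedged sketch of predicting student success from Moodle usage counts.
# Feature names are invented; the thesis mined the University of Évora Moodle.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n = 400
usage = np.column_stack([
    rng.poisson(30, n),   # logins
    rng.poisson(12, n),   # resources viewed
    rng.poisson(5, n),    # forum posts
])
# Synthetic pass/fail label loosely driven by the usage features.
passed = (usage @ np.array([0.02, 0.05, 0.1]) + rng.normal(0, 0.5, n) > 1.6).astype(int)

model = LogisticRegression(max_iter=1000)
print("CV accuracy:", cross_val_score(model, usage, passed, cv=5).mean().round(3))
```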
14

Daras, Gautier. "Conception et réalisation d’un outil de traitement et analyse des données spatiales pour l'aide à la décision : application au secteur de la distribution." Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAI099/document.

Full text
Abstract:
The tool conceptualized and developed during this thesis aims to: (1) take advantage of recent evolutions of Geographic Information Systems (GIS) by proposing new approaches for treating problems that have a spatial aspect; (2) apply theoretical approaches to real industrial issues in order to propose approaches for phases that are not addressed in theoretical research. With this in mind, three modules were developed: a spatial data integration and visualization module, a data pre-processing module and a coverage optimization module. The first part of the thesis addresses the implementation of the first module and proposes a conceptual framework for the development of similar tools. The integration and visualization module gives access to sales data via a dedicated web interface. The platform puts sales data in context by displaying retailers on a map and giving access to the visualization of other data (e.g. socio-demographic, competitive). The retailers displayed on the map can be filtered according to their characteristics and coloured according to multiple criteria (e.g. comparison with previous years, comparison with objectives, etc.). Selecting the elements present on the map gives access to their detailed information. Together, these functionalities allow a better understanding of the market and enable the exploration of sales results from a new angle. The second part deals with the spatial data pre-processing tool. Our approach makes spatial data analysis available to users who have no GIS knowledge. In addition, the pre-processing steps can be carried out more quickly, with guided choices in the selection of the spatial relations to take into account. A functional implementation of the approach was built on open-source tools to keep the cost of deploying our solution low. Using our implementation yields significant time savings when pre-processing spatial data for geospatial analyses. The third and final part focuses on the coverage optimization module, which builds on the structure and modules implemented previously. It takes as input the datasets corresponding to the potentials of the zones and those corresponding to the distributors and their catchment areas. From this data, the module proposes solutions to improve coverage that take into account aspects related to the catchment area of each distributor and the collaborative capture of demand.
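One standard way to frame the coverage problem the third module addresses is maximal covering: pick, at each step, the candidate whose catchment adds the most uncovered potential. The greedy sketch below is an assumption about that framing, not the thesis' actual algorithm, and the data structures are invented.

```python
# Greedy maximal-covering sketch (an assumed framing of the coverage module):
# each candidate store covers a set of zones; repeatedly pick the store that
# adds the most uncovered zone potential.
def greedy_coverage(candidates, zone_potential, k):
    """candidates: {store: set of zone ids}; zone_potential: {zone: value}."""
    covered, chosen = set(), []
    for _ in range(min(k, len(candidates))):
        remaining = [s for s in candidates if s not in chosen]
        best = max(remaining,
                   key=lambda s: sum(zone_potential[z]
                                     for z in candidates[s] - covered))
        chosen.append(best)
        covered |= candidates[best]
    return chosen, sum(zone_potential[z] for z in covered)

zones = {"z1": 120, "z2": 80, "z3": 60, "z4": 200}
stores = {"A": {"z1", "z2"}, "B": {"z2", "z3"}, "C": {"z4"}}
print(greedy_coverage(stores, zones, k=2))  # -> (['A', 'C'], 400)
```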
15

Ayouni, Sarra. "Etude et Extraction de règles graduelles floues : définition d'algorithmes efficaces." Thesis, Montpellier 2, 2012. http://www.theses.fr/2012MON20015/document.

Full text
Abstract:
Knowledge discovery in databases is a process aiming at extracting a reduced set of valuable knowledge from a huge amount of data. Data mining, one step of this process, includes a number of tasks, such as clustering, classification, association rule mining, etc. The problem of mining association rules requires a frequent pattern extraction step. We distinguish several categories of frequent patterns: classical patterns, fuzzy patterns, gradual patterns, sequential patterns, etc. All these patterns differ in the type of data from which the extraction is done and the type of relationship they represent. In this thesis, we contribute methods for extracting fuzzy, gradual and closed patterns. Indeed, we define new closure systems of the Galois connection for fuzzy and gradual patterns, respectively, and propose algorithms for extracting a reduced set of fuzzy and gradual patterns. We also propose two approaches for automatically defining the fuzzy modalities that allow relevant fuzzy gradual patterns to be obtained. Based on closed fuzzy and closed gradual patterns, we define generic bases of all fuzzy and gradual association rules, and we propose a complete and valid inference system to derive all rules from these bases.
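A gradual pattern such as "the higher the age, the higher the salary" is often evaluated by counting object pairs that are ordered consistently on every attribute. The sketch below uses that pair-based support, one of several definitions in the gradual pattern literature; it does not implement the closure systems the thesis introduces.

```python
# Pair-based support of a gradual pattern (one common definition; the thesis'
# closed fuzzy/gradual patterns are not implemented here).
from itertools import combinations

def gradual_support(rows, pattern):
    """rows: list of dicts; pattern: [(attribute, '+' or '-'), ...]."""
    def ordered(a, b):
        return all(a[attr] < b[attr] if sign == "+" else a[attr] > b[attr]
                   for attr, sign in pattern)
    pairs = list(combinations(rows, 2))
    hits = sum(1 for a, b in pairs if ordered(a, b) or ordered(b, a))
    return hits / len(pairs)

data = [{"age": 25, "salary": 1200}, {"age": 30, "salary": 1500},
        {"age": 40, "salary": 1400}, {"age": 50, "salary": 2500}]
print(gradual_support(data, [("age", "+"), ("salary", "+")]))  # 5/6 ≈ 0.83
```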
16

Araújo, Thessika Hialla Almeida. "Desenvolvimento de um banco de dados (HTLV-1 molecular epidemiology databases) para dataming e data management de sequências do HTLV-1." reponame:Repositório Institucional da FIOCRUZ, 2012. https://www.arca.fiocruz.br/handle/icict/4315.

Full text
Abstract:
Fundação Oswaldo Cruz, Centro de Pesquisas Gonçalo Moniz, Salvador, Bahia, Brasil. Scientific development has generated a large amount of data that should be stored and managed in order for researchers to have access to complete data sets. The information generated from research on HTLV-1 warrants the design of databases to aggregate data covering a range of epidemiological aspects. Such a database would support further research on HTLV-1 viral infection, pathogenesis, origins, and evolutionary dynamics. All data were obtained from publications available in GenBank or through contact with the authors. The database was developed using Apache Webserver 2.1.6 and the MySQL DBMS. The webpage interfaces were developed in HTML, with server-side scripting written in PHP. There are currently 2,435 registered sequences, with 1,968 (80.8%) of those sequences representing different isolates. Of these sequences, 40.49% carry information on clinical status (TSP/HAM, 43%; ATLL, 18.69%; asymptomatic, 32.7%; other diseases, 5.61%). Further, 15.4% of sequences contain information on patient gender, while 10.56% provide the age of the patient. The HTLV-1 Molecular Epidemiology Database is hosted on the Gonçalo Moniz/FIOCRUZ-BA research center server and accessible at http://htlv1db.bahia.fiocruz.br/. It is a repository of HTLV-1 genetic sequences with clinical, epidemiological, and geographical information, and will support clinical research and vaccine development related to viral genotypes.
17

Pospíšil, Jan. "Modul víceúrovňových asociačních pravidel systému pro dolování z dat." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2010. http://www.nusl.cz/ntk/nusl-237192.

Full text
Abstract:
This thesis focuses on implementing a multilevel association rule mining module for an existing data mining project. Two main algorithms are explained, Apriori and MLT2L1. The thesis continues with the implementation of the data mining module, as well as the design of the DMSL elements. The final chapters deal with an example data mining task and a comparison of its results, as well as a description of what the thesis achieved.
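For readers unfamiliar with the first algorithm, a compact level-wise Apriori over transaction sets looks like the sketch below. It is single-level only; the multilevel MLT2L1 variant additionally walks an item taxonomy, which is not shown.

```python
# Minimal single-level Apriori: level-wise candidate generation with the
# classic join and prune steps (the multilevel MLT2L1 extension is omitted).
from itertools import combinations

def apriori(transactions, min_support):
    n = len(transactions)
    items = {i for t in transactions for i in t}
    candidates = [frozenset([i]) for i in sorted(items)]
    frequent = {}
    while candidates:
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-candidates.
        keys = list(level)
        candidates = {a | b for a, b in combinations(keys, 2) if len(a | b) == len(a) + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        candidates = [c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, len(c) - 1))]
    return frequent

tx = [{"milk", "bread"}, {"milk", "beer"}, {"milk", "bread", "beer"}, {"bread"}]
print(apriori(tx, min_support=0.5))
```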
18

Olofsson, Niklas. "Implementation of the Apriori algorithm for effective item set mining in VigiBase™ : Project report in Teknisk Fysik 15 hp." Thesis, Uppsala University, Department of Engineering Sciences, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-129688.

Full text
19

Vojíř, Stanislav. "Mapování PMML a BKEF dokumentů v projektu SEWEBAR-CMS." Master's thesis, Vysoká škola ekonomická v Praze, 2010. http://www.nusl.cz/ntk/nusl-75744.

Full text
Abstract:
In the data mining process, it is necessary to prepare the source dataset, for example by choosing how continuous attributes are cut or grouped, drawing on knowledge of the problem area. Such a preparation process can be guided by background (domain) knowledge obtained from experts. In the SEWEBAR project, we collect knowledge from experts in a rich XML-based representation language called BKEF, using a dedicated editor, and save it into the database of our custom-tailored (Joomla!-based) CMS system. Data mining tools are then able to generate, from this dataset, mining models represented in the standardized PMML format. It is then necessary to map a particular column (attribute) from the dataset (in PMML) to a relevant "metaattribute" of the BKEF representation. This specific type of schema mapping problem is addressed in my thesis through algorithms that automatically suggest mappings of columns to metaattributes and of column values to BKEF "metafields". Manual corrections of this mapping by the user are also supported. The implementation is written in PHP and was tested on datasets with information about courses taught at 5 universities in the U.S.A., from the Illinois Semantic Integration Archive. On these datasets, the auto-mapping suggestion process achieved a precision of about 70% and recall of about 77% on unknown columns; when mapping previously user-mapped data (using the implemented learning module), recall was between 90% and 100%.
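The core of such a suggestion step, scoring how well a column name matches each metaattribute, can be approximated with plain string similarity. This toy sketch is only an approximation: the thesis' scoring also matches on column values and learns from earlier user-confirmed mappings, neither of which is shown.

```python
# Toy version of suggesting column -> metaattribute mappings by name similarity
# (value-based matching and the learning module of the thesis are omitted).
from difflib import SequenceMatcher

def suggest_mapping(columns, metaattributes):
    suggestions = {}
    for col in columns:
        scored = [(SequenceMatcher(None, col.lower(), m.lower()).ratio(), m)
                  for m in metaattributes]
        score, best = max(scored)
        suggestions[col] = (best, round(score, 2))
    return suggestions

print(suggest_mapping(["CourseTitle", "instructor_name"],
                      ["title", "instructor", "credits"]))
```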
20

Tapia, Rivas Iván Gildo. "Una Metodología para sectorizar pacientes en el consumo de medicamentos aplicando Datamart y Datamining en un hospital nacional." Bachelor's thesis, Universidad Nacional Mayor de San Marcos, 2006. https://hdl.handle.net/20.500.12672/273.

Full text
Abstract:
Data mining is the search for interesting patterns and important regularities in large databases. Intelligent data mining uses machine learning methods to discover and enumerate patterns present in the data. One way to describe the attributes of an entity in a database is to use segmentation or classification algorithms. The present work proposes a method of data analysis to evaluate how medicines are consumed in a Peruvian hospital, to identify realities or characteristics that are not directly observable and that could produce shortages or patient dissatisfaction, and to serve as a tool for decision making about the supply of medicines in the hospital. In this investigation, techniques for the extraction, transformation and loading of data are used to build a datamart, followed by a data mining algorithm suited to the type of information it contains.
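As a hedged sketch of the final step, a segmentation algorithm such as k-means can group patients by their medication-consumption profile once ETL has produced a patient-by-medication matrix; the matrix below is synthetic, standing in for the hospital datamart.

```python
# Illustrative k-means segmentation of patients by medication consumption
# (the thesis builds the real matrix from a hospital datamart via ETL).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
# rows = patients, columns = monthly units consumed of each medication
consumption = rng.poisson(lam=[2, 8, 1, 5], size=(300, 4)).astype(float)

X = StandardScaler().fit_transform(consumption)
labels = KMeans(n_clusters=3, n_init=10, random_state=7).fit_predict(X)
for k in range(3):
    print(f"segment {k}: {np.sum(labels == k)} patients, "
          f"mean profile {consumption[labels == k].mean(axis=0).round(1)}")
```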
21

Tapia, Rivas Iván Gildo. "Una Metodología para sectorizar pacientes en el consumo de medicamentos aplicando Datamart y Datamining en un hospital nacional." Bachelor's thesis, Universidad Nacional Mayor de San Marcos, 2006. http://cybertesis.unmsm.edu.pe/handle/cybertesis/273.

Full text
Abstract:
Data mining is the search for interesting patterns and important regularities in large databases. Intelligent data mining uses machine learning methods to discover and enumerate patterns present in the data. One way to describe the attributes of an entity in a database is to use segmentation or classification algorithms. The present work proposes a method of data analysis to evaluate how medicines are consumed in a Peruvian hospital, to identify realities or characteristics that are not directly observable and that could produce shortages or patient dissatisfaction, and to serve as a tool for decision making about the supply of medicines in the hospital. In this investigation, techniques for the extraction, transformation and loading of data are used to build a datamart, followed by a data mining algorithm suited to the type of information it contains.
22

Genua, Eleonora <1995>. "Big Data e Customer Centricity: il CRM a supporto delle strategie di business tramite tecniche di datamining e clusterizzazione." Master's Degree Thesis, Università Ca' Foscari Venezia, 2021. http://hdl.handle.net/10579/18774.

Full text
Abstract:
The evolution of the retail world and the constant advances of technology have led companies to define different channel and business strategies. The model to aspire to is omnichannel, which guarantees the consumer a fluid, frictionless experience across the different touch points offered by the company, online and offline. This strategy requires careful collection of data from all points of contact, through different, integrated technologies, with the goal of analyzing customer behaviour, understanding their needs and defining a contact plan. CRM and data analytics make it possible to collect and analyze information so as to increase the probability of engaging customers through targeted actions defined according to their interests and consumption behaviour. Given these considerations, the first part of the thesis focuses on the evolution of the retail world, with a particular focus on fashion, and then dwells on CRM as a tool supporting business decisions, on the role of Big Data and machine learning techniques, and in particular on clustering. The thesis ends with the business case and the cluster analysis of the brand's loyalty-programme customers, carried out with the KNIME analytics platform.
23

Merlo, Riccardo. "Ottimizzazione di algoritmi per l'Outlier Detection." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2017.

Find full text
Abstract:
We started from a known algorithm for outlier detection (the solving set), to which algorithmic optimizations were applied. The testing phase revealed significant discrepancies between the speedups in running time and those in the number of distance computations. Additional performance analyses were therefore carried out with tools called profilers, to identify the bottlenecks and the overheads generated by the algorithm. These analyses revealed overhead that is not easily eliminated and that mitigates the theoretical time savings obtainable from algorithmically sound optimizations.
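The solving set itself is beyond a short sketch, but the distance-based definition it accelerates is simple: score each point by its average distance to its k nearest neighbours and take the top n scores. A brute-force version, like the one below, is the baseline against which such optimizations are measured.

```python
# Brute-force distance-based top-n outliers (the baseline that solving-set
# style algorithms accelerate).
import numpy as np

def top_n_outliers(X, k, n):
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))          # full pairwise distance matrix
    np.fill_diagonal(dist, np.inf)               # ignore self-distance
    knn = np.sort(dist, axis=1)[:, :k]           # k nearest-neighbour distances
    scores = knn.mean(axis=1)                    # outlier weight of each point
    return np.argsort(scores)[::-1][:n], scores

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (200, 2)), [[8.0, 8.0], [9.0, -7.0]]])
idx, scores = top_n_outliers(X, k=5, n=2)
print("top outliers:", idx, "scores:", scores[idx].round(2))
```

The full pairwise distance matrix is exactly the quadratic cost that motivates the solving-set approach and the profiling work described above.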
24

Svensson, Christine, and Susanne Strandberg. "Reporting Management för den interna rapporterings processen med hjälp av verktyget Tivoli Decision Support : TDS." Thesis, Blekinge Tekniska Högskola, Institutionen för telekommunikation och signalbehandling, 2001. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-2772.

Full text
Abstract:
The report begins with a description of WM-data's network management structure and its reporting management needs. This is followed by a description of the two analysis techniques, data mining and On-Line Analytical Processing (OLAP), which are the most widely used database-based techniques. The Tivoli Decision Support (TDS) tool is a support system intended to assist decision makers within the organization. TDS is based on the OLAP technique, and the report concludes by showing the possibilities the tool offers for WM-data's reporting management.
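OLAP tools such as TDS aggregate a measure across dimensions; a pandas pivot conveys the flavour of that slicing on invented reporting data (TDS itself is not involved here).

```python
# OLAP-style aggregation in pandas, to convey what TDS-type reporting does
# (columns and figures are invented).
import pandas as pd

events = pd.DataFrame({
    "month":    ["Jan", "Jan", "Feb", "Feb", "Feb"],
    "service":  ["network", "backup", "network", "backup", "network"],
    "incidents": [12, 3, 7, 5, 9],
})
cube = pd.pivot_table(events, values="incidents", index="service",
                      columns="month", aggfunc="sum", fill_value=0)
print(cube)
```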
25

Jacques, Julie. "Classification sur données médicales à l'aide de méthodes d'optimisation et de datamining, appliquée au pré-screening dans les essais cliniques." PhD thesis, Université des Sciences et Technologie de Lille - Lille I, 2013. http://tel.archives-ouvertes.fr/tel-00919876.

Full text
Abstract:
Medical data suffer from problems of standardization and uncertainty, which make them difficult to use directly in medical software, in particular for recruitment into clinical trials. In this thesis, we propose an approach that compensates for the poor quality of these data using supervised classification methods. We are particularly interested in 3 characteristics of these data: imbalance, uncertainty and volume. We propose the MOCA-I algorithm, which addresses this combinatorial problem of partial classification on imbalanced data as a multi-objective local search problem. After confirming the benefits of multi-objective modelling in this context, we tune MOCA-I and compare it to the best classification algorithms in the literature, on real, imbalanced datasets from the literature. The rule sets obtained by MOCA-I are statistically better than those in the literature and 2 to 6 times more compact. For data without imbalance, we propose the MOCA algorithm, statistically equivalent to the algorithms in the literature. We then analyze the impact of imbalance on the behaviour of MOCA and MOCA-I, theoretically and experimentally. Next, we propose and evaluate different methods for handling the many Pareto solutions generated by MOCA-I, to assist the user in choosing the final solution and to reduce overfitting. Finally, we show how this work can be integrated into a software solution.
26

Malca, Pérez Fidel. "Propuesta de un datamining como solución de inteligencia de negocios, para facilitar la acreditación académica en la EAPIS de la UNMSM." Bachelor's thesis, Universidad Nacional Mayor de San Marcos, 2013. https://hdl.handle.net/20.500.12672/12440.

Full text
Abstract:
In the current context of globalization and internationalization, quality standards are present in every area of human activity; such standards guarantee the quality and continuity of a product or service. Education is no exception to the requirements of quality and continuity that students demand, which gives rise to academic accreditation. Accreditation is the formal recognition of the quality demonstrated by an institution or educational program, granted through the corresponding accrediting body according to the external evaluation report issued by a duly authorized evaluating entity, in accordance with current regulations. Universities, as institutions of higher education, are first-order bodies in the preparation of men and women capable of contributing significantly to the social, economic and political development of society. It falls to them, through research and experimentation, to open paths for change and the transformation of society. Data mining, as a business intelligence solution, will provide the data and indicators needed to achieve academic accreditation successfully.
27

Korger, Christina. "Clustering of Distributed Word Representations and its Applicability for Enterprise Search." Master's thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2016. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-208869.

Full text
Abstract:
Machine learning of distributed word representations with neural embeddings is a state-of-the-art approach to modelling semantic relationships hidden in natural language. The thesis "Clustering of Distributed Word Representations and its Applicability for Enterprise Search" covers different aspects of how such a model can be applied to knowledge management in enterprises. A review of distributed word representations and related language modelling techniques, combined with an overview of applicable clustering algorithms, constitutes the basis for practical studies. The latter have two goals: firstly, they examine the quality of German embedding models trained with gensim and a selected choice of parameter configurations. Secondly, clusterings conducted on the resulting word representations are evaluated against the objective of retrieving immediate semantic relations for a given term. The application of the final results to company-wide knowledge management is subsequently outlined by the example of the platform intergator and conceptual extensions.
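A minimal version of the pipeline the thesis studies (train word embeddings with gensim, then cluster the vectors) might look like the sketch below. The corpus is a tiny toy; the thesis trained on real German text, and production models need far larger corpora and tuned parameters.

```python
# Toy pipeline: gensim word2vec embeddings clustered with k-means.
# Real enterprise models need a large corpus and careful parameter choices.
from gensim.models import Word2Vec
from sklearn.cluster import KMeans

corpus = [["invoice", "payment", "customer"],
          ["server", "backup", "network"],
          ["payment", "invoice", "billing"],
          ["network", "server", "firewall"]] * 50

model = Word2Vec(sentences=corpus, vector_size=50, window=3,
                 min_count=1, workers=1, seed=1)
words = model.wv.index_to_key
labels = KMeans(n_clusters=2, n_init=10, random_state=1).fit_predict(model.wv[words])
for cluster in set(labels):
    print(cluster, [w for w, c in zip(words, labels) if c == cluster])
```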
28

Stega, Alessandro Pio. "Sviluppo, implementazione e ottimizzazione di algoritmi per la scoperta di “outlier”." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019.

Find full text
Abstract:
This work focuses on algorithms for outlier detection. In particular, it studies the algorithm for computing the solving set, which makes it possible to find the top n outliers of a dataset and to provide a subset of the dataset, called the solving set, used to predict whether a new object should be classified as an outlier. The solving set therefore represents a model of the data. As new data arrive, the model generally degrades over time, so a heuristic was devised to identify the moment at which the model becomes invalid, along with the possible remedies. A heuristic was also proposed to estimate the parameter n, the number of top outliers to be identified. Finally, a Python library implementing this algorithm was developed.
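The prediction step mentioned here can be sketched roughly as: compute the new object's k-NN score against the solving set only and flag it if the score exceeds the weight of the current n-th outlier. This is a simplified reading of the solving-set literature, not the library's actual code, and the threshold choice below is an assumption.

```python
# Rough sketch of outlier prediction with a solving set: the new object's
# k-NN score is computed against the solving set only and compared with the
# weight of the n-th ranked outlier (details are simplified assumptions).
import numpy as np

def predict_outlier(new_point, solving_set, k, threshold):
    dist = np.sqrt(((solving_set - new_point) ** 2).sum(axis=1))
    score = np.sort(dist)[:k].mean()     # approximate k-NN weight of the point
    return score >= threshold

rng = np.random.default_rng(0)
solving_set = rng.normal(0, 1, (40, 2))  # in practice, output of the algorithm
threshold = 2.5                          # assumed weight of the n-th top outlier
print(predict_outlier(np.array([6.0, 6.0]), solving_set, k=5, threshold=threshold))
print(predict_outlier(np.array([0.1, -0.2]), solving_set, k=5, threshold=threshold))
```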
29

Bergami, Giacomo. "Hypergraph Mining for Social Networks." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2014. http://amslaurea.unibo.it/7106/.

Full text
Abstract:
Nowadays more and more data is collected, in amounts so large that the need to study it both efficiently and profitably is arising: the aim is to obtain new and significant information that was not known before the analysis. Many graph mining algorithms have been developed, but an algebra that could systematically define how to generalize such operations is missing. To propel the development of such an algebra, we propose, for the first time to the best of our knowledge, some primitive operators that may be the prelude to the systematic definition of a hypergraph algebra.
30

Mafra, Denis Teixeira. "Proposição de uma metodologia para o desenvolvimento de uma arquitetura de informações como uma etapa de implantação de um datamining de suporte à tomada de decisão gerencial." Florianópolis, SC, 2005. http://repositorio.ufsc.br/handle/123456789/102907.

Full text
Abstract:
Master's dissertation, Universidade Federal de Santa Catarina, Centro Sócio-Econômico, Graduate Program in Administration. Information Technology (IT) emerges as an inevitable change for organizations that intend to remain in the market. The use of this tool gives organizations fast access to the information needed to monitor their performance efficiently. The central problem of the study is: how to guide the development of an information architecture as a stage in the implementation of data mining to support managerial decision making? The following general objective was therefore set: to propose a methodology for developing an information architecture as a stage in the implementation of data mining to support managerial decision making for the planning and performance measurement system (Balanced Scorecard) of SENAI/SC, with specific objectives set to achieve it. As for the characterization of the research, the approach to the problem was qualitative, the temporal dimension was cross-sectional, and the research was bibliographic and a case study; based on its general objectives, it was exploratory and descriptive. The object of study was SENAI/SC. Several management tools were proposed and presented, among them the object of the study, the planning and performance measurement system Balanced Scorecard (BSC). For the BSC, the following were presented: the strategy map, the perspectives, the strategic objectives, and the balanced performance panel (indicators and strategic initiatives). Concerning the central objective of the work, the steps needed to develop an information architecture were presented: 1) mapping of processes; 2) implementation of a management information system (ERP); 3) mapping and description of the organization's software; 4) identification of the objective and calculation formula of each indicator; 5) mapping and description of all data required to measure the indicators; 6) identification of the sources, those responsible, and the deadlines for each data item; and 7) verification of possible interrelations of the same data item with more than one indicator. Each of these steps was applied to the SENAI/SC BSC, and the main information for developing the information architecture was collected. Based on the architecture, the construction of datamarts of BSC indicators was also demonstrated; these will make up the data warehouse needed for data mining to filter the information and generate managerial information for decision making.
APA, Harvard, Vancouver, ISO, and other styles
31

Cabillic, Marine. "Caractérisation de l'organisation et du trafic de paires récepteur/anticorps thérapeutiques par microscopie de localisation de molécules uniques couplée au criblage à haut débit." Thesis, Bordeaux, 2021. http://www.theses.fr/2021BORD0026.

Full text
Abstract:
Immuno-oncology is a young and growing field at the frontier of cancer therapy. Immuno-oncology therapies aim to stimulate the body's immune system to target and attack the tumor through therapeutic antibodies, by binding to surface receptors of T-cells (lymphocytes playing a central role in the immune response) and modifying their intracellular signaling. Understanding how the spatial organization of receptors and signaling proteins is regulated, and how it determines lymphocyte activation and cell fate decisions, has become a 'holy grail' of cellular immunology. To achieve this goal, a better comprehension of antibody functions and subcellular trafficking is required to explain the differential efficacies of therapeutic candidates targeting receptors of interest. Quantitative super-resolution microscopy provides access to the nanoscale organization of membrane receptors playing a physiological role. It offers a new investigation tool for antibody optimization as well as for maximizing their functional efficacy. In combination with high-throughput screening techniques, it has the potential to play a crucial role in the early phases of projects in which it is necessary to select the best antibodies from banks that may contain several hundred of them. The goal of this PhD thesis was to functionally characterize the organization and trafficking of receptor/antibody pairs by quantitative single-molecule localization microscopy (SMLM) combined with high-content screening (HCS). In this context, we developed and used an HCS-SMLM platform to characterize multiple antibodies targeting T-cell membrane receptors, gathering unprecedented quantitative insight into potential therapeutic candidates. We also optimized the single-objective light-sheet microscope (soSPIM) to permit 3D mapping of membrane receptors across an entire T-cell with single-molecule resolution. It allows 3D nanoscale imaging of T-cells in more physiological conditions and provides complementary information compared to large-scale single-molecule screening experiments. Altogether, these developments improved our comprehension of antibody modes of action on receptors at the single-cell level. The large-scale experiments performed during this work required the development of several software tools for automating the acquisition and statistically analyzing the Terabytes of single-molecule data generated. This project focused on targeting PD-1, a control point of the immune system involved in modulating immune cell activation. The first part of the thesis was mainly devoted to implementing new protocols for super-resolution imaging of PD-1 receptors on activated Jurkat cells. In the second part, we further investigated the impact of known anti-PD-1 therapeutic antibodies used in clinics on the nanoscale spatial organization and dynamics of PD-1 receptors in living cells using our HCS-SMLM platform. This work provides a proof of concept of the capacity of these cutting-edge imaging techniques to quantitatively characterize different therapeutic monoclonal antibodies targeting PD-1 on the T-cell membrane.
APA, Harvard, Vancouver, ISO, and other styles
32

Alam, Mohammad Tanveer. "Image Classification for Remote Sensing Using Data-Mining Techniques." Youngstown State University / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1313003161.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Mader, Pavel. "Dolovací moduly systému pro dolování z dat v prostředí Oracle." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2009. http://www.nusl.cz/ntk/nusl-236690.

Full text
Abstract:
This master's thesis deals with questions of data mining and with the extension of a data mining system developed at FIT in the Oracle environment. So far, this system could not be applied in real-life conditions because no data mining modules were available. The design of the system's core application includes an interface allowing the addition of mining modules. Until now, this interface had been tested only on a sample mining module; that module did not perform any real mining activity and merely demonstrated the use of the interface. The main focus of this thesis is the study of this interface and the implementation of a functional mining module that tests the applicability of the implemented interface. An association rule mining module was selected for implementation.
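For readers unfamiliar with the technique, a minimal, self-contained sketch of the Apriori-style mining behind association rules (illustrative only; the transactions and thresholds are invented, and this is not the module's actual code, which works against the Oracle database):

from itertools import combinations

# Toy transactions; a real module would read them from the database.
transactions = [{"bread", "milk"}, {"bread", "beer"},
                {"bread", "milk", "beer"}, {"milk"}]
min_support, min_conf = 0.5, 0.7

def support(itemset):
    return sum(itemset <= t for t in transactions) / len(transactions)

items = {i for t in transactions for i in t}
frequent = [frozenset(c) for n in (1, 2)
            for c in combinations(sorted(items), n) if support(set(c)) >= min_support]

# Derive rules X -> Y from frequent itemsets: conf = supp(X u Y) / supp(X)
for itemset in (f for f in frequent if len(f) > 1):
    for x in itemset:
        lhs, rhs = itemset - {x}, frozenset({x})
        conf = support(itemset) / support(lhs)
        if conf >= min_conf:
            print(f"{set(lhs)} -> {set(rhs)} (conf={conf:.2f})")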
APA, Harvard, Vancouver, ISO, and other styles
34

Novák, Ondřej. "Databázová nezávislost jádra systému pro dolování z dat FIT-Miner." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2013. http://www.nusl.cz/ntk/nusl-236162.

Full text
Abstract:
The data mining system FIT-Miner currently depends on a single specific DBMS. This master's thesis analyzes the parts of the implementation that work with the database, including the modules and functions for data mining. It then presents the set of changes that will allow FIT-Miner to work with other DBMSs and, finally, describes the implementation of these changes.
APA, Harvard, Vancouver, ISO, and other styles
35

Lanc, Martin. "Systém pro testování obchodní strategie." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2008. http://www.nusl.cz/ntk/nusl-235884.

Full text
Abstract:
The aim of this thesis is to introduce the questions surrounding trading stocks on global stock exchanges. It presents the basic ideas necessary to understand stock trading, building a business strategy, and automating it with simple information technology techniques. It then describes the concept and implementation of a system for testing a trading strategy based on the analysis of historical market data. The next part of this work focuses on a demonstration of the system and its possibilities for expansion. The whole application is built with the scripting languages PHP and JavaScript, the markup language HTML, and the MySQL database system.
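As a hedged illustration of the core idea, testing a rule against historical prices, here is a minimal moving-average crossover backtest sketch (in Python rather than the thesis's PHP; the prices, rule, and parameters are all invented):

# Toy moving-average crossover backtest; all data and parameters invented.
prices = [10, 11, 12, 11, 13, 14, 13, 15, 16, 15, 17, 18]

def sma(series, n, i):
    """Simple moving average of the last n values ending at index i."""
    return sum(series[i - n + 1:i + 1]) / n

cash, shares = 100.0, 0
for i in range(4, len(prices)):
    fast, slow = sma(prices, 2, i), sma(prices, 5, i)
    if fast > slow and shares == 0:          # fast average crosses above: buy
        shares, cash = cash / prices[i], 0.0
    elif fast < slow and shares > 0:         # fast average crosses below: sell
        cash, shares = shares * prices[i], 0

final = cash + shares * prices[-1]
print(f"final equity: {final:.2f}")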
APA, Harvard, Vancouver, ISO, and other styles
36

Kolář, Roman. "Klasifikace webových stránek." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2008. http://www.nusl.cz/ntk/nusl-235991.

Full text
Abstract:
This paper presents the problem of automatic web page classification using a classifier based on association rules. The classification problem is presented as one of the datamining techniques, in the context of mining knowledge from text data. Several text document classification methods are presented, highlighting the benefits of classification methods that use association rules. The main goal of the work is to adapt a selected classification method to relational data and to design a draft of a web page classifier that classifies pages with the aid of visual properties - the layout of independent sections on the web page - rather than (only) by textual data. The ARC-BC classification method is presented as the selected method and as one of the more interesting classifiers, inheriting the accuracy and understandability benefits of the other methods.
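To make the associative-classification idea concrete, a small hedged sketch (not the ARC-BC implementation; the rules and page features are invented) that scores a page against per-class rule sets:

# Tiny rule-based classifier in the spirit of associative classification.
# Each rule: (set of features, class label, confidence). All values invented.
rules = [
    ({"price", "cart"}, "e-shop", 0.9),
    ({"abstract", "references"}, "paper", 0.85),
    ({"login", "cart"}, "e-shop", 0.7),
]

def classify(features):
    scores = {}
    for antecedent, label, conf in rules:
        if antecedent <= features:               # rule fires on this page
            scores[label] = scores.get(label, 0.0) + conf
    return max(scores, key=scores.get) if scores else "unknown"

# Features could just as well encode visual layout blocks, as in the thesis.
page = {"price", "cart", "login", "banner"}
print(classify(page))                            # -> 'e-shop'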
APA, Harvard, Vancouver, ISO, and other styles
37

Hopkins, Ashley R. "Privacy Within Photo-Sharing and Gaming Applications: Motivation and Opportunity and the Decision to Download." Ohio University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1556821782704244.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Middlebrooks, Sam E. "The COMPASS Paradigm For The Systematic Evaluation Of U.S. Army Command And Control Systems Using Neural Network And Discrete Event Computer Simulation." Diss., Virginia Tech, 2003. http://hdl.handle.net/10919/26605.

Full text
Abstract:
In today's technology based society the rapid proliferation of new machines and systems that would have been undreamed of only a few short years ago has become a way of life. Developments and advances especially in the areas of digital electronics and micro-circuitry have spawned subsequent technology based improvements in transportation, communications, entertainment, automation, the armed forces, and many other areas that would not have been possible otherwise. This rapid "explosion" of new capabilities and ways of performing tasks has been motivated as often as not by the philosophy that if it is possible to make something better or work faster or be more cost effective or operate over greater distances then it must inherently be good for the human operator. Taken further, these improvements typically are envisioned to consequently produce a more efficient operating system where the human operator is an integral component. The formal concept of human-system interface design has only emerged this century as a recognized academic discipline, however, the practice of developing ideas and concepts for systems containing human operators has been in existence since humans started experiencing cognitive thought. An example of a human system interface technology for communication and dissemination of written information that has evolved over centuries of trial and error development, is the book. It is no accident that the form and shape of the book of today is as it is. This is because it is a shape and form readily usable by human physiology whose optimal configuration was determined by centuries of effort and revision. This slow evolution was mirrored by a rate of technical evolution in printing and elsewhere that allowed new advances to be experimented with as part of the overall use requirement and need for the existence of the printed word and some way to contain it. Today, however, technology is advancing at such a rapid rate that evolutionary use requirements have no chance to develop along side the fast pace of technical progress. One result of this recognition is the establishment of disciplines like human factors engineering that have stated purposes and goals of systematic determination of good and bad human system interface designs. However, other results of this phenomenon are systems that get developed and placed into public use simply because new technology allowed them to be made. This development can proceed without a full appreciation of how the system might be used and, perhaps even more significantly, what impact the use of this new system might have on the operator within it. The U.S. Army has a term for this type of activity. It is called "stove-piped development". The implication of this term is that a system gets developed in isolation where the developers are only looking "up" and not "around". They are thus concerned only with how this system may work or be used for its own singular purposes as opposed to how it might be used in the larger community of existing systems and interfaces or, even more importantly, in the larger community of other new systems in concurrent development. Some of the impacts for the Army from this mode of system development are communication systems that work exactly as designed but are unable to interface to other communications systems in other domains for battlefield wide communications capabilities. Having communications systems that cannot communicate with each other is a distinct problem in its own right.
However, when developments in one industry produce products that humans use or attempt to use with products from totally separate developments or industries, the Army concept of product development resulting from stove-piped design visions can have significant implication on the operation of each system and the human operator attempting to use it. There are many examples that would illustrate the above concept, however, one that will be explored here is the Army effort to study, understand, and optimize its command and control (C2) operations. This effort is at the heart of a change in the operational paradigm in C2 Tactical Operations Centers (TOCs) that the Army is now undergoing. For the 50 years since World War II the nature, organization, and mode of the operation of command organizations within the Army has remained virtually unchanged. Staffs have been organized on a basic four section structure and TOCs generally only operate in a totally static mode with the amount of time required to move them to keep up with a mobile battlefield going up almost exponentially from lower to higher command levels. However, current initiatives are changing all that and while new vehicles and hardware systems address individual components of the command structures to improve their operations, these initiatives do not necessarily provide the environment in which the human operator component of the overall system can function in a more effective manner. This dissertation examines C2 from a system level viewpoint using a new paradigm for systematically examining the way TOCs operate and then translating those observations into validated computer simulations using a methodological framework. This paradigm is called COmputer Modeling Paradigm And Simulation of Systems (COMPASS). COMPASS provides the ability to model TOC operations in a way that not only includes the individuals, work groups and teams in it, but also all of the other hardware and software systems and subsystems and human-system interfaces that comprise it as well as the facilities and environmental conditions that surround it. Most of the current literature and research in this area focuses on the concept of C2 itself and its follow-on activities of command, control, communications (C3), command, control, communications, and computers (C4), and command, control, communications, computers and intelligence (C4I). This focus tends to address the activities involved with the human processes within the overall system such as individual and team performance and the commanderâ s decision-making process. While the literature acknowledges the existence of the command and control system (C2S), little effort has been expended to quantify and analyze C2Ss from a systemic viewpoint. A C2S is defined as the facilities, equipment, communications, procedures, and personnel necessary to support the commander (i.e., the primary decision maker within the system) for conducting the activities of planning, directing, and controlling the battlefield within the sector of operations applicable to the system. The research in this dissertation is in two phases. The overall project incorporates sequential experimentation procedures that build on successive TOC observation events to generate an evolving data store that supports the two phases of the project. Phase I consists of the observation of heavy maneuver battalion and brigade TOCs during peacetime exercises. 
The term "heavy maneuver" is used to connotate main battle forces such as armored and mechanized infantry units supported by artillery, air defense, close air, engineer, and other so called combat support elements. This type of unit comprises the main battle forces on the battlefield. It is used to refer to what is called the conventional force structure. These observations are conducted using naturalistic observation techniques of the visible functioning of activities within the TOC and are augmented by automatic data collection of such things as analog and digital message traffic, combat reports generated by the computer simulations supporting the wargame exercise, and video and audio recordings where appropriate and available. Visible activities within the TOC include primarily the human operator functions such as message handling activities, decision-making processes and timing, coordination activities, and span of control over the battlefield. They also include environmental conditions, functional status of computer and communications systems, and levels of message traffic flows. These observations are further augmented by observer estimations of such indicators as perceived level of stress, excitement, and level of attention to the mission of the TOC personnel. In other words, every visible and available component of the C2S within the TOC is recorded for analysis. No a priori attempt is made to evaluate the potential significance of each of the activities as their contribution may be so subtle as to only be ascertainable through statistical analysis. Each of these performance activities becomes an independent variable (IV) within the data that is compared against dependent variables (DV) identified according to the mission functions of the TOC. The DVs for the C2S are performance measures that are critical combat tasks performed by the system. Examples of critical combat tasks are "attacking to seize an objective", "seizure of key terrain", and "river crossings". A list of expected critical combat tasks has been prepared from the literature and subject matter expert (SME) input. After the exercise is over, the success of these critical tasks attempted by the C2S during the wargame are established through evaluator assessments, if available, and/or TOC staff self analysis and reporting as presented during after action reviews. The second part of Phase I includes datamining procedures, including neural networks, used in a constrained format to analyze the data. The term constrained means that the identification of the outputs/DV is known. The process was to identify those IV that significantly contribute to the constrained DV. A neural network is then constructed where each IV forms an input node and each DV forms an output node. One layer of hidden nodes is used to complete the network. The number of hidden nodes and layers is determined through iterative analysis of the network. The completed network is then trained to replicate the output conditions through iterative epoch executions. The network is then pruned to remove input nodes that do not contribute significantly to the output condition. Once the neural network tree is pruned through iterative executions of the neural network, the resulting branches are used to develop algorithmic descriptors of the system in the form of regression like expressions. For Phase II these algorithmic expressions are incorporated into the CoHOST discrete event computer simulation model of the C2S.
The programming environment is the commercial programming language Micro Saint running on a PC microcomputer. An interrogation approach was developed to query these algorithms within the computer simulation to determine if they allow the simulation to reflect the activities observed in the real TOC to within an acceptable degree of accuracy. The purpose of this dissertation is to introduce the COMPASS concept that is a paradigm for developing techniques and procedures to translate as much of the performance of the entire TOC system as possible to an existing computer simulation that would be suitable for analyses of future system configurations. The approach consists of the following steps:
· Naturalistic observation of the real system using ethnographic techniques.
· Data analysis using datamining techniques such as neural networks.
· Development of mathematical models of TOC performance activities.
· Integration of the mathematical models into the CoHOST computer simulation.
· Interrogation of the computer simulation.
· Assessment of the level of accuracy of the computer simulation.
· Validation of the process as a viable system simulation approach.
Ph. D.
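As a hedged illustration of the input-pruning step described above (a generic sketch, not the dissertation's COMPASS/CoHOST code; the data is synthetic and scikit-learn is assumed to be available), one can train a small network and drop input nodes whose outgoing weights are negligible:

import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in for TOC observations: 6 IVs, one DV that actually
# depends on only two of them (the rest are noise, candidates for pruning).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.1 * rng.normal(size=200)

net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0).fit(X, y)

# "Prune" inputs whose connections to the hidden layer carry little weight.
influence = np.abs(net.coefs_[0]).mean(axis=1)
kept = [i for i, w in enumerate(influence) if w > 0.5 * influence.mean()]
print("influence per IV:", np.round(influence, 2))
print("IVs kept after pruning:", kept)   # expected to include 0 and 3

The surviving inputs are the ones a regression-like descriptor of the system would then be built from.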
APA, Harvard, Vancouver, ISO, and other styles
39

Hlosta, Martin. "Modul pro shlukovou analýzu systému pro dolování z dat." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2010. http://www.nusl.cz/ntk/nusl-237158.

Full text
Abstract:
This thesis deals with the design and implementation of a cluster analysis module for DataMiner, a datamining system currently being developed at FIT BUT. So far, the system has lacked a cluster analysis module, so the main objective of the thesis was to extend the system with such a module. Pavel Riedl worked on the module together with me: we created a common part for all the algorithms so that the system can easily be extended with other clustering algorithms. In the second part, I extended the clustering module with three density-based clustering algorithms - DBSCAN, OPTICS, and DENCLUE. The algorithms were implemented, and appropriate sample data was chosen to verify their functionality.
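A quick hedged illustration of one of the named algorithms, DBSCAN, using scikit-learn on synthetic data rather than the FIT-Miner module itself:

import numpy as np
from sklearn.cluster import DBSCAN

# Two dense blobs plus one outlier; DBSCAN labels noise points as -1.
rng = np.random.default_rng(1)
pts = np.vstack([rng.normal(0, 0.2, (20, 2)),
                 rng.normal(5, 0.2, (20, 2)),
                 [[2.5, 2.5]]])

labels = DBSCAN(eps=0.8, min_samples=4).fit_predict(pts)
print(sorted(set(labels.tolist())))   # e.g. [-1, 0, 1]: two clusters and noise

Density-based methods like this need no preset cluster count, which is exactly why DBSCAN, OPTICS, and DENCLUE form a natural family for such a module.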
APA, Harvard, Vancouver, ISO, and other styles
40

Asp, Sandin Agnes. "Individuell marknadsföring : En kvalitativ studie om konsumenters uppfattning kring individanpassad marknadsföring gällande personlig integritet." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-18770.

Full text
Abstract:
The digitalization of society has increased sharply over the past decade, and as much as 95% of Sweden's population report using the internet daily. This has also increased the amount of data people leave behind online. This growth, combined with smarter analysis tools for extracting information from these data sets, has enabled a new type of marketing: personalized marketing. This means that companies target their offers and advertisements to the specific consumer using that consumer's own data, also known as digital footprints. The digital footprints companies use can include purchasing behavior, GPS coordinates, and previous search history on, for example, Google. However, there is a risk that companies violate personal privacy when personalized advertising becomes too specific or is based on sensitive information about the consumer. This leads to the study's research question: "What is the consumer's perception of personalized marketing with regard to personal privacy?" This research question has two sub-questions: "Which factors make consumers less likely to perceive personalized advertising as privacy-infringing?" and "What do consumers perceive as privacy-infringing?" Together, these sub-questions help answer the main question. In this study, the word consumer refers to those who shop digitally and use various digital services where they are exposed to personalized advertising. The research question is answered using a qualitative method and interviews, where an inductive thematic analysis with deductive elements is used to analyze the material. The study focuses on examining personalized marketing and privacy from a consumer perspective and therefore does not address how companies reason about this issue. The purpose of the study is to give companies insight into what information consumers find acceptable for personalizing advertising, and when they consider it to become privacy-infringing. The study found that company measures such as giving consumers greater control over their personal information, and greater transparency about what information is collected and used, lead to increased trust in the company, which in turn contributes to a more positive reception of the advertising. Furthermore, advertising based on the consumer's purchasing behavior and previous searches is perceived as least privacy-infringing, while advertising based on audio recordings, GPS coordinates, and whom the consumer interacts with is perceived as most privacy-infringing. A company behind advertising perceived as privacy-infringing can suffer a damaged image, which may lead consumers to choose another company in the future.
APA, Harvard, Vancouver, ISO, and other styles
41

Felício, Crícia Zilda. "VISTREE: uma linguagem visual para análise de padrões arborescentes e para especificação de restrições em um ambiente de mineração de árvores." Universidade Federal de Uberlândia, 2008. https://repositorio.ufu.br/handle/123456789/12459.

Full text
Abstract:
Mining frequent patterns in data represented by more complex structures such as trees and graphs has grown considerably in recent years. Among the reasons for this growth are the facts that tree and graph patterns carry more information than sequential patterns and that this type of mining can be applied in several areas such as XML Mining, Web Mining, and Bioinformatics. A problem that occurs in pattern mining in general is the great number of patterns generated, some of which are not interesting to users. The number of generated patterns can be decreased by restricting the types of patterns produced through user constraints. Even when constraints are incorporated into the mining process, the quantity of mined tree patterns is still large, which makes a pattern analysis tool necessary, enabling the user to specify queries that extract, from the mass of mined patterns, those satisfying the selection criteria of the query. Constraint-based pattern mining aims to obtain as the result of the mining process only the patterns of real interest to the user. A constraint on patterns is represented according to their structure: for sequential pattern mining, constraints can be represented by regular expressions; for tree pattern mining, by tree automata. The use of constraints solves the problem of generating a large number of patterns, but the mechanism used to represent constraints still constitutes another problem, namely the difficulty a user faces in entering constraints through such a mechanism. Queries over frequent patterns are made according to the characteristics of the data. One way to extract specific patterns from data structured as trees is to store the frequent patterns in an XML document and query it with one of the query languages for XML documents. Among XML query languages, XQuery is widely used, mainly because it is semantically similar to SQL, the query language for databases. Queries over frequent patterns could be made using this language, but the user would have to know it and be able to express queries through it. This research presents the visual language VisTree, a visual tool to be used both in a preprocessing phase, for specifying the user's preferences concerning the format of the tree patterns of interest, and in a postprocessing phase, for analyzing the mined patterns. The VisTree syntax is based on a fragment of the Tree Pattern language [Chen et al. 2003, Che and Liu 2005], the core of XPath 1.0 [Clark and Derose 1999, Olteanu et al. 2002]. However, the semantics of VisTree differs from the semantics of these languages in that VisTree queries return sets of tree patterns. VisTree uses the XQuery language [Chamberlin 2003, Katz et al. 2003] as its query processing mechanism: visual queries specified in VisTree are mapped onto XQuery queries, and their answers are adapted to fit the format returned by VisTree; VisTree thus works as an XQuery front-end. A complete tree pattern mining system was developed to test and validate the use of the VisTree language in specific application contexts. The system was built in a modular form so that new applications can be incorporated in a simple way. This research shows the application of constraint-based tree mining in the areas of XML Mining and Web Mining through a case study. In both applications, the system uses the VisTree language in the modules for preprocessing (constraint input) and pattern analysis (query input). Master's degree in Computer Science.
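As a purely illustrative sketch of the mapping idea (the mini pattern syntax, function name, and file name are invented; this is not VisTree's real grammar), a linear tree pattern can be turned into an XQuery string:

# Map a toy tree pattern (root/child/.../leaf) onto an XQuery query string.
# The pattern representation and source file are invented for illustration.
def pattern_to_xquery(pattern, source="patterns.xml"):
    steps = "/".join(pattern)
    return f'for $p in doc("{source}")//{steps} return $p'

print(pattern_to_xquery(["order", "item", "price"]))
# for $p in doc("patterns.xml")//order/item/price return $p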
APA, Harvard, Vancouver, ISO, and other styles
42

Toussaint, Ben-Manson. "Apprentissage automatique à partir de traces multi-sources hétérogènes pour la modélisation de connaissances perceptivo-gestuelles." Thesis, Université Grenoble Alpes (ComUE), 2015. http://www.theses.fr/2015GREAM063/document.

Full text
Abstract:
Perceptual-gestural knowledge is multimodal: it combines theoretical, perceptual, and gestural knowledge. It is difficult to capture in Intelligent Tutoring Systems. In fact, capturing it in such systems involves the use of multiple devices or sensors covering all the modalities of the underlying interactions. The "traces" of these interactions (also referred to as "activity traces") are the raw material for the production of key tutoring services that consider their multimodal nature. Methods for learning analytics and the production of tutoring services that favor one facet over the others are incomplete. However, the use of diverse devices generates heterogeneous activity traces, which are hard to model and treat. My doctoral project addresses the challenge of producing tutoring services that are congruent with this type of knowledge. I am specifically interested in this type of knowledge in the context of "ill-defined domains". My research case study is the Intelligent Tutoring System TELEOS, a simulation platform dedicated to percutaneous orthopedic surgery. The contributions of this thesis are threefold: (1) the formalization of perceptual-gestural interaction sequences; (2) the implementation of tools capable of reifying the proposed conceptual model; (3) the conception and implementation of algorithmic tools fostering the analysis of these sequences from a didactic point of view.
APA, Harvard, Vancouver, ISO, and other styles
43

Krásný, Michal. "Systém pro dolování z dat v prostředí Oracle." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2008. http://www.nusl.cz/ntk/nusl-236015.

Full text
Abstract:
This MSc project deals with a system for Knowledge Discovery in Databases. It is a client application that uses the services of Oracle Data Mining Server 10g Release 2 (10.2). The application is implemented in Java, with a graphical user interface built on the NetBeans Rich Client Platform. The theoretical part introduces Knowledge Discovery in Databases, while the practical part describes the functionality of the original system and its deficiencies, documents the solutions to these deficiencies, and proposes improvements for further development. The goal of this project is to modify the system to increase the application's usability.
APA, Harvard, Vancouver, ISO, and other styles
44

Šebek, Michal. "Rozšíření funkcionality systému pro dolování z dat na platformě NetBeans." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2009. http://www.nusl.cz/ntk/nusl-236729.

Full text
Abstract:
Databases continually grow with new data. A process called Knowledge Discovery in Databases has been defined for analyzing these data, and new complex systems have been developed to support it. This thesis describes the development of one such system. The main goal is to analyze the current implementation of the system, which is based on the Java NetBeans Platform and the Oracle database system, and to extend it with data preprocessing algorithms and source data analysis. The implementation of the data preprocessing components and the changes to the system's kernel are described in detail.
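As a generic sketch of the kind of preprocessing step such a module provides (min-max normalization; not the system's actual code):

# Min-max normalization of one numeric column, a typical preprocessing step.
def min_max(values, new_min=0.0, new_max=1.0):
    lo, hi = min(values), max(values)
    if lo == hi:                      # constant column: map everything to new_min
        return [new_min] * len(values)
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]

print(min_max([10, 20, 15, 40]))      # [0.0, 0.333..., 0.166..., 1.0]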
APA, Harvard, Vancouver, ISO, and other styles
45

Plantié, Michel. "Extraction automatique de connaissances pour la décision multicritère." Phd thesis, Ecole Nationale Supérieure des Mines de Saint-Etienne, 2006. http://tel.archives-ouvertes.fr/tel-00353770.

Full text
Abstract:
This thesis addresses, without taking sides, the delicate subject of cognitive automation. It proposes a complete computing chain to support each stage of the decision process. In particular, it deals with automating the learning phase by turning actionable knowledge (knowledge useful for action) into a computational entity that algorithms can manipulate. The model underlying our interactive group decision support system (SIADG) relies heavily on automatic knowledge processing. Datamining, multicriteria analysis, and optimization are complementary techniques used to build a decision artifact that amounts to a cybernetic interpretation of the decision model of the economist Simon. The epistemic uncertainty inherent in a decision is measured by the decisional risk, which analyzes the factors that discriminate between alternatives. Several attitudes toward controlling decisional risk can be considered: the SIADG can be used to validate, verify, or refute a point of view. In all cases, the control exercised over epistemic uncertainty is not neutral with respect to the dynamics of the decision process. Instrumenting the learning phase of the decision process thus leads to building the actuator of a feedback loop intended to regulate the decision dynamics. Our model sheds formal light on the links between epistemic uncertainty, decisional risk, and decision stability. The fundamental concepts of actionable knowledge (CA) and automatic indexing on which our NLP models and tools rest are analyzed. In this cybernetic view of decision-making, the notion of actionable knowledge finds a new interpretation: it is the knowledge manipulated by the SIADG's actuator to control the decision dynamics. A brief survey of the most proven learning techniques for automatic knowledge extraction in NLP is given. All these notions and techniques are applied to the specific problem of automatically extracting CAs in a multicriteria evaluation process. Finally, the application example of a video store manager seeking to optimize his investments according to his customers' preferences illustrates the computerized process as a whole.
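To ground the multicriteria part in a concrete example, a deliberately simplified weighted-sum ranking sketch (the alternatives, criteria, and weights are invented; real SIADG aggregation is richer than this):

# Weighted-sum scoring, the simplest multicriteria aggregation.
weights = {"quality": 0.5, "cost": 0.3, "delay": 0.2}
alternatives = {
    "A": {"quality": 0.9, "cost": 0.4, "delay": 0.7},
    "B": {"quality": 0.6, "cost": 0.9, "delay": 0.8},
}

def score(perf):
    return sum(weights[c] * perf[c] for c in weights)

ranking = sorted(alternatives, key=lambda a: score(alternatives[a]), reverse=True)
print([(a, round(score(alternatives[a]), 2)) for a in ranking])
# [('B', 0.73), ('A', 0.71)]: small weight changes can flip the ranking,
# which is the intuition behind measuring decisional risk.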
APA, Harvard, Vancouver, ISO, and other styles
46

Městka, Milan. "Tvorba databázové aplikace a řešení pro Business Intelligence." Master's thesis, Vysoké učení technické v Brně. Fakulta podnikatelská, 2012. http://www.nusl.cz/ntk/nusl-223400.

Full text
Abstract:
The theme of this master's thesis is the design of software support for business intelligence, realized in cooperation with the corporation ZZN Pelhřimov a.s. The introduction focuses on a theoretical description of business intelligence and datamining and on the development environment in which the project is designed; the corporation is also characterized there. The main part covers data collection and the definition of the individual modules. The conclusion of the thesis presents several types of analysis of the collected data; according to these analyses, measures can be drawn to improve the current state of the corporation.
APA, Harvard, Vancouver, ISO, and other styles
47

Naňo, Andrej. "Automatické generování testovacích dat informačních systémů." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2021. http://www.nusl.cz/ntk/nusl-445520.

Full text
Abstract:
ISAGEN is a tool for the automatic generation of structurally complex test inputs that imitate real communication in the context of modern information systems. Complex, typically tree-structured data currently represents the standard means of transmitting information between nodes in distributed information systems. The automatic generator ISAGEN is founded on the methodology of data-driven testing and uses concrete data from the production environment as the primary characteristic and specification guiding the generation of new, similar data for test cases that satisfy given combinatorial adequacy criteria. The main contribution of this thesis is a comprehensive proposal of automated data generation techniques, together with an implementation that demonstrates their usage. The created solution enables testers to create more relevant testing data representing production-like communication in information systems.
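As a hedged sketch of the data-driven idea, deriving new tree-structured test inputs from a production-like example by keeping its structure and mutating leaf values (the generator logic and sample message are invented, not ISAGEN's code):

import random

# Derive new test messages from a production-like example by reusing its
# tree structure and mutating only the leaf values. Purely illustrative.
example = {"order": {"id": 42, "items": [{"sku": "A1", "qty": 2}]}}

def mutate(node, rng):
    if isinstance(node, dict):
        return {k: mutate(v, rng) for k, v in node.items()}
    if isinstance(node, list):
        return [mutate(v, rng) for v in node]
    if isinstance(node, int):
        return rng.randint(1, 100)           # new value, same type
    return node + rng.choice("XYZ")          # perturb string leaves

rng = random.Random(7)
print(mutate(example, rng))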
APA, Harvard, Vancouver, ISO, and other styles
48

Kubelka, Martin. "Datamining sociálních sítí." Master's thesis, 2012. http://www.nusl.cz/ntk/nusl-311122.

Full text
Abstract:
This paper is about the application of various data mining methods in the area of social networks and social media. It explains the basic principles of social media, with the aim of highlighting the high information potential of data from social networks. This is demonstrated on selected data mining methods, especially Social Network Analysis and Sentiment Analysis. Other opportunities for using social media data are shown in the chapter on Social Media Monitoring tools. All these chapters are supplemented by practical examples and particular research. The last chapter discusses the visions and threats that data mining may bring in the future. Keywords: data mining, social networks, social media, social network analysis, sentiment analysis, social media monitoring
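A minimal illustration of one of the mentioned methods, Social Network Analysis, computing degree centrality with networkx (a toy graph; assumes networkx is installed):

import networkx as nx

# Toy friendship graph; degree centrality is one basic SNA measure.
g = nx.Graph([("ana", "bob"), ("ana", "eva"), ("bob", "eva"), ("eva", "dan")])
centrality = nx.degree_centrality(g)
print(max(centrality, key=centrality.get))   # 'eva' is the most connected node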
APA, Harvard, Vancouver, ISO, and other styles
49

Lin, Chia-lin, and 林佳霖. "Apply Datamining to Union Defense System." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/05776613386519184101.

Full text
Abstract:
碩士<br>國立清華大學<br>資訊工程學系<br>91<br>This thesis discusses a modified Union Defense System. The main purpose of the Union Defense System is to unite the known IDSs on the network. A designated Global Policy Server is used to collect and analyze the events from all IDSs, once a DDoS attack is detected, the Global Policy Server will inform the IDSs and then block the attack. In the original design, the signature used by Global Policy Server to detect is fixed. Although the administrator of the system can adjust the parameters of the signature to fit the status of the network, but it still has drawbacks to confront the highly changeable network. If an aggressor happens to know the signature and the parameters, he will be able to send out a specially designed attack to avoid the detection of the Global Policy Server. We purpose a modification to overcome the leak point. We use a well-developed technology, Datamining, to produce the signature for the use of the Union Defense System. First, some labeling and standardization processes are applied to the event logs from IDSs, and the preprocessed data will be feed into some Datamining tools. Thus the knowledge mined by the tools, will be used by Global Policy Server to detect DDoS attack. Since the information is mined from the same sort of data, the detection rate will be better than the original one. We also develop a prototype of the system. And three Datamining tools are introduced to have some comparison and to exam the correctness of the framework.
APA, Harvard, Vancouver, ISO, and other styles
50

Silva, Daniel Fernando Alves da. "Geração Sintética de Microdados utilizando algorítmos de datamining." Master's thesis, 2015. https://repositorio-aberto.up.pt/handle/10216/80015.

Full text
APA, Harvard, Vancouver, ISO, and other styles
We offer discounts on all premium plans for authors whose works are included in thematic literature selections. Contact us to get a unique promo code!