Dissertations / Theses on the topic 'Prédiction séquentielle'
Consult the top 26 dissertations / theses for your research on the topic 'Prédiction séquentielle.'
Stoltz, Gilles. "Information incomplète et regret interne en prédiction de suites individuelles." Paris 11, 2005. https://tel.archives-ouvertes.fr/tel-00009759.
This thesis falls within the theory of prediction of individual sequences. This theory avoids any modelling of the data and aims to provide robust prediction techniques and to discuss their possibilities, limitations, and difficulties. It considers issues arising from the machine learning and game-theory communities, which are addressed with statistical techniques, including martingale concentration inequalities and minimax lower bound techniques. The results obtained include, among others, external and internal regret minimizing strategies for label-efficient prediction and for games with partial monitoring. Such strategies are valuable for the on-line pricing problem or for on-line bandwidth allocation. We then focus on internal regret for general convex losses. We first consider the case of on-line portfolio selection, for which simulations on real data are provided, and later generalize the results to show how players can learn correlated equilibria in games with compact sets of strategies.
Stoltz, Gilles. "Information incomplète et regret interne en prédiction de suites individuelles." Phd thesis, Université Paris Sud - Paris XI, 2005. http://tel.archives-ouvertes.fr/tel-00009759.
Prémillieu, Nathanaël. "Améliorer la performance séquentielle à l'ère des processeurs massivement multicœurs." Phd thesis, Université Rennes 1, 2013. http://tel.archives-ouvertes.fr/tel-00914898.
Prémillieu, Nathanaël. "Améliorer la performance séquentielle à l’ère des processeurs massivement multicœurs." Thesis, Rennes 1, 2013. http://www.theses.fr/2013REN1S071/document.
Computers are everywhere, and the need for ever more computing power has pushed processor architects to find new ways to increase performance. Today's trend is to replicate execution cores on the same die to parallelize execution. If this continues, processors will become manycores featuring hundreds to a thousand cores. However, Amdahl's law reminds us that increasing sequential performance will always be vital to increasing global performance. An effective way to increase sequential performance is to improve how branches are executed, because they limit instruction-level parallelism. Branch prediction is the most studied solution, and its benefit depends greatly on its accuracy. In recent years, this accuracy has been continuously improved, up to a limit that is hard to exceed. Another solution is to eliminate branches by replacing them with a construct based on predicated instructions. However, executing predicated instructions on out-of-order processors raises several problems, such as the multiple definition problem. This study investigates these two aspects of branch handling. The first part is about branch prediction. One way to improve it without increasing accuracy is to reduce the cost of a branch misprediction. This is possible by exploiting control flow reconvergence and control independence: the work done on the wrong path on instructions common to the two paths is saved to be reused on the correct path. The second part is about predicated instructions. We propose a solution to the multiple definition problem by selectively predicting the predicate values. A selective replay mechanism is used to reduce the cost of a predicate misprediction.
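The reference to Amdahl's law in the abstract above can be made concrete with a short calculation. The following is a minimal sketch and is not taken from the thesis: the function name `amdahl_speedup` and the 95% parallel fraction are illustrative assumptions.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Overall speedup predicted by Amdahl's law.

    parallel_fraction: share of the run time that can be parallelized (0..1).
    cores: number of cores the parallel share is spread over.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial share is unaffected by extra cores; only the parallel share shrinks.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workload: 95% of the run time is parallelizable.
    for n in (1, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.95, n), 2))
```

With a 5% serial share the speedup can never reach 20, however many cores are added, which is why the abstract insists on sequential performance.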
Kalaitzidis, Kleovoulos. "Advanced speculation to increase the performance of superscalar processors." Thesis, Rennes 1, 2020. http://www.theses.fr/2020REN1S007.
Even in the multicore era, making single cores faster is paramount to achieving high-performance computing, given the existence of programs that are either inherently sequential or expose non-negligible sequential parts. Sequential performance has essentially improved with the scaling of the processor structures that enable instruction-level parallelism (ILP). However, as modern microarchitectures continue to extract more ILP by employing larger instruction windows, true data dependencies remain a major performance bottleneck. Value Prediction (VP) and Load-Address Prediction (LAP) are two developing techniques that make it possible to overcome this obstacle and harvest more ILP by enabling instructions to execute in a data-wise speculative manner. This thesis proposes mechanisms related to VP and LAP that lead to effectively higher performance improvements. First, VP is examined in an ISA-aware manner, which discloses the impact of certain ISA particularities on the anticipated speedup. Second, a novel binary-based VP model, VSEP, is introduced; it exploits certain value patterns that, although frequently encountered, cannot be captured by previous works. VSEP improves the obtained speedup by 19% and, by virtue of its structure, mitigates the cost of predicting values wider than 64 bits. Adapting this approach to perform LAP makes it possible to predict the memory addresses of 48% of the committed loads. Ultimately, a microarchitecture that carefully leverages this LAP mechanism can execute 32% of the committed loads early.
Ziat, Ali Yazid. "Apprentissage de représentation pour la prédiction et la classification de séries temporelles." Electronic Thesis or Diss., Paris 6, 2017. http://www.theses.fr/2017PA066324.
This thesis deals with the development of time series analysis methods. Our contributions focus on two tasks: time series forecasting and classification. Our first contribution presents a method of prediction and completion of multivariate and relational time series. The aim is to be able to simultaneously predict the evolution of a group of time series connected to each other according to a graph, as well as to complete the missing values in these series (which may correspond for example to a failure of a sensor during a given time interval). We propose to use representation learning techniques to forecast the evolution of the series while completing the missing values and taking into account the relationships that may exist between them. Extensions of this model are proposed and described: first in the context of the prediction of heterogeneous time series and then in the case of the prediction of time series with an expressed uncertainty. A prediction model of spatio-temporal series is then proposed, in which the relations between the different series can be expressed more generally, and where these can be learned. Finally, we are interested in the classification of time series. A joint model of metric learning and time-series classification is proposed and an experimental comparison is conducted.
Heinrich, Franz. "Modélisation, prédiction et optimisation de la consommation énergétique d'applications MPI à l'aide de SimGrid." Thesis, Université Grenoble Alpes (ComUE), 2019. http://www.theses.fr/2019GREAM018/document.
The High-Performance Computing (HPC) community is currently undergoing disruptive technology changes in almost all fields, including a switch towards massive parallelism with several thousand compute cores on a single GPU or accelerator and new, complex networks. Powering a massively parallel machine becomes [...] The energy consumption of these machines will continue to grow in the future, making energy one of the principal cost factors of machine ownership. This explains why even the classic metric "flop/s", generally used to evaluate HPC applications and machines, is widely expected to be replaced by an energy-centric metric, "flop/watt". One approach to predicting energy consumption is through simulation; however, a precise performance prediction is crucial to estimate the energy faithfully. In this thesis, we contribute to the performance and energy prediction of HPC architectures. We propose an energy model which we have implemented in the open source SimGrid simulator. We validate this model by carefully and systematically comparing it with real experiments. We leverage this contribution to both evaluate existing and propose new DVFS governors that are particularly designed to suit the HPC context.
Bou, Rjeily Carine. "Data mining and learning for markers extraction to improve the medical monitoring platforms." Thesis, Bourgogne Franche-Comté, 2019. http://www.theses.fr/2019UBFCA012.
According to the World Health Organization, about 31% of deaths worldwide are caused by heart diseases every year. Data mining is the process of extracting interesting, non-trivial, previously unknown and potentially useful information from huge amounts of data. Medical data mining is the science of investigating medical data (i.e. vital signs) to uncover significant information. Analyzing and interpreting a huge amount of complicated data to reach an appropriate therapeutic diagnosis with the right results is quite a challenging task. Still, the fact that it is possible to combine these factors up to a certain point and extract a usually successful treatment, prevention and recovery plan is a sign of the good things to come. Thanks to that, it is now possible to improve patients' quality of life and prevent their condition from worsening while keeping medical costs down. This explains the increasing popularity of machine learning techniques for analyzing, predicting and classifying medical data. As a first contribution, we studied many sequential pattern algorithms, which are promising techniques for exploring data, and classified them in order to choose an appropriate one for predicting the presence and classes of heart failure. After comparing all the algorithms and implementing them on the same medical dataset, CPT+, a sequence prediction algorithm, was chosen because it gave the most accurate results, reaching an accuracy of 90.5% in predicting heart failure and its classes. Using the CPT+ algorithm with a real patient dataset, we predicted heart failure 10 to 12 days in advance. Thereafter, we switched our studies to a time series strategy and worked on real data extracted from real patients. Five parameters were extracted from three patients over the course of a few years. The Random Tree algorithm yielded more than 85% correct predictions of heart failure 7 days in advance.
Zuo, Jingwei. "Apprentissage de représentations et prédiction pour des séries-temporelles inter-dépendantes." Electronic Thesis or Diss., université Paris-Saclay, 2022. http://www.theses.fr/2022UPASG038.
Time series are a common data type used in numerous real-life applications, such as financial analysis, medical diagnosis, environmental monitoring, astronomical discovery, etc. Due to their complex structure, time series raise several challenges for data processing and mining. The representation of time series plays a key role in data mining tasks and machine learning algorithms for time series. Yet few methods consider the interrelation that may exist between different time series when building the representation. Moreover, time series mining requires considering not only the characteristics of the time series in terms of data complexity but also the concrete application scenarios where the data mining task is performed, in order to build task-specific representations. In this thesis, we study different time series representation approaches that can be used in various time series mining tasks, while capturing the relationships among them. We focus specifically on modeling the interrelations between different time series when building the representations, which can be the temporal relationship within each data source or the inter-variable relationship between various data sources. Accordingly, we study the time series collected from various application contexts under different forms. First, considering the temporal relationship between the observations, we learn the time series in a dynamic streaming context, i.e., a time series stream, for which the time series data is continuously generated from the data source. Second, for the inter-variable relationship, we study the multivariate time series (MTS) with data collected from multiple data sources. Finally, we study the MTS in the Smart City context, where each data source is given a spatial position. The MTS then becomes a geo-located time series (GTS), for which the inter-variable relationship requires more modeling effort because of the external spatial information.
Therefore, for each type of time series data collected from distinct contexts, the interrelations between the time series observations are emphasized differently, on the temporal axis, the variable axis, or both. Apart from the data complexity arising from the interrelations, we study various machine learning tasks on time series in order to validate the learned representations. The high-level learning tasks studied in this thesis consist of time series classification, semi-supervised time series learning, and time series forecasting. We show how the learned representations connect with different time series learning tasks under distinct application contexts. More importantly, we conduct an interdisciplinary study of time series by leveraging real-life challenges in machine learning tasks, which makes it possible to improve the learning model's performance and to handle more complex time series scenarios. Concretely, for these time series learning tasks, our main research contributions are the following: (i) we propose a dynamic time series representation learning model in the streaming context, which considers both the characteristics of time series and the challenges in data streams. We claim and demonstrate that the Shapelet, a shape-based time series feature, is the best representation in such a dynamic context; (ii) we propose a semi-supervised model for representation learning in multivariate time series (MTS). The inter-variable relationship over multiple data sources is modeled in a real-life context, where the data annotations are limited; (iii) we design a geo-located time series (GTS) representation learning model for Smart City applications. We study specifically the traffic forecasting task, with a focus on the missing-value treatment within the forecasting algorithm.
Ziat, Ali Yazid. "Apprentissage de représentation pour la prédiction et la classification de séries temporelles." Thesis, Paris 6, 2017. http://www.theses.fr/2017PA066324/document.
Issartel, Yann. "Inférence sur des graphes aléatoires." Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASM019.
This thesis lies at the intersection of the theories of non-parametric statistics and statistical learning. Its goal is to provide an understanding of statistical problems in latent space random graphs. Latent space models have emerged as useful probabilistic tools for modeling large networks in various fields such as biology, marketing or social sciences. We first define an identifiable index of the dimension of the latent space and then a consistent estimator of this index. More generally, such identifiable and interpretable quantities alleviate the absence of identifiability of the latent space itself. We then introduce the pair-matching problem. From a non-observed graph, a strategy sequentially queries pairs of nodes and observes the presence/absence of edges. Its goal is to discover as many edges as possible with a fixed budget of queries. For this bandit type problem, we study optimal regrets in stochastic block models and random geometric graphs. Finally, we are interested in estimating the positions of the nodes in the latent space, in the particular situation where the space is a circle in the Euclidean plane. For each of the three problems, we obtain procedures that achieve the statistical optimal performance, as well as efficient procedures with theoretical guarantees. These algorithms are analysed from a non-asymptotic viewpoint, relying in particular on concentration inequalities.
Li, Yang. "Patient-specific gating scheme for thoracoabdominal tumor radiotherapy guided by magnetic resonance imaging." Electronic Thesis or Diss., Université de Rennes (2023-....), 2024. http://www.theses.fr/2024URENS015.
The ultimate aim of this work is to develop an end-to-end gating system for real-time motion compensation during lung cancer and liver cancer treatment on the Elekta Unity. This system will monitor and automatically locate the three-dimensional spatial position of the tumor in real time, and predict the tumor's motion trajectory in the Superior-Inferior (SI), Left-Right (LR), and Anterior-Posterior (AP) directions in advance. Based on the set gating rules, a unique gating signal will be generated to switch the beam on and off during radiotherapy, thereby compensating for the inaccuracy of dose delivery due to respiratory motion. To achieve this goal, the following steps have been carried out:
1. We proposed a tumor tracking workflow based on KCF, addressing the issues of time consumption and accuracy in tumor tracking using 2D Cine-MRI. Firstly, we verified the efficiency and accuracy of KCF in 2D Cine-MRI tumor tracking. By calculating the centroid, we mitigated the errors that a fixed-size template generates when the tumor shape changes, thus enhancing the tracking accuracy. In particular, we focused on tracking in the SI direction by optimizing the selection of coronal or sagittal slices to determine the optimal position of the tumor in the SI direction.
2. We proposed a patient-specific transfer C-NLSTM model for real-time prediction of tumor motion, addressing the issue of insufficient training data. We constructed a C-NLSTM model and introduced transfer learning to fully leverage the rich knowledge and feature representation capabilities embedded in the pretrained model, while fine-tuning on specific patient data to achieve high-precision prediction of tumor motion. Through this approach, the model can be trained with only two minutes of patient-specific data, effectively overcoming the challenge of data acquisition.
3. We proposed an efficient gating signal prediction method, overcoming the challenge of precise predictions in 2D Cine-MRI with limited sampling frequencies. We validated the effectiveness of linear regression for predicting internal organ or tumor motion in 2D MR cine, and we proposed an online gating signal prediction scheme based on ALR to enhance the accuracy of gated radiotherapy for liver and lung cancers.
4. We proposed an end-to-end gating system based on 2D Cine-MRI for the Elekta Unity MRgRT. It enables real-time monitoring and automatic localization of the tumor's 3D spatial position, prediction of tumor motion in three directions, and fitting of an optimal cuboid (gating threshold) for each patient based on the tumor's motion range. Additionally, we explored various approaches to derive 3D gating signals based on tumor motion in one, two, or three directions, aiming to cater to different patient treatment needs. Finally, the results of dosimetric validation demonstrate that the proposed system can effectively enhance the protection of OAR.
Lajugie, Rémi. "Prédiction structurée pour l’analyse de données séquentielles." Thesis, Paris, Ecole normale supérieure, 2015. http://www.theses.fr/2015ENSU0024/document.
In this manuscript, we consider structured machine learning problems, and more precisely those involving sequential structure. In the first part, we consider the problem of similarity measure learning for two tasks where sequential structure is at stake: (i) multivariate change-point detection and (ii) the time warping of pairs of time series. The methods generally used to solve these tasks rely on a similarity measure to compare timestamps. We propose to learn a similarity measure from fully labelled data, i.e., signals already segmented or pairs of signals for which the optimal time warping is known. Using standard structured prediction methods, we present algorithmically efficient ways of learning. We propose to use loss functions specifically designed for the tasks. We validate our approach on real-world data. In the second part, we focus on the problem of weak supervision, in which sequential data are not totally labeled. We focus on the problem of aligning an audio recording with its score. We consider the score as a symbolic representation giving: (i) complete information about the order of events or notes played and (ii) an approximate idea of the expected shape of the alignment. We propose to learn a classifier for each note using this information. Our learning problem is based on the optimization of a convex function that takes advantage of the weak supervision and of the sequential structure of the data. Our approach is validated through experiments on the task of audio-to-score alignment on real musical data.
Çinar, Yagmur Gizem. "Prédiction de séquences basée sur des réseaux de neurones récurrents dans le contexte des séries temporelles et des sessions de recherche d'information." Thesis, Université Grenoble Alpes (ComUE), 2019. http://www.theses.fr/2019GREAM079.
This thesis investigates challenges of sequence prediction in different scenarios, such as sequence prediction using recurrent neural networks (RNNs) in the context of time series and information retrieval (IR) search sessions. Predicting the unknown values that follow some previously observed values is called sequence prediction. It is widely applicable to many domains where sequential behavior is observed in the data. In this study, we focus on two different types of sequence prediction tasks: time series forecasting and next query prediction in an information retrieval search session. Time series often display pseudo-periods, i.e. time intervals with strong correlation between values of the time series. Seasonal changes in weather time series or electricity usage at day and night time are some examples of pseudo-periods. In a forecasting scenario, pseudo-periods correspond to the difference between the positions of the output being predicted and specific inputs. In order to capture periods in RNNs, one needs a memory of the input sequence. Sequence-to-sequence RNNs (with attention mechanism) reuse specific (representations of) input values to predict output values. Sequence-to-sequence RNNs with an attention mechanism seem adequate for capturing periods. We therefore first explore the capability of an attention mechanism in that context. However, according to our initial analysis, a standard attention mechanism did not capture the periods well. Therefore, we propose a period-aware content-based attention RNN model.
This model is an extension of state-of-the-art sequence-to-sequence RNNs with attention mechanism, and it aims to capture the periods in time series with or without missing values. Our experimental results with period-aware content-based attention RNNs show significant improvement in univariate and multivariate time series forecasting performance on several publicly available data sets. Another challenge in sequence prediction is next query prediction. Next query prediction helps users to disambiguate their search query, to explore different aspects of the information they need, or to form a precise and succinct query that leads to higher retrieval performance. A search session is dynamic, and the information need of a user might change over a search session as a result of the search interactions. Furthermore, interactions of a user with a search engine influence the user's query reformulations. Considering this influence on the query formulations, we first analyze where the next query words come from. Using this analysis of the sources of query words, we propose two next query prediction approaches: a set view and a sequence view. The set view adapts a bag-of-words approach using a novel feature set defined from the analysis of the sources of next query words. Here, the next query is predicted using learning to rank. The sequence view extends a hierarchical RNN model by considering the sources of next query words in the prediction. The sources of next query words are incorporated by using an attention mechanism on the interaction words. We observed that using the sequence approach, a natural formulation of the problem, and exploiting all sources of evidence lead to better next query prediction.
Gerchinovitz, Sébastien. "Prédiction de suites individuelles et cadre statistique classique : étude de quelques liens autour de la régression parcimonieuse et des techniques d'agrégation." Phd thesis, Université Paris Sud - Paris XI, 2011. http://tel.archives-ouvertes.fr/tel-00653550.
Huard, Malo. "Apprentissage et prévision séquentiels : bornes uniformes pour le regret linéaire et séries temporelles hiérarchiques." Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASM009.
This work presents some theoretical and practical contributions to the prediction of arbitrary sequences. In this domain, forecasting takes place sequentially, at the same time as learning. At each step, the model is fitted on the past data in order to predict the next observation. The goal of this model is to make the best possible predictions, i.e. those that minimize their deviations from the observations, which are revealed a posteriori. Sequential learning methods are evaluated by their regret, which measures how close strategies are to the best possible one, known only after all the data is available. In this thesis, we extend the set of weight vectors a method is compared to when doing sequential linear regression. We adapted an existing algorithm and improved its theoretical guarantees, allowing it to be compared to any constant linear combination without restriction on the norm of its mixing weights. A second line of work consisted in extending sequential forecasting methods to the case where the forecast data is organized in a hierarchy. We tested these hierarchical methods on two practical applications, household power consumption prediction and demand forecasting in e-commerce.
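The notion of regret described in the abstract above can be illustrated with a small sketch. It does not reproduce the algorithm of the thesis (which concerns sequential linear regression); it swaps in a simpler, standard technique, the exponentially weighted average forecaster over a finite set of experts with squared loss. The function names and the learning rate `eta` are assumptions made for the illustration.

```python
import math


def ewa_forecast(expert_preds, outcomes, eta=0.5):
    """Run an exponentially weighted average forecaster with squared loss.

    expert_preds: one list of expert predictions per round.
    outcomes: the observation revealed after each round.
    Returns (cumulative loss of the forecaster, cumulative loss of each expert).
    """
    n_experts = len(expert_preds[0])
    expert_losses = [0.0] * n_experts
    forecaster_loss = 0.0
    for preds, y in zip(expert_preds, outcomes):
        # Weights decay exponentially with past cumulative loss
        # (the minimum is subtracted only for numerical stability).
        best_so_far = min(expert_losses)
        weights = [math.exp(-eta * (loss - best_so_far)) for loss in expert_losses]
        total = sum(weights)
        prediction = sum(w * p for w, p in zip(weights, preds)) / total
        # The outcome is revealed only after the prediction is made.
        forecaster_loss += (prediction - y) ** 2
        for i, p in enumerate(preds):
            expert_losses[i] += (p - y) ** 2
    return forecaster_loss, expert_losses


def regret(forecaster_loss, expert_losses):
    """Regret: forecaster loss minus the loss of the best expert in hindsight."""
    return forecaster_loss - min(expert_losses)


if __name__ == "__main__":
    ys = [0.0, 1.0] * 10                     # hypothetical observations
    experts = [[y, 0.5] for y in ys]         # a perfect expert and a constant one
    print(regret(*ewa_forecast(experts, ys)))
```

The best expert is only known once all observations are available, which is why regret is the natural yardstick here: it compares the sequential strategy with that hindsight benchmark.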
Papoutsis, Panayotis. "Potentiel et prévision des temps d'attente pour le covoiturage sur un territoire." Thesis, Ecole centrale de Nantes, 2021. http://www.theses.fr/2021ECDN0059.
This thesis focuses on the potential and prediction of carpooling waiting times in a territory using statistical learning methods. Five main themes are covered in this manuscript. The first presents quantile regression techniques to predict waiting times. The second details the construction of a workflow based on Geographic Information Systems (GIS) tools in order to fully leverage the carpooling data. In the third part, we develop a hierarchical Bayesian model in order to predict traffic flows and waiting times. In the fourth part, we propose a methodology for constructing an informative prior by Bayesian transfer to improve the prediction of waiting times when only a short dataset is available. Lastly, the final theme focuses on the production and industrial exploitation of the Bayesian hierarchical model.
Abtini, Mona. "Plans prédictifs à taille fixe et séquentiels pour le krigeage." Thesis, Lyon, 2018. http://www.theses.fr/2018LYSEC019/document.
In recent years, computer simulation models have been increasingly used to study complex phenomena. Such problems usually rely on very large, sophisticated simulation codes that are very expensive in computing time. Exploiting these codes becomes a problem, especially when the objective requires a significant number of evaluations of the code. In practice, the code is replaced by global approximation models, often called metamodels, most commonly a Gaussian Process (kriging) fitted to a design of experiments, i.e. to observations of the model output obtained from a small number of simulations. Space-filling designs, which have the design points evenly spread over the entire feasible input region, are the most widely used designs. This thesis consists of two parts. The main focus of both parts is on the construction of designs of experiments that are adapted to kriging, which is one of the most popular metamodels. Part I considers the construction of space-filling designs of fixed size which are adapted to kriging prediction. This part began with a study of the effect of the Latin Hypercube constraint (the design most used in practice with kriging) on maximin-optimal designs. This study shows that when the design has a small number of points, adding the Latin Hypercube constraint is useful because it mitigates the drawbacks of maximin-optimal configurations (the position of the majority of points at the boundary of the input space). Following this study, a uniformity criterion called Radial discrepancy has been proposed in order to measure the uniformity of the points of the design according to their distance to the boundary of the input space. Then we show that the minimax-optimal design is the closest design to the IMSE design (a design which is adapted to prediction by kriging) but is also very difficult to evaluate. We then introduce a proxy for the minimax-optimal design based on the maximin-optimal design.
Finally, we present an optimised implementation of the simulated annealing algorithm in order to find maximin-optimal designs. Our aim here is to minimize the probability of the simulated annealing falling into a local minimum configuration. The second part of the thesis concerns a slightly different problem. If X_N is a space-filling design of N points, there is no guarantee that any n points of X_N (1 ≤ n ≤ N) constitute a space-filling design. In practice, however, we may have to stop the simulations before the full realization of the design. The aim of this part is therefore to propose a new methodology to construct a sequence of nested space-filling designs of experiments X_n, for any n between 1 and N, that are all adapted to kriging prediction. We introduce a method to generate nested designs based on information criteria, particularly the Mutual Information (MI) criterion. This method ensures good quality for all the designs generated, 1 ≤ n ≤ N. A key difficulty of this method is that the time needed to generate an MI-sequential design in the high-dimensional case is very large. To address this issue, a particular implementation has been proposed, which calculates the determinant of a given matrix by partitioning it into blocks. This implementation significantly reduces the computational cost of MI-sequential designs.
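The maximin criterion and the Latin Hypercube constraint discussed in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: the thesis uses simulated annealing, whereas the sketch below swaps in plain random search over Latin Hypercube designs, and all function names are hypothetical.

```python
import itertools
import math
import random


def latin_hypercube(n, dim, rng):
    """Draw n points in [0, 1)^dim with exactly one point per bin on every axis."""
    columns = []
    for _ in range(dim):
        bins = list(range(n))
        rng.shuffle(bins)
        # Each coordinate falls in its own bin of width 1/n.
        columns.append([(b + rng.random()) / n for b in bins])
    return [tuple(col[i] for col in columns) for i in range(n)]


def min_pairwise_distance(design):
    """Maximin criterion: the smallest Euclidean distance between two design points."""
    return min(math.dist(a, b) for a, b in itertools.combinations(design, 2))


def best_of_random_lhs(n, dim, trials, seed=0):
    """Keep the Latin Hypercube design with the largest minimum pairwise distance."""
    rng = random.Random(seed)
    candidates = (latin_hypercube(n, dim, rng) for _ in range(trials))
    return max(candidates, key=min_pairwise_distance)


if __name__ == "__main__":
    design = best_of_random_lhs(n=8, dim=2, trials=200)
    print(round(min_pairwise_distance(design), 3))
```

Random search is much cruder than simulated annealing, but it shows the two ingredients separately: the Latin Hypercube constraint defines the candidate set, and the maximin criterion ranks the candidates.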
Benbouzid, Djalel. "Sequential prediction for budgeted learning : Application to trigger design." Phd thesis, Université Paris Sud - Paris XI, 2014. http://tel.archives-ouvertes.fr/tel-00990245.
Calandriello, Daniele. "Efficient sequential learning in structured and constrained environments." Thesis, Lille 1, 2017. http://www.theses.fr/2017LIL10216/document.
The main advantage of non-parametric models is that the accuracy of the model (degrees of freedom) adapts to the number of samples. The main drawback is the so-called "curse of kernelization": to learn the model we must first compute a similarity matrix among all samples, which requires quadratic space and time and is infeasible for large datasets. Nonetheless, the underlying effective dimension (effective d.o.f.) of the dataset is often much smaller than its size, and we can replace the dataset with a subset (dictionary) of highly informative samples. Unfortunately, fast data-oblivious selection methods (e.g., uniform sampling) almost always discard useful information, while data-adaptive methods that provably construct an accurate dictionary, such as ridge leverage score (RLS) sampling, have a quadratic time/space cost. In this thesis we introduce a new single-pass streaming RLS sampling approach that sequentially constructs the dictionary, where each step compares a new sample only with the current intermediate dictionary and not with all past samples. We prove that the size of all intermediate dictionaries scales only with the effective dimension of the dataset, and therefore guarantee a per-step time and space complexity independent of the number of samples. This reduces the overall time required to construct provably accurate dictionaries from quadratic to near-linear, or even logarithmic when parallelized. Finally, for many non-parametric learning problems (e.g., K-PCA, graph SSL, online kernel learning) we show that we can use the generated dictionaries to compute approximate solutions in near-linear time that are both provably accurate and empirically competitive.
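For readers unfamiliar with ridge leverage scores, the sketch below computes them exactly on a toy dataset, using the standard definition: the i-th diagonal entry of K(K + γI)^-1, whose sum is the effective dimension. This is the quadratic-cost batch computation that the abstract seeks to avoid, not the streaming algorithm of the thesis. The RBF kernel, the bandwidth, the value of γ, and the data points are all assumptions made for illustration.

```python
import math


def rbf_kernel(xs, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix for a list of scalar samples."""
    n = len(xs)
    return [
        [math.exp(-((xs[i] - xs[j]) ** 2) / (2.0 * bandwidth ** 2)) for j in range(n)]
        for i in range(n)
    ]


def solve(a, b):
    """Solve a @ x = b (b a matrix) by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            factor = aug[r][col]
            if r != col and factor != 0.0:
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ridge_leverage_scores(kernel, gamma):
    """tau_i = [(K + gamma I)^-1 K]_ii, which equals [K (K + gamma I)^-1]_ii."""
    n = len(kernel)
    regularized = [
        [kernel[i][j] + (gamma if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    x = solve(regularized, kernel)
    return [x[i][i] for i in range(n)]


# Four duplicated samples and one isolated sample (hypothetical data).
points = [0.0, 0.0, 0.0, 0.0, 5.0]
scores = ridge_leverage_scores(rbf_kernel(points), gamma=1.0)
effective_dimension = sum(scores)
```

In this toy case the four duplicates share their leverage (roughly 0.2 each) while the isolated sample scores roughly 0.5, so sampling in proportion to the scores favours the informative point, and the effective dimension (about 1.3) is well below the dataset size of 5.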
Lonjarret, Corentin. "Sequential recommendation and explanations." Thesis, Lyon, 2021. http://theses.insa-lyon.fr/publication/2021LYSEI003/these.pdf.
Recommender systems have received a lot of attention over the past decades, with the proposal of many models that take advantage of the most advanced techniques of Deep Learning and Machine Learning. With the automated collection of user actions such as purchasing items, watching movies, or clicking on hyperlinks, the data available for recommender systems are becoming more and more abundant. These data, called implicit feedback, keep the sequential order of actions. It is in this context that sequence-aware recommender systems have emerged. Their goal is to combine user preference (long-term users' profiles) and sequential dynamics (short-term tendencies) in order to recommend next actions to a user. In this thesis, we investigate sequential recommendation that aims to predict the user's next item/action from implicit feedback. Our main contribution is REBUS, a new metric embedding model, where only items are projected to integrate and unify user preferences and sequential dynamics. To capture sequential dynamics, REBUS uses frequent sequences in order to provide personalized order Markov chains. We have carried out extensive experiments and demonstrate that our method outperforms state-of-the-art models, especially on sparse datasets. Moreover, we share our experience on the implementation and the integration of REBUS in myCADservices, a collaborative platform of the French company Visiativ. We also propose methods to explain the recommendations provided by recommender systems, in the research line of explainable AI that has received a lot of attention recently. Despite the ubiquity of recommender systems, only a few researchers have attempted to explain the recommendations according to user input. However, being able to explain a recommendation would help increase the confidence that a user can have in a recommender system. Hence, we propose a method based on subgroup discovery that provides interpretable explanations of a recommendation for models that use implicit feedback.
Bouaziz, Mohamed. "Réseaux de neurones récurrents pour la classification de séquences dans des flux audiovisuels parallèles." Thesis, Avignon, 2017. http://www.theses.fr/2017AVIG0224/document.
In the same way as TV channels, data streams are represented as a sequence of successive events that can exhibit chronological relations (e.g. a series of programs, scenes, etc.). For a targeted channel, broadcast programming follows the rules defined by the channel itself, but can also be affected by the programming of competing ones. In such conditions, event sequences of parallel streams could provide additional knowledge about the events of a particular stream. In the sphere of machine learning, various methods suited to processing sequential data have been proposed. Long Short-Term Memory (LSTM) Recurrent Neural Networks have proven their worth in many applications dealing with this type of data. Nevertheless, these approaches are designed to handle only a single input sequence at a time. The main contribution of this thesis is the development of approaches that jointly process sequential data derived from multiple parallel streams. The application task of our work, carried out in collaboration with the computer science laboratory of Avignon (LIA) and the EDD company, seeks to predict the genre of a telecast. This prediction can be based on the histories of previous telecast genres in the same channel but also on those belonging to other parallel channels. We propose a telecast genre taxonomy adapted to such automatic processes as well as a dataset containing the parallel history sequences of 4 French TV channels. Two original methods are proposed in this work in order to take into account parallel stream sequences. The first one, namely the Parallel LSTM (PLSTM) architecture, is an extension of the LSTM model. PLSTM simultaneously processes each sequence in a separate recurrent layer and sums the outputs of each of these layers to produce the final output. The second approach, called MSE-SVM, takes advantage of both LSTM and Support Vector Machines (SVM) methods. Firstly, latent feature vectors are independently generated for each input stream, using the output event of the main one. These new representations are then merged and fed to an SVM algorithm. The PLSTM and MSE-SVM approaches proved their ability to integrate parallel sequences by outperforming, respectively, the LSTM and SVM models that only take into account the sequences of the main stream. The two proposed approaches take advantage of the information contained in long sequences. However, they have difficulty dealing with short ones. Though MSE-SVM generally outperforms the PLSTM approach, the problem experienced with short sequences is more pronounced for MSE-SVM. Finally, we propose to extend this approach by feeding additional information related to each event in the input sequences (e.g. the weekday of a telecast). This extension, named AMSE-SVM, behaves remarkably better with short sequences without affecting the performance when processing long ones.
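The PLSTM idea described in this abstract (one recurrent layer per stream, outputs summed before the final prediction) can be sketched as a forward pass. This is a rough illustration under several assumptions, not the author's architecture: the weights are random and untrained, bias terms are omitted, the summed hidden vectors are mapped to genre probabilities by a single linear layer with softmax, and the genre count, hidden size, and histories are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


class LSTMLayer:
    """Single LSTM layer without bias terms, randomly initialised (untrained)."""

    def __init__(self, n_inputs, n_hidden, rng):
        self.n_hidden = n_hidden
        # One weight matrix per gate: input (i), forget (f), output (o), candidate (g).
        self.weights = {
            gate: [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + n_hidden)]
                   for _ in range(n_hidden)]
            for gate in "ifog"
        }

    def run(self, sequence):
        """Return the last hidden state after reading the whole sequence."""
        hidden = [0.0] * self.n_hidden
        cell = [0.0] * self.n_hidden
        for x in sequence:
            z = list(x) + hidden
            pre = {gate: [sum(w * v for w, v in zip(row, z)) for row in rows]
                   for gate, rows in self.weights.items()}
            cell = [sigmoid(f) * c + sigmoid(i) * math.tanh(g)
                    for f, c, i, g in zip(pre["f"], cell, pre["i"], pre["g"])]
            hidden = [sigmoid(o) * math.tanh(c) for o, c in zip(pre["o"], cell)]
        return hidden


def parallel_forward(layers, output_weights, streams):
    """Run each stream through its own layer, sum the outputs, then classify."""
    per_stream = [layer.run(seq) for layer, seq in zip(layers, streams)]
    summed = [sum(values) for values in zip(*per_stream)]
    scores = [sum(w * v for w, v in zip(row, summed)) for row in output_weights]
    return softmax(scores)


N_GENRES, N_HIDDEN = 4, 3
rng = random.Random(0)
# Hypothetical genre histories for 4 parallel channels (lengths may differ).
histories = [[0, 2, 1, 3], [1, 1, 0], [3, 0, 2, 2, 1], [2, 3]]
streams = [[one_hot(g, N_GENRES) for g in h] for h in histories]
layers = [LSTMLayer(N_GENRES, N_HIDDEN, rng) for _ in streams]
output_weights = [[rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
                  for _ in range(N_GENRES)]
probabilities = parallel_forward(layers, output_weights, streams)
```

Because each stream keeps its own layer, the histories need not have the same length; only the hidden size must match so that the outputs can be summed.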
Almuhisen, Feda. "Leveraging formal concept analysis and pattern mining for moving object trajectory analysis." Thesis, Aix-Marseille, 2018. http://www.theses.fr/2018AIXM0738/document.
This dissertation presents a trajectory analysis framework, which includes both a preprocessing phase and a trajectory mining process. Furthermore, the framework offers visual functions that reflect the evolution behavior of trajectory patterns. The originality of the mining process is to leverage frequent emergent pattern mining and formal concept analysis for moving object trajectories. These methods detect and characterize pattern evolution behaviors bound to time in trajectory data. Three contributions are proposed: (1) a method for analyzing trajectories based on frequent formal concepts is used to detect how different trajectory patterns evolve over time. These behaviors are "latent", "emerging", "decreasing", "lost" and "jumping". They characterize the dynamics of mobility related to urban spaces and time. The detected behaviors are automatically visualized on generated maps with different spatio-temporal levels to refine the analysis of mobility in a given area of the city; (2) a second trajectory analysis framework based on sequential concept lattice extraction is also proposed to exploit the movement direction in the evolution detection process; and (3) a prediction method based on Markov chains is presented to predict the evolution behavior of a region in a future period. These three methods are evaluated on two real-world datasets. The experimental results obtained from these data show the relevance of the proposal and the utility of the generated maps.
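The third contribution, predicting a region's next evolution behavior with a Markov chain, can be sketched as follows. This is a minimal first-order chain with maximum-likelihood transition estimates. The abstract does not specify the order of the chain or the estimation procedure, so both are assumptions, as are the function names and the observed history.

```python
from collections import Counter, defaultdict


def fit_transitions(history):
    """Estimate first-order transition probabilities from a behavior sequence."""
    counts = defaultdict(Counter)
    for current, following in zip(history, history[1:]):
        counts[current][following] += 1
    transitions = {}
    for state, row in counts.items():
        total = sum(row.values())
        transitions[state] = {nxt: c / total for nxt, c in row.items()}
    return transitions


def predict_next(transitions, current):
    """Most probable next behavior, or None if the state was never left."""
    row = transitions.get(current)
    if not row:
        return None
    return max(row, key=row.get)


# Hypothetical behavior of one region over successive time periods.
history = [
    "latent", "emerging", "decreasing", "lost", "latent",
    "emerging", "decreasing", "latent", "emerging", "jumping",
]
transitions = fit_transitions(history)
forecast = predict_next(transitions, "emerging")
```

With this history, "emerging" is followed by "decreasing" in two of its three occurrences, so the forecast for a region currently "emerging" is "decreasing". A state observed only at the end of the history, such as "jumping" here, has no outgoing estimate and yields None.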
Labidi, Mouchira. "Optimisation de chaufferies collectives multi-energies : dimensionnement et commande de systèmes de stockage thermique par hydro-accumulation." Thesis, Perpignan, 2015. http://www.theses.fr/2015PERP0007.
The present work deals with optimizing a multi-energy district boiler by adding a thermal water storage tank to the plant. The effectiveness of such a system depends on how long the stored energy can be kept without considerable degradation. The storage tank should be properly insulated to reduce the rate of heat loss. Thus, firstly, a stratified water thermal storage model is developed and experimentally validated. A parametric study is carried out to determine the influence of geometric and meteorological parameters on heat loss. Next, a reliable sizing method based on a sequential management strategy and a parametric study is proposed. Various energy and economic criteria have been evaluated for a range of thermal storage sizes. The proposed methodology has been applied to many plants managed by Cofely GDF-Suez, our industrial partner. Results highlight the ability of a thermal storage tank (optimally sized and managed) to improve the operation of a multi-energy district boiler and realize significant energy and economic savings. The main drawback of the proposed sequential management strategy lies in not taking into account the future power demand. That is why a strategy based on a Model Predictive Controller (MPC) is likely to improve operation and performance. In order to implement such a controller, the power demand has to be accurately forecasted. As a consequence, a short-term forecast method, based on wavelet-based Multi-Resolution Analysis (MRA) and multilayer Artificial Neural Networks (ANN), is proposed. Both the sequential and the predictive strategies are applied to a multi-energy district boiler in northeastern France, selected as a case study. The main result to retain is that the efficiency of a water thermal storage tank is mainly related to its design and the way it is managed. For this case study, the predictive strategy is more reliable regardless of the size of the storage tank. Furthermore, in all cases an adequately sized and managed thermal storage tank is a profitable investment. It allows fossil energy consumption to be significantly reduced. The same remark applies to the operating costs and CO2 emissions.
Almuhisen, Feda. "Leveraging formal concept analysis and pattern mining for moving object trajectory analysis." Electronic Thesis or Diss., Aix-Marseille, 2018. http://www.theses.fr/2018AIXM0738.
Bubeck, Sébastien. "JEUX DE BANDITS ET FONDATIONS DU CLUSTERING." Phd thesis, Université des Sciences et Technologie de Lille - Lille I, 2010. http://tel.archives-ouvertes.fr/tel-00845565.