Dissertations / Theses on the topic 'Stochastic modelling theory'

Consult the top 33 dissertations / theses for your research on the topic 'Stochastic modelling theory.'

1

Li, Guangquan. "Stochastic modelling of carcinogenesis : theory and application." Thesis, Imperial College London, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.486378.

Abstract:
Cancer is a group of diseases characterized by autonomous, uncontrollable cell proliferation, evasion of cell death, self-construction of oxygen and nutrient supply and spreading of cancerous cells through metastasis. It is vital to elucidate pathways that underlie the cancer process. Mechanistic cancer models, to this extent, attempt to relate the occurrence of malignant neoplasm to diverse risk factors such as genetic alterations, susceptibility of individuals and exogenous and endogenous carcinogenic exposures. The main objectives of this thesis are to examine the validity of the two-hit hypothesis for retinoblastoma (Knudson, 1971, PNAS (68) 820-23), to unify multiple types of genomic instability and to further assess the role genomic instability plays in the process of carcinogenesis. I shall also explore characteristics of existing mechanistic cancer models. This thesis begins with a survey of basic cancer biology and existing mechanistic models. Utilizing a fully stochastic two-stage clonal expansion model, the thesis specifically assesses the validity of the two-hit theory for retinoblastoma (RB), a childhood ocular malignancy. Comparison of fits of a variety of models (in particular those with up to three mutations) to a population-based RB dataset demonstrates the superior fit of the two-stage model to others. This result strongly suggests both the necessity and sufficiency of the two RB1 mutations to initiate RB and hence validates the two-hit theory. The thesis goes on to develop a comprehensive framework to incorporate multiple types of genomic instability, characterized by numerous numerical and structural damages exhibited in the cancer cell genome. This generalized model embraces most, if not all, of the existing MVK-type models. Specific forms of the model are fitted to U.S. white American colon cancer incidence data. Based on comparison of fits to the population-based data, there is little evidence to support the hypothesis that models with more than one type of genomic instability fit better than those with a single type of genomic instability. Since the age-specific incidence data may not possess sufficient information for model discrimination, further investigation is required. The remainder of this thesis is concerned with two theoretical aspects. To facilitate a Bayesian implementation for data fitting, a flexible blocking algorithm is developed. In the presence of parameter correlation, the algorithm considerably improves the performance of the Markov chain Monte Carlo simulations. In addition, following an approach similar to that of Heidenreich et al. (1997, Risk Anal. (17) 391-399), the maximum number of identifiable parameters in the proposed cancer model with r types of genomic instability is shown to be r+1 less than the number of biologically-based parameters.
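As an aside for readers unfamiliar with MVK-type models, the two-stage clonal expansion process lends itself to a short stochastic simulation. The sketch below is a minimal illustration under assumed rate constants (mu1, alpha, beta and mu2 are hypothetical, not the thesis's fitted values): normal cells acquire a first mutation, intermediate cells divide, die, or suffer the second hit, and onset is the appearance of the first malignant cell.

```python
import random

def two_stage_onset(N=1e7, mu1=1e-7, alpha=0.10, beta=0.09, mu2=1e-6,
                    t_max=80.0, rng=None):
    """Gillespie simulation of a two-stage clonal expansion model.

    Returns the age (years) at which the first malignant cell appears,
    or None if no malignancy arises before t_max. All rate constants
    are hypothetical, chosen for illustration only.
    """
    rng = rng or random.Random(1)
    t, I = 0.0, 0  # time, number of intermediate (one-hit) cells
    while t < t_max:
        r_init = mu1 * N     # normal cell acquires the first mutation
        r_div = alpha * I    # intermediate cell division
        r_death = beta * I   # intermediate cell death
        r_malig = mu2 * I    # second mutation: malignant transformation
        total = r_init + r_div + r_death + r_malig
        t += rng.expovariate(total)       # waiting time to next event
        u = rng.uniform(0.0, total)       # select which event fires
        if u < r_init + r_div:
            I += 1                        # initiation or clonal expansion
        elif u < r_init + r_div + r_death:
            I -= 1
        else:
            return t                      # first malignant cell
    return None

# Empirical onset ages across a simulated cohort (None = no cancer)
onsets = [two_stage_onset(rng=random.Random(i)) for i in range(500)]
```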
2

Zhao, Z. "Integration of neural and stochastic modelling techniques for speech recognition." Thesis, University of Essex, 1992. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.305954.

3

Malik, Sheheryar. "Essays in time series analysis : modelling stochastic volatility and forecast evaluation." Thesis, University of Warwick, 2009. http://wrap.warwick.ac.uk/2306/.

4

Taleb, B. "The theory and design of a stochastic reliability simulator for large scale systems." Thesis, Open University, 1988. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.383689.

5

Hopf, Craig. "Exchange traded horserace betting fund with deterministic payoff – a mathematical analysis of a profitable deterministic horserace betting model." Thesis, Griffith University, 2014. http://hdl.handle.net/10072/366832.

Abstract:
The horserace betting market is a subset of the financial market space, and wagering typically inherits a defined return-to-risk trade-off. For horserace betting input into an institutional portfolio to be plausible, the payoff-to-risk trade-off from betting must be acceptable for the fund when compared with the return-risk trade-off from the mainstream assets already included in portfolio investment. A new paradigm for horserace betting modelling and investing is proposed in this thesis as a premise for betting input into an institutional portfolio. An exchange traded betting fund is developed that is able to generate pre-race (and within-race) investment arbitrage offering an acceptable, defined return-risk trade-off for the risk-averse investor. The extensive earlier stochastic modelling theory of the horserace betting market, which forecasts racers' expected outcomes and payoffs, is succeeded here by research that develops a deterministic horserace betting model (and algorithm) generating defined payoff for the fund. This deterministic betting model challenges the existing semi-strong efficient market hypothesis toward horserace betting, namely that no betting strategy consistently outperforms the financial market's benchmark return. Subsequently, the primary research (alternative) hypothesis tested is H_a: a profitable exchange traded horserace betting fund with deterministic payoff exists for acceptable institutional portfolio investment.
Thesis (Masters)
Master of Philosophy (MPhil)
Griffith School of Environment
Science, Environment, Engineering and Technology
6

Shang, Xiaocheng. "Extended stochastic dynamics : theory, algorithms, and applications in multiscale modelling and data science." Thesis, University of Edinburgh, 2016. http://hdl.handle.net/1842/20422.

Abstract:
This thesis addresses the sampling problem in a high-dimensional space, i.e., the computation of averages with respect to a defined probability density that is a function of many variables. Such sampling problems arise in many application areas, including molecular dynamics, multiscale models, and Bayesian sampling techniques used in emerging machine learning applications. Of particular interest are thermostat techniques, in the setting of a stochastic-dynamical system, that preserve the canonical Gibbs ensemble defined by an exponentiated energy function. In this thesis we explore theory, algorithms, and numerous applications in this setting. We begin by comparing numerical methods for particle-based models. The class of methods considered includes dissipative particle dynamics (DPD) as well as a newly proposed stochastic pairwise Nosé-Hoover-Langevin (PNHL) method. Splitting methods are developed and studied in terms of their thermodynamic accuracy, two-point correlation functions, and convergence. When computational efficiency is measured by the ratio of thermodynamic accuracy to CPU time, we report significant advantages in simulation for the PNHL method compared to popular alternative schemes in the low-friction regime, without degradation of convergence rate. We propose a pairwise adaptive Langevin (PAdL) thermostat that fully captures the dynamics of DPD and thus can be directly applied in the setting of momentum-conserving simulation. These methods are potentially valuable for nonequilibrium simulation of physical systems. We again report substantial improvements in both equilibrium and nonequilibrium simulations compared to popular schemes in the literature. We also discuss the proper treatment of the Lees-Edwards boundary conditions, an essential part of modelling shear flow. We also study numerical methods for sampling probability measures in high dimension where the underlying model is only approximately identified with a gradient system. These methods are important in multiscale modelling and in the design of new machine learning algorithms for inference and parameterization for large datasets, challenges which are increasingly important in "big data" applications. In addition to providing a more comprehensive discussion of the foundations of these methods, we propose a new numerical method for the adaptive Langevin/stochastic gradient Nosé-Hoover thermostat that achieves a dramatic improvement in numerical efficiency over the most popular stochastic gradient methods reported in the literature. We demonstrate that the newly established method inherits a superconvergence property (fourth order convergence to the invariant measure for configurational quantities) recently demonstrated in the setting of Langevin dynamics. Furthermore, we propose a covariance-controlled adaptive Langevin (CCAdL) thermostat that can effectively dissipate parameter-dependent noise while maintaining a desired target distribution. The proposed method achieves a substantial speedup over popular alternative schemes for large-scale machine learning applications.
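For orientation, the adaptive Langevin/stochastic gradient Nosé-Hoover thermostat mentioned above is often written as a three-variable update in which a thermostat variable xi adapts to the observed kinetic energy. A minimal sketch, assuming a unit-temperature formulation in the style of Ding et al. (2014) rather than the thesis's exact scheme:

```python
import numpy as np

def sgnht(grad_u, theta0, h=1e-2, A=1.0, n_steps=10_000, seed=0):
    """Stochastic gradient Nose-Hoover thermostat (adaptive Langevin).

    grad_u : callable returning the (possibly noisy) gradient of U.
    A      : amplitude of the injected noise; the thermostat variable
             xi adapts so the kinetic energy matches its target even
             when the gradient noise level is unknown.
    """
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    d = theta.size
    p = rng.standard_normal(d)            # momenta
    xi = A                                # thermostat variable
    samples = np.empty((n_steps, d))
    for i in range(n_steps):
        p += -xi * p * h - grad_u(theta) * h \
             + np.sqrt(2.0 * A * h) * rng.standard_normal(d)
        theta += p * h
        xi += (p @ p / d - 1.0) * h       # kinetic-energy feedback
        samples[i] = theta
    return samples

# Usage: with U(theta) = |theta|^2 / 2 the target is a standard Gaussian,
# so the empirical moments of `draws` can be checked directly.
draws = sgnht(lambda th: th, theta0=np.zeros(2))
```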
7

Szekely, Tamas. "Stochastic modelling and simulation in cell biology." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:f9b8dbe6-d96d-414c-ac06-909cff639f8c.

Abstract:
Modelling and simulation are essential to modern research in cell biology. This thesis follows a journey starting from the construction of new stochastic methods for discrete biochemical systems to using them to simulate a population of interacting haematopoietic stem cell lineages. The first part of this thesis is on discrete stochastic methods. We develop two new methods, the stochastic extrapolation framework and the Stochastic Bulirsch-Stoer methods. These are based on the Richardson extrapolation technique, which is widely used in ordinary differential equation solvers. We believed that it would also be useful in the stochastic regime, and this turned out to be true. The stochastic extrapolation framework is a scheme that admits any stochastic method with a fixed stepsize and known global error expansion. It can improve the weak order of the moments of these methods by cancelling the leading terms in the global error. Using numerical simulations, we demonstrate that this is the case up to second order, and postulate that this also follows for higher order. Our simulations show that extrapolation can greatly improve the accuracy of a numerical method. The Stochastic Bulirsch-Stoer method is another highly accurate stochastic solver. Furthermore, using numerical simulations we find that it is able to better retain its high accuracy for larger timesteps than competing methods, meaning it remains accurate even when simulation time is speeded up. This is a useful property for simulating the complex systems that researchers are often interested in today. The second part of the thesis is concerned with modelling a haematopoietic stem cell system, which consists of many interacting niche lineages. We use a vectorised tau-leap method to examine the differences between a deterministic and a stochastic model of the system, and investigate how coupling niche lineages affects the dynamics of the system at the homeostatic state as well as after a perturbation. We find that larger coupling allows the system to find the optimal steady state blood cell levels. In addition, when the perturbation is applied randomly to the entire system, larger coupling also results in smaller post-perturbation cell fluctuations compared to non-coupled cells. In brief, this thesis contains four main sets of contributions: two new high-accuracy discrete stochastic methods that have been numerically tested; an improvement, usable with any leaping method, that introduces vectorisation along with a common stepsize-adaptation scheme; and an investigation of the effects of coupling lineages in a heterogeneous population of haematopoietic stem cell niche lineages.
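The Richardson-extrapolation idea behind the stochastic extrapolation framework can be indicated in a few lines. Assuming a fixed-step method whose weak error in the mean is c·h + O(h²) (the tau-leap scheme and linear birth-death rates below are illustrative choices, not the thesis's test problems), combining runs at step sizes h and h/2 cancels the leading error term:

```python
import numpy as np

rng = np.random.default_rng(42)

def tau_leap_mean(x0, birth, death, T, h, n_paths=20_000):
    """Mean of a linear birth-death process at time T via fixed-step tau-leaping."""
    x = np.full(n_paths, float(x0))
    for _ in range(int(round(T / h))):
        x += rng.poisson(birth * x * h) - rng.poisson(death * x * h)
        x = np.maximum(x, 0.0)            # populations stay nonnegative
    return x.mean()

x0, birth, death, T, h = 50, 1.0, 1.2, 1.0, 0.1
m_h = tau_leap_mean(x0, birth, death, T, h)
m_h2 = tau_leap_mean(x0, birth, death, T, h / 2)
extrapolated = 2.0 * m_h2 - m_h           # cancels the O(h) weak-error term
exact = x0 * np.exp((birth - death) * T)  # E[X_T] for linear rates
```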
8

Jin, Lei. "Particle systems and SPDEs with application to credit modelling." Thesis, University of Oxford, 2010. http://ora.ox.ac.uk/objects/uuid:07b29609-6941-4aa9-b4bc-29e7b4821b82.

9

Wright, Gordon Gilbert. "An empirical analysis and stochastic modelling of aggregate demand behaviour in a spare parts inventory system." Thesis, City University London, 1991. http://openaccess.city.ac.uk/7728/.

Abstract:
The focus of the work here was an empirical analysis of the aggregate independent demand behaviour for spare parts inventories, principally in the automotive industry. In particular, using the pioneering work of R. G. Brown (1959), who showed that inventory usage values are often lognormally distributed, we set out and developed models that go some considerable way towards explaining the underlying stochastic basis for this phenomenon, why it occurs and some limiting conditions. The justification for this approach was on the grounds that by providing a more fundamental understanding of the underlying stochastic processes that explain the emergent aggregate demand behaviour, a sound starting point would be provided for developing more sophisticated analytical ways to view an inventory range, as a total entity, for planning and control purposes. The analysis was based on extensive data collected from the DAF Trucks (GB) Ltd. spare parts systems spanning the period 1975 to 1986, together with supporting studies from a number of other systems. The analysis showed that in the systems studied spare parts prices are lognormally distributed, and that this is most likely the result of a stochastic process known as the 'theory of breakage'. Analysis also showed that in the DAF Trucks case aggregate demand volumes in very short time periods are distributed as a combined Log Series/Negative Binomial distribution (LSD/NBD). The combined LSD/NBD model of aggregate demand volumes is itself fully explained by a stochastic model known as the Arfwedson model, which in turn is derived from more elementary conditions based on the Poisson process. We then demonstrated that if these short-period aggregate demand distributions are cumulated period by period, they converge to a lognormal distribution as the stable long-run model of aggregate demand volumes. As a result of the lognormality of prices and volumes, the resultant inventory usage values are also lognormal. Furthermore, from insight into the underlying factors that explain the lognormality, we have identified the factors and variables that govern the values of the parameters of the particular lognormal models of usage values. The research protocol used in this work incorporated the law-verifying process known as 'retroduction', after the work and discussions of Yuji Ijiri and Herbert Simon (1977); and to a lesser extent we utilised simulation for validation and verification of the derived models. From the proven lognormality of demand volumes and usage values, we have demonstrated that a number of related key inventory factors are also lognormal, in particular inventory-item turnover rates. Furthermore, our conclusions show that some standard inventory performance measures, such as the inventory-wide 'stock turnover rate' and the 'stock to sales' ratio, are poor measures to use in the case of highly skewed inventory variables. Finally, we have suggested several potentially fruitful areas for developing improved methods of monitoring inventory performance in a variety of circumstances.
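The 'theory of breakage' invoked above is a multiplicative-growth argument (the law of proportionate effect): a quantity subjected to many independent proportional shocks tends to lognormality, because its logarithm is a sum of independent terms. A minimal numerical illustration, with arbitrary shock distributions and counts:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Every item's price starts equal and is then hit by many independent
# proportional shocks; the shock range and count are arbitrary choices.
n_items, n_shocks = 10_000, 200
factors = rng.uniform(0.9, 1.1, size=(n_items, n_shocks))
prices = 100.0 * factors.prod(axis=1)

# log(price) is a sum of 200 i.i.d. terms, hence approximately normal
# (central limit theorem), i.e. the prices are approximately lognormal.
log_prices = np.log(prices)
skewness = stats.skew(log_prices)          # close to 0 for normal data
excess_kurtosis = stats.kurtosis(log_prices)
```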
10

Franz, Benjamin. "Recent modelling frameworks for systems of interacting particles." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:ac76d159-4cdd-40c9-b378-6ea1faf48aed.

Abstract:
In this thesis we study three different modelling frameworks for biological systems of dispersal and combinations thereof. The three frameworks involved are individual-based models, group-level models in the form of partial differential equations (PDEs) and robot swarms. In the first two chapters of the thesis, we present ways of coupling individual-based models with PDEs in so-called hybrid models, with the aim of achieving improved performance of simulations. Two classes of such hybrid models are discussed that allow an efficient simulation of multi-species systems of dispersal with reactions, but involve individual resolution for certain species and in certain parts of a computational domain if desired. We generally consider two types of example systems: bacterial chemotaxis and reaction-diffusion systems, and present results in the respective application area as well as general methods. The third chapter of this thesis introduces swarm robotic experiments as an additional tool to study systems of dispersal. In general, those experiments can be used to mimic animal behaviour and to study the impact of local interactions on the group-level dynamics. We concentrate on a target-finding problem for groups of robots. We present how PDE descriptions can be adjusted to incorporate the finite turning times observed in the robotic system and show that the adjusted models match well with experimental data. In the fourth and last chapter, we consider interactions between robots in the form of hard-sphere collisions and again derive adjusted PDE descriptions. We show that collisions have a significant impact on the speed with which the group spreads across a domain. Throughout these two chapters, we apply a combination of experiments, individual-based simulations and PDE descriptions to improve our understanding of interactions in systems of dispersal.
11

Shabala, Alexander. "Mathematical modelling of oncolytic virotherapy." Thesis, University of Oxford, 2013. http://ora.ox.ac.uk/objects/uuid:cca2c9bc-cbd4-4651-9b59-8a4dea7245d1.

Abstract:
This thesis is concerned with mathematical modelling of oncolytic virotherapy: the use of genetically modified viruses to selectively spread, replicate and destroy cancerous cells in solid tumours. Traditional spatially-dependent modelling approaches have previously assumed that virus spread is due to viral diffusion in solid tumours, and have also neglected the time delay introduced by the lytic cycle for viral replication within host cells. A deterministic, age-structured reaction-diffusion model is developed for the spatially-dependent interactions of uninfected cells, infected cells and virus particles, with the spread of virus particles facilitated by infected cell motility and delay. Evidence of travelling wave behaviour is shown, and an asymptotic approximation for the wave speed is derived as a function of key parameters. Next, the same physical assumptions as in the continuum model are used to develop an equivalent discrete, probabilistic model that is valid in the limit of low particle concentrations. This mesoscopic, compartment-based model is then validated against known test cases, and it is shown that the localised nature of infected cell bursts leads to inconsistencies between the discrete and continuum models. The qualitative behaviour of this stochastic model is then analysed for a range of key experimentally-controllable parameters. Two-dimensional simulations of in vivo and in vitro therapies are then analysed to determine the effects of virus burst size, length of lytic cycle, infected cell motility, and initial viral distribution on the wave speed, consistency of results and overall success of therapy. Finally, the experimental difficulty of measuring the effective motility of cells is addressed by considering effective medium approximations of diffusion through heterogeneous tumours. Considering an idealised tumour consisting of periodic obstacles in free space, a two-scale homogenisation technique is used to show the effects of obstacle shape on the effective diffusivity. A novel method for calculating the effective continuum behaviour of random walks on lattices is then developed for the limiting case where microscopic interactions are discrete.
12

Schwarz, Daniel Christopher. "Price modelling and asset valuation in carbon emission and electricity markets." Thesis, University of Oxford, 2012. http://ora.ox.ac.uk/objects/uuid:7de118d2-a61b-4125-a615-29ff82ac7316.

Abstract:
This thesis is concerned with the mathematical analysis of electricity and carbon emission markets. We introduce a novel, versatile and tractable stochastic framework for the joint price formation of electricity spot prices and allowance certificates. In the proposed framework electricity and allowance prices are explained as functions of specific fundamental factors, such as the demand for electricity and the prices of the fuels used for its production. As a result, the proposed model very clearly captures the complex dependency of the modelled prices on the aforementioned fundamental factors. The allowance price is obtained as the solution to a coupled forward-backward stochastic differential equation. We provide a rigorous proof of the existence and uniqueness of a solution to this equation and analyse its behaviour using asymptotic techniques. The essence of the model for the electricity price is a carefully chosen and explicitly constructed function representing the supply curve in the electricity market. The model we propose accommodates most regulatory features that are commonly found in implementations of emissions trading systems and we analyse in detail the impact these features have on the prices of allowance certificates. Thereby we reveal a weakness in existing regulatory frameworks, which, in rare cases, can lead to allowance prices that do not conform with the conditions imposed by the regulator. We illustrate the applicability of our model to the pricing of derivative contracts, in particular clean spread options and numerically illustrate its ability to "see" relationships between the fundamental variables and the option contract, which are usually unobserved by other commonly used models in the literature. The results we obtain constitute flexible tools that help to efficiently evaluate the financial impact current or future implementations of emissions trading systems have on participants in these markets.
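Schematically, a coupled forward-backward system of the kind described takes a form like the following (a simplified textbook shape under assumed notation, not necessarily the thesis's exact equations): a forward process for cumulative emissions E, whose drift depends on the allowance price A, coupled to a backward equation for A fixed by the terminal penalty:

```latex
% Forward: cumulative emissions E, whose drift depends on the price A
dE_t \;=\; \mu(t, E_t, A_t)\, dt + \sigma(t, E_t)\, dW_t , \qquad E_0 = e_0 .

% Backward: allowance price A, fixed by the terminal penalty
dA_t \;=\; Z_t \, dW_t , \qquad A_T \;=\; \pi \, \mathbf{1}_{\{E_T \ge \Gamma\}} .
```

Here π is the non-compliance penalty, Γ the emissions cap, and Z_t the martingale integrand to be solved for alongside A.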
13

Tran, The Truyen. "On conditional random fields: applications, feature selection, parameter estimation and hierarchical modelling." Thesis, Curtin University, 2008. http://hdl.handle.net/20.500.11937/436.

Abstract:
There has been a growing interest in stochastic modelling and learning with complex data, whose elements are structured and interdependent. One of the most successful methods to model data dependencies is graphical models, which combine graph theory and probability theory. This thesis focuses on a special type of graphical models known as Conditional Random Fields (CRFs) (Lafferty et al., 2001), in which the output state spaces, when conditioned on some observational input data, are represented by undirected graphical models. The contributions of this thesis involve both (a) broadening the current applicability of CRFs in the real world and (b) deepening the understanding of theoretical aspects of CRFs. On the application side, we empirically investigate the applications of CRFs in two real world settings. The first application is on a novel domain of Vietnamese accent restoration, in which we need to restore accents of an accent-less Vietnamese sentence. Experiments on half a million sentences of news articles show that the CRF-based approach is highly accurate. In the second application, we develop a new CRF-based movie recommendation system called Preference Network (PN). The PN jointly integrates various sources of domain knowledge into a large and densely connected Markov network. We obtained competitive results against well-established methods in the recommendation field. On the theory side, the thesis addresses three important theoretical issues of CRFs: feature selection, parameter estimation and modelling recursive sequential data. These issues are all addressed under a general setting of partial supervision, in which training labels are not fully available. For feature selection, we introduce a novel learning algorithm called AdaBoost.CRF that incrementally selects features out of a large feature pool as learning proceeds. AdaBoost.CRF is an extension of the standard boosting methodology to structured and partially observed data. We demonstrate that AdaBoost.CRF is able to eliminate irrelevant features and, as a result, returns a very compact feature set without significant loss of accuracy. Parameter estimation of CRFs is generally intractable in arbitrary network structures. This thesis contributes to this area by proposing a learning method called AdaBoost.MRF (which stands for AdaBoosted Markov Random Forests). As learning proceeds, AdaBoost.MRF incrementally builds a tree ensemble (a forest) that covers the original network by selecting one spanning tree at a time. As a result, we can approximately learn many rich classes of CRFs in linear time. The third theoretical work is on modelling recursive, sequential data, in which each level of resolution is a Markov sequence and each state in the sequence is also a Markov sequence at the finer grain. One of the key contributions of this thesis is Hierarchical Conditional Random Fields (HCRF), which is an extension to the currently popular sequential CRF and the recent semi-Markov CRF (Sarawagi and Cohen, 2004). Unlike previous CRF work, the HCRF does not assume any fixed graphical structures.
Central to our contribution in HCRF is a polynomial-time algorithm based on the Asymmetric Inside Outside (AIO) family developed in (Bui et al., 2004) for learning and inference. Another important contribution is to extend the AIO family to address learning with missing data and inference under partially observed labels. We also derive methods to deal with practical concerns associated with the AIO family, including numerical overflow and cubic-time complexity. Finally, we demonstrate good performance of HCRF against rivals on two applications: indoor video surveillance and noun-phrase chunking.
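For reference, the linear-chain CRF at the core of this line of work defines the conditional distribution of a label sequence y given observations x in the standard form of Lafferty et al. (2001), with feature functions f_k and weights λ_k:

```latex
p(\mathbf{y} \mid \mathbf{x})
  \;=\; \frac{1}{Z(\mathbf{x})}
  \exp\!\Bigg( \sum_{t} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, \mathbf{x}, t) \Bigg),
\qquad
Z(\mathbf{x}) \;=\; \sum_{\mathbf{y}'}
  \exp\!\Bigg( \sum_{t} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, \mathbf{x}, t) \Bigg).
```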
14

Rosser, Gabriel A. "Mathematical modelling and analysis of aspects of bacterial motility." Thesis, University of Oxford, 2012. http://ora.ox.ac.uk/objects/uuid:1af98367-aa2f-4af3-9344-8c361311b553.

Abstract:
The motile behaviour of bacteria underlies many important aspects of their actions, including pathogenicity, foraging efficiency, and ability to form biofilms. In this thesis, we apply mathematical modelling and analysis to various aspects of the planktonic motility of flagellated bacteria, guided by experimental observations. We use data obtained by tracking free-swimming Rhodobacter sphaeroides under a microscope, taking advantage of the availability of a large dataset acquired using a recently developed, high-throughput protocol. A novel analysis method using a hidden Markov model for the identification of reorientation phases in the tracks is described. This is assessed and compared with an established method using a computational simulation study, which shows that the new method has a reduced error rate and less systematic bias. We proceed to apply the novel analysis method to experimental tracks, demonstrating that we are able to successfully identify reorientations and record the angle changes of each reorientation phase. The analysis pipeline developed here is an important proof of concept, demonstrating a rapid and cost-effective protocol for the investigation of myriad aspects of the motility of microorganisms. In addition, we use mathematical modelling and computational simulations to investigate the effect that the microscope sampling rate has on the observed tracking data. This is an important, but often overlooked aspect of experimental design, which affects the observed data in a complex manner. Finally, we examine the role of rotational diffusion in bacterial motility, testing various models against the analysed data. This provides strong evidence that R. sphaeroides undergoes some form of active reorientation, in contrast to the mainstream belief that the process is passive.
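A toy version of the hidden-Markov segmentation described above can be put together with the hmmlearn library; the synthetic speed data and the two-state Gaussian observation model below are illustrative assumptions, not the thesis's actual observation model:

```python
import numpy as np
from hmmlearn import hmm

rng = np.random.default_rng(3)

# Synthetic track speeds: fast 'running' phases interrupted by a slow
# 'reorientation' phase (all parameter values are illustrative only).
speeds = np.concatenate([
    rng.normal(20.0, 3.0, 400),   # run
    rng.normal(4.0, 1.5, 60),     # reorientation
    rng.normal(20.0, 3.0, 400),   # run
]).reshape(-1, 1)

# Two-state Gaussian HMM: the hidden state plays the behavioural phase.
model = hmm.GaussianHMM(n_components=2, covariance_type="diag", n_iter=100)
model.fit(speeds)
states = model.predict(speeds)    # most likely phase at each time point
```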
15

Tran, The Truyen. "On conditional random fields: applications, feature selection, parameter estimation and hierarchical modelling." Curtin University of Technology, Dept. of Computing, 2008. http://espace.library.curtin.edu.au:80/R/?func=dbin-jump-full&object_id=18614.

Abstract:
There has been a growing interest in stochastic modelling and learning with complex data, whose elements are structured and interdependent. One of the most successful methods to model data dependencies is graphical models, which combine graph theory and probability theory. This thesis focuses on a special type of graphical models known as Conditional Random Fields (CRFs) (Lafferty et al., 2001), in which the output state spaces, when conditioned on some observational input data, are represented by undirected graphical models. The contributions of this thesis involve both (a) broadening the current applicability of CRFs in the real world and (b) deepening the understanding of theoretical aspects of CRFs. On the application side, we empirically investigate the applications of CRFs in two real world settings. The first application is on a novel domain of Vietnamese accent restoration, in which we need to restore accents of an accent-less Vietnamese sentence. Experiments on half a million sentences of news articles show that the CRF-based approach is highly accurate. In the second application, we develop a new CRF-based movie recommendation system called Preference Network (PN). The PN jointly integrates various sources of domain knowledge into a large and densely connected Markov network. We obtained competitive results against well-established methods in the recommendation field.
On the theory side, the thesis addresses three important theoretical issues of CRFs: feature selection, parameter estimation and modelling recursive sequential data. These issues are all addressed under a general setting of partial supervision, in which training labels are not fully available. For feature selection, we introduce a novel learning algorithm called AdaBoost.CRF that incrementally selects features out of a large feature pool as learning proceeds. AdaBoost.CRF is an extension of the standard boosting methodology to structured and partially observed data. We demonstrate that AdaBoost.CRF is able to eliminate irrelevant features and, as a result, returns a very compact feature set without significant loss of accuracy. Parameter estimation of CRFs is generally intractable in arbitrary network structures. This thesis contributes to this area by proposing a learning method called AdaBoost.MRF (which stands for AdaBoosted Markov Random Forests). As learning proceeds, AdaBoost.MRF incrementally builds a tree ensemble (a forest) that covers the original network by selecting one spanning tree at a time. As a result, we can approximately learn many rich classes of CRFs in linear time. The third theoretical work is on modelling recursive, sequential data, in which each level of resolution is a Markov sequence and each state in the sequence is also a Markov sequence at the finer grain. One of the key contributions of this thesis is Hierarchical Conditional Random Fields (HCRF), which is an extension to the currently popular sequential CRF and the recent semi-Markov CRF (Sarawagi and Cohen, 2004). Unlike previous CRF work, the HCRF does not assume any fixed graphical structures.
Rather, it treats structure as an uncertain aspect and it can estimate the structure automatically from the data. The HCRF is motivated by Hierarchical Hidden Markov Model (HHMM) (Fine et al., 1998). Importantly, the thesis shows that the HHMM is a special case of HCRF with slight modification, and the semi-Markov CRF is essentially a flat version of the HCRF. Central to our contribution in HCRF is a polynomial-time algorithm based on the Asymmetric Inside Outside (AIO) family developed in (Bui et al., 2004) for learning and inference. Another important contribution is to extend the AIO family to address learning with missing data and inference under partially observed labels. We also derive methods to deal with practical concerns associated with the AIO family, including numerical overflow and cubic-time complexity. Finally, we demonstrate good performance of HCRF against rivals on two applications: indoor video surveillance and noun-phrase chunking.
16

Komashie, Alexander. "Information-theoretic and stochastic methods for managing the quality of service and satisfaction in healthcare systems." Thesis, Brunel University, 2010. http://bura.brunel.ac.uk/handle/2438/4402.

Abstract:
This research investigates and develops a new approach to the management of service quality with the emphasis on patient and staff satisfaction in the healthcare sector. The challenge of measuring the quality of service in healthcare requires us to view the problem from multiple perspectives. At the philosophical level, the true nature of quality is still debated; at the psychological level, an accurate conceptual representation is problematic; whilst at the physical level, an accurate measurement of the concept still remains elusive to practitioners and academics. This research focuses on the problem of quality measurement in the healthcare sector. The contributions of this research are fourfold: Firstly, it argues that from the technological point of view the research to date into quality of service in healthcare has not considered methods of real-time measurement and monitoring. This research identifies the key elements that are necessary for developing a real-time quality monitoring system for the healthcare environment. Secondly, a unique index is proposed for the monitoring and improvement of healthcare performance using information-theoretic entropy formalism. The index is formulated based on five key performance indicators and was tested as a Healthcare Quality Index (HQI) based on three key quality indicators of dignity, confidence and communication in an Accident and Emergency department. Thirdly, using an M/G/1 queuing model and its underlying Little's Law, the concept of Effective Satisfaction in healthcare has been proposed. The concept is based on a Staff-Patient Satisfaction Relation Model (S-PSRM) developed using a patient satisfaction model and an empirically tested model developed for measuring staff satisfaction with workload (service time). The argument is presented that a synergy between patient satisfaction and staff satisfaction is the key to sustainable improvement in healthcare quality. The final contribution is the proposal of a Discrete Event Simulation (DES) modelling platform as a descriptive model that captures the random and stochastic nature of healthcare service provision process to prove the applicability of the proposed quality measurement models.
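Two standard ingredients named above can be stated explicitly. As a hedged sketch (the thesis's index may weight its indicators differently), a Shannon-entropy index over k quality indicators with observed proportions p_i, and Little's law relating mean occupancy L, arrival rate λ and mean time in system W:

```latex
H \;=\; -\sum_{i=1}^{k} p_i \ln p_i ,
\qquad\qquad
L \;=\; \lambda W .
```

The entropy H is maximal when the indicators contribute evenly, which is what makes it usable as a balance-sensitive performance index; Little's law is the identity underpinning the M/G/1 analysis.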
17

Psorakis, Ioannis. "Probabilistic inference in ecological networks : graph discovery, community detection and modelling dynamic sociality." Thesis, University of Oxford, 2013. http://ora.ox.ac.uk/objects/uuid:84741d8b-31ea-4eee-ae44-a0b7b5491700.

Abstract:
This thesis proposes a collection of analytical and computational methods for inferring an underlying social structure of a given population, observed only via timestamped occurrences of its members across a range of locations. It shows that such data streams have a modular and temporally-focused structure, neither fully ordered nor completely random, with individuals appearing in "gathering events". By exploiting such structure, the thesis proposes an appropriate mapping of those spatio-temporal data streams to a social network, based on the co-occurrences of agents across gathering events, while capturing the uncertainty over social ties via the use of probability distributions. Given the extracted graphs mentioned above, an approach is proposed for studying their community organisation. The method considers communities as explanatory variables for the observed interactions, producing overlapping partitions and node membership scores to groups. The aforementioned models are motivated by a large ongoing experiment at Wytham woods, Oxford, where a population of Parus major wild birds is tagged with RFID devices and a grid of feeding locations generates thousands of spatio-temporal records each year. The methods proposed are applied on such data set to demonstrate how they can be used to explore wild bird sociality, reveal its internal organisation across a variety of different scales and provide insights into important biological processes relating to mating pair formation.
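The mapping from timestamped records to a social network described above has a simple deterministic core: counting co-occurrences across gathering events. The thesis places probability distributions over ties; the sketch below keeps only raw counts, and the event data are invented:

```python
from collections import defaultdict
from itertools import combinations

# Gathering events: each is the set of tagged birds recorded together
# at a feeding station (toy data; real events come from RFID streams).
events = [
    {"b01", "b02", "b03"},
    {"b02", "b03"},
    {"b01", "b04"},
    {"b02", "b03", "b04"},
]

# Edge weight = number of gathering events a pair of birds shares.
weights = defaultdict(int)
for event in events:
    for u, v in combinations(sorted(event), 2):
        weights[(u, v)] += 1

# weights[("b02", "b03")] == 3: a strong candidate social tie.
```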
18

Dedes, Nonell Irene. "Stochastic approach to the problem of predictive power in the theoretical modeling of the mean-field." Thesis, Strasbourg, 2017. http://www.theses.fr/2017STRAE017/document.

Abstract:
Results of our study of the theoretical modelling capacities focussing on the nuclear phenomenological mean-field approaches are presented. It is expected that a realistic theory should be capable of predicting satisfactorily the results of the experiments to come, i.e., having what is called a good predictive power. To study the predictive power of a theoretical model, we had to take into account not only the errors of the experimental data but also the uncertainties originating from approximations of the theoretical formalism and the existence of parametric correlations. One of the central techniques in the parameter adjustment is the solution of what is called the Inverse Problem. Parametric correlations usually induce ill-posedness of the inverse problem; they need to be studied and the model regularised. We have tested two types of realistic phenomenological Hamiltonians showing how to eliminate the parametric correlations theoretically and in practice. We calculate the level confidence intervals, the uncertainty distributions of model predictions and have shown how to improve theory’s prediction capacities and stability
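The inverse problem referred to in the abstract is, in its most common form, a weighted least-squares fit; in schematic notation (assumed here, not taken from the thesis):

```latex
\chi^2(\mathbf{p}) \;=\; \sum_{i=1}^{n}
  \left( \frac{d_i^{\,\mathrm{exp}} - d_i^{\,\mathrm{th}}(\mathbf{p})}{\sigma_i} \right)^{2},
\qquad
\hat{\mathbf{p}} \;=\; \underset{\mathbf{p}}{\arg\min}\; \chi^2(\mathbf{p}) ,
```

where d_i^exp are measured observables with uncertainties σ_i and d_i^th(p) the model predictions. Parametric correlations then show up as near-singular directions of the Hessian of χ² at the minimum, which is what regularisation must remove or constrain.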
19

Rylander, Andreas, and Liam Persson. "Modelling the Impact of Drug Resistance on Treatment as Prevention as an HIV Control Strategy." Thesis, KTH, Matematisk statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254237.

Abstract:
Uganda is using a strategy called treatment as prevention, in which as many HIV-infected individuals as possible receive treatment. As a result, the number of newly infected individuals has decreased significantly. However, there is a discussion about a potential problem regarding transmitted drug resistance. This work aims to investigate whether this will in fact be a problem in the future, and to estimate the costs for different scenarios. By developing a population-based mathematical model that describes the transmission dynamics of HIV in Uganda, stochastic simulations are made for different conditions. By analysing our simulations, we can see that Uganda may have to change its approach to HIV treatment.
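A toy version of the kind of stochastic transmission model described, with a wild-type and a drug-resistant strain, is sketched below; the compartments, events and all rate values are illustrative assumptions, not the thesis's calibrated model:

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate(S=9_000, Iw=1_000, Ir=10, beta_w=0.10, beta_r=0.08,
             rho=0.01, gamma=0.02, t_max=50.0):
    """Gillespie simulation: wild-type (Iw) vs resistant (Ir) strains.

    Events: infection by either strain, treatment-driven acquisition
    of resistance (Iw -> Ir), and removal from the infectious pool.
    """
    t, out = 0.0, [(0.0, S, Iw, Ir)]
    while t < t_max and Iw + Ir > 0:
        N = S + Iw + Ir
        rates = np.array([
            beta_w * S * Iw / N,   # new wild-type infection
            beta_r * S * Ir / N,   # new resistant infection
            rho * Iw,              # resistance acquired under treatment
            gamma * (Iw + Ir),     # removal
        ])
        total = rates.sum()
        t += rng.exponential(1.0 / total)
        event = rng.choice(4, p=rates / total)
        if event == 0:
            S -= 1; Iw += 1
        elif event == 1:
            S -= 1; Ir += 1
        elif event == 2:
            Iw -= 1; Ir += 1
        else:                       # removal, proportional to strain sizes
            if rng.random() < Iw / (Iw + Ir):
                Iw -= 1
            else:
                Ir -= 1
        out.append((t, S, Iw, Ir))
    return out

trajectory = simulate()   # list of (time, S, Iw, Ir) states
```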
20

Mélykúti, Bence. "Theoretical advances in the modelling and interrogation of biochemical reaction systems : alternative formulations of the chemical Langevin equation and optimal experiment design for model discrimination." Thesis, University of Oxford, 2010. http://ora.ox.ac.uk/objects/uuid:d368c04c-b611-41b2-8866-cde16b283b0d.

Full text
Abstract:
This thesis is concerned with methodologies for the accurate quantitative modelling of molecular biological systems. The first part is devoted to the chemical Langevin equation (CLE), a stochastic differential equation driven by a multidimensional Wiener process. The CLE is an approximation to the standard discrete Markov jump process model of chemical reaction kinetics. It is valid in the regime where molecular populations are abundant enough to assume their concentrations change continuously, but stochastic fluctuations still play a major role. We observe that the CLE is not a single equation, but a family of equations with shared finite-dimensional distributions. On the theoretical side, we prove that as many Wiener processes are sufficient to formulate the CLE as there are independent variables in the equation, which is just the rank of the stoichiometric matrix. On the practical side, we show that in the case where there are m_1 pairs of reversible reactions and m_2 irreversible reactions, there is another, simple formulation of the CLE with only m_1+m_2 Wiener processes, whereas the standard approach uses 2m_1+m_2. Considerable computational savings are achieved with this latter formulation. A flaw of the CLE model is identified: trajectories may leave the nonnegative orthant with positive probability. The second part addresses the challenge when alternative, structurally different ordinary differential equation models of similar complexity fit the available experimental data equally well. We review optimal experiment design methods for choosing the initial state and structural changes on the biological system to maximally discriminate between the outputs of rival models in terms of L_2-distance. We determine the optimal stimulus (input) profile for externally excitable systems. The numerical implementation relies on sum of squares decompositions and is demonstrated on two rival models of signal processing in starving Dictyostelium amoebae. Such experiments accelerate the perfection of our understanding of biochemical mechanisms.
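The two formulations compared above can be written out. With stoichiometric vectors ν_j and propensities a_j, the standard CLE drives each of the m reaction channels with its own Wiener process; for a reversible pair with forward and backward propensities a_j^+ and a_j^-, drift and diffusion combine so that one Wiener process suffices per pair:

```latex
% Standard CLE: one Wiener process per reaction channel
d\mathbf{X}_t \;=\; \sum_{j=1}^{m} \nu_j \, a_j(\mathbf{X}_t)\, dt
  \;+\; \sum_{j=1}^{m} \nu_j \sqrt{a_j(\mathbf{X}_t)}\; dW_t^{(j)} .

% Contribution of a reversible pair j: one Wiener process for the pair
\nu_j \big( a_j^{+}(\mathbf{X}_t) - a_j^{-}(\mathbf{X}_t) \big)\, dt
  \;+\; \nu_j \sqrt{ a_j^{+}(\mathbf{X}_t) + a_j^{-}(\mathbf{X}_t) }\; dW_t^{(j)} .
```

So a network with m_1 reversible pairs and m_2 irreversible reactions needs only m_1 + m_2 Wiener processes instead of 2m_1 + m_2, with the same finite-dimensional distributions.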
21

Dyson, Louise. "Mathematical models of cranial neural crest cell migration." Thesis, University of Oxford, 2013. http://ora.ox.ac.uk/objects/uuid:66955fb9-691f-4d27-ad26-39bb2b089c64.

Abstract:
From the developing embryo to the evacuation of football stadiums, the migration and movement of populations of individuals is a vital part of human life. Such movement often occurs in crowded conditions, where the space occupied by each individual impacts on the freedom of others. This thesis aims to analyse and understand the effects of occupied volume (volume exclusion) on the movement of the individual and the population. We consider, as a motivating system, the rearrangement of individuals required to turn a clump of cells into a functioning embryo. Specifically, we consider the migration of cranial neural crest cells in the developing chick embryo. Working closely with experimental collaborators we construct a hybrid model of the system, consisting of a continuum chemoattractant and individual-based cell description and find that multiple cell phenotypes are required for successful migration. In the crowded environment of the migratory system, volume exclusion is highly important and significantly enhances the speed of cell migration in our model, whilst reducing the numbers of individuals that can enter the domain. The developed model is used to make experimental predictions, that are tested in vivo, using cycles of modelling and experimental work to give greater insight into the biological system. Our formulated model is computational, and is thus difficult to analyse whilst considering different parameter regimes. The second part of the thesis is driven by the wish to systematically analyse our model. As such, it concentrates on developing new techniques to derive continuum equations from diffusive and chemotactic individual-based and hybrid models in one and two spatial dimensions with the incorporation of volume exclusion. We demonstrate the accuracy of our techniques under different parameter regimes and using different mechanisms of movement. In particular, we show that our derived continuum equations almost always compare better to data averaged over multiple simulations than the equivalent equations without volume exclusion. Thus we establish that volume exclusion has a substantial effect on the evolution of a migrating population.
22

Öhman, Adam. "The Calibrated SSVI Method - Implied Volatility Surface Construction." Thesis, KTH, Matematisk statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-257501.

Abstract:
In this thesis, the question of how to construct implied volatility surfaces in a robust and arbitrage-free way is investigated. To determine whether the solutions are arbitrage free, an initial investigation of arbitrage in volatility surfaces was made. From this investigation, two comprehensive theorems were found, due to Roper \cite{Roper2010}. Based on these, two applicable arbitrage tests were created. These tests came to be very important tools in the remaining thesis.

The most reasonable classes of models for modelling the implied volatility surface were then investigated. It was concluded that the classes with the best potential were the stochastic volatility models and the parametric representation models. The choice between these two classes was concluded to be a trade-off between simplicity and quality of the result. If the results of the parametric representation models could be improved, that class would be the best applicable choice. For the remainder of the thesis, it was therefore decided to investigate this class. The parametric representation model chosen for investigation was the SVI parametrization family, since it seemed to have the most potential on top of its already strong foundation.

The SVI parametrization family is divided into three parametrizations: the raw SVI parametrization, the SSVI parametrization and the eSSVI parametrization. It was concluded that the raw SVI parametrization, even though it gives very good market fits, was not robust enough to be chosen: in most cases it generates arbitrage in its surfaces. The SSVI model was concluded to be a very strong model compared to the raw SVI, since it was able to generate completely arbitrage-free solutions with good enough results. The eSSVI is an extended parametrization of the SSVI whose purpose is to improve its short-maturity results. It was concluded to give small improvements, at the cost of a harder optimization procedure. It was therefore concluded that the SSVI parametrization might be the better choice in applications.

To try to improve the results of the SSVI parametrization, a complementary procedure was developed, named the calibrated SSVI method. Unlike the eSSVI parametrization, this method does not change the parametrization but instead focuses on calibrating the initial fit that the SSVI generates. It heavily improves the initial fit of the SSVI surface but is less robust, since it generates harder cases for the interpolation and extrapolation.
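For reference, the parametrizations discussed are standard in the literature (Gatheral and Jacquier); in terms of total implied variance w at log-moneyness k, the raw SVI smile and the SSVI surface read:

```latex
% Raw SVI smile (one maturity slice)
w_{\mathrm{SVI}}(k) \;=\; a + b \Big( \rho (k - m) + \sqrt{(k - m)^2 + \sigma^2} \Big) .

% SSVI surface: theta_t = ATM total variance, varphi a curvature function
w_{\mathrm{SSVI}}(k, \theta_t) \;=\; \frac{\theta_t}{2}
  \Big( 1 + \rho \varphi(\theta_t) k
  + \sqrt{ \big( \varphi(\theta_t) k + \rho \big)^2 + 1 - \rho^2 } \Big) .
```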
23

Lohier, Théophile. "Analyse temporelle de la dynamique de communautés végétales à l'aide de modèles individus-centrés." Thesis, Clermont-Ferrand 2, 2016. http://www.theses.fr/2016CLF22683/document.

Abstract:
Plant communities are complex systems in which multiple species differing in their functional attributes interact with their environment and with each other. Because of the number and the diversity of these interactions, the mechanisms that drive the dynamics of these communities are still poorly understood. Modelling approaches enable us to link, in a mechanistic fashion, the processes driving individual plant or population dynamics to the resulting community dynamics. This PhD thesis aims to develop such approaches and to use them to investigate the mechanisms underlying community dynamics. We therefore developed two modelling approaches. The first one is based on a stochastic modelling framework that links population dynamics to community dynamics whilst taking account of intra- and interspecific interactions as well as environmental and demographic variation. This approach is easily applicable to real systems and enables the properties of plant populations to be described through a small number of demographic parameters. However, our work suggests that there is no simple relationship between these parameters and plant functional traits, even though traits are known to drive the response to extrinsic factors. The second approach was developed to overcome this limitation and relies on the individual-based model Nemossos, which explicitly describes the link between plant functioning and community dynamics. In order to ensure that Nemossos has a large application potential, a strong emphasis has been placed on the trade-off between realism and parametrization cost. Nemossos has been successfully parameterized from trait values found in the literature, its realism has been demonstrated, and it has been used to investigate the importance of temporal environmental variability for the coexistence of functionally differing species. The complementarity of the two approaches allows us to explore various fundamental questions of community ecology, including the impact of competitive interactions on community dynamics, the effect of environmental filtering on their functional composition, and the mechanisms favouring the coexistence of plant species. In this work, the two approaches have been used separately, but their coupling might offer interesting perspectives, such as investigating the relationships between plant functioning and population dynamics. Moreover, each of the approaches might be used in a variety of simulation experiments likely to improve our understanding of the mechanisms underlying community dynamics.
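A minimal sketch of the first kind of framework described (discrete-time multi-species dynamics with competition, environmental noise and demographic noise) is given below; the Ricker-type response and all parameter values are assumptions for illustration, not Nemossos or the thesis's fitted framework:

```python
import numpy as np

rng = np.random.default_rng(11)

n_species, n_years = 4, 100
r = np.array([0.8, 0.6, 0.7, 0.5])               # intrinsic growth rates
alpha = 0.002 * np.ones((n_species, n_species))  # interspecific competition
np.fill_diagonal(alpha, 0.005)                   # stronger self-limitation
sigma_env = 0.2                                  # environmental noise scale

N = np.full(n_species, 50.0)
history = [N.copy()]
for _ in range(n_years):
    env = rng.normal(0.0, sigma_env, n_species)  # yearly environmental shock
    growth = np.exp(r - alpha @ N + env)         # Ricker-type response
    N = rng.poisson(N * growth).astype(float)    # demographic stochasticity
    history.append(N.copy())

abundances = np.array(history)   # (years + 1) x species trajectory
```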
APA, Harvard, Vancouver, ISO, and other styles
24

Paulin, Carl. "Detecting anomalies in data streams driven by ajump-diffusion process." Thesis, Umeå universitet, Institutionen för fysik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-184230.

Full text
Abstract:
Jump-diffusion processes are often used to model financial time series because they can reproduce the random jumps such series frequently exhibit. These jumps can be viewed as anomalies and are essential for financial analysis and model building, making them vital to detect. The realized variation, realized bipower variation, and realized semi-variation were tested to see whether they can be used to detect jumps in a jump-diffusion process, and whether anomaly detection algorithms can use them as features to improve detection accuracy. The algorithms tested were Isolation Forest, Robust Random Cut Forest, and the Isolation Forest Algorithm for Streaming Data, the latter two of which operate on streaming data. A Merton jump-diffusion process was generated with varying jump rates, and each algorithm was tested with each of the features. The performance of each algorithm was measured using the F1-score to compare the features and algorithms against one another. The algorithms improved when given the features: Isolation Forest saw improvement from using one or more of the named features. Among the streaming algorithms, Robust Random Cut Forest performed best for every jump rate except the lowest, and using a combination of the features gave the highest F1-score for both streaming algorithms. These results show that one can use these features to extract jumps, as anomaly scores, and to improve the accuracy of the algorithms in both batch and streaming settings.
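As a rough illustration of the feature construction described above, the sketch below simulates log-returns from a Merton jump-diffusion and computes rolling realized variation, bipower variation, and negative semi-variation. All parameter values and the window length are assumptions, not the thesis's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Simulate log-returns of a Merton jump-diffusion (all parameters are assumptions) ---
n = 10_000
dt = 1.0 / (252 * 78)                        # e.g. 5-minute bars, 78 bars per trading day
mu, sigma = 0.05, 0.2                        # drift and diffusion volatility
lam, jump_mu, jump_sigma = 50.0, 0.0, 0.02   # jump intensity (per year) and jump-size law

diffusion = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * rng.normal(size=n)
n_jumps = rng.poisson(lam * dt, size=n)      # number of jumps in each interval
jumps = np.where(n_jumps > 0,                # sum of k iid normal jump sizes ~ N(k*mu, k*sigma^2)
                 rng.normal(jump_mu * n_jumps, jump_sigma * np.sqrt(np.maximum(n_jumps, 1))),
                 0.0)
returns = diffusion + jumps

# --- Rolling realized measures used as anomaly-detection features ---
w = 78  # window length (one "day" here); an assumption

def rolling(f):
    return np.array([f(returns[i - w:i]) for i in range(w, n)])

rv = rolling(lambda r: np.sum(r**2))                                          # realized variation
bv = rolling(lambda r: (np.pi / 2) * np.sum(np.abs(r[1:]) * np.abs(r[:-1])))  # realized bipower variation
rsv = rolling(lambda r: np.sum(r[r < 0.0]**2))                                # negative realized semi-variation

jump_score = np.maximum(rv - bv, 0.0)  # BV is robust to jumps, so RV - BV isolates them
print(jump_score.mean(), jump_score.max())
```

Because bipower variation is robust to jumps while realized variation is not, the difference max(RV − BV, 0) serves as a natural jump score that the anomaly detection algorithms can consume as a feature.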
APA, Harvard, Vancouver, ISO, and other styles
25

Häfner, Reinhold. "Stochastic implied volatility : a factor-based model /." Berlin ; New York : Springer, 2004. http://www.loc.gov/catdir/enhancements/fy0813/2004109369-d.html.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Henter, Gustav Eje. "Probabilistic Sequence Models with Speech and Language Applications." Doctoral thesis, KTH, Kommunikationsteori, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-134693.

Full text
Abstract:
Series data, sequences of measured values, are ubiquitous. Whenever observations are made along a path in space or time, a data sequence results. To comprehend nature and shape it to our will, or to make informed decisions based on what we know, we need methods to make sense of such data. Of particular interest are probabilistic descriptions, which enable us to represent uncertainty and random variation inherent to the world around us. This thesis presents and expands upon some tools for creating probabilistic models of sequences, with an eye towards applications involving speech and language. Modelling speech and language is not only of use for creating listening, reading, talking, and writing machines---for instance allowing human-friendly interfaces to future computational intelligences and smart devices of today---but probabilistic models may also ultimately tell us something about ourselves and the world we occupy. The central theme of the thesis is the creation of new or improved models more appropriate for our intended applications, by weakening limiting and questionable assumptions made by standard modelling techniques. One contribution of this thesis examines causal-state splitting reconstruction (CSSR), an algorithm for learning discrete-valued sequence models whose states are minimal sufficient statistics for prediction. Unlike many traditional techniques, CSSR does not require the number of process states to be specified a priori, but builds a pattern vocabulary from data alone, making it applicable for language acquisition and the identification of stochastic grammars. A paper in the thesis shows that CSSR handles noise and errors expected in natural data poorly, but that the learner can be extended in a simple manner to yield more robust and stable results also in the presence of corruptions. Even when the complexities of language are put aside, challenges remain. The seemingly simple task of accurately describing human speech signals, so that natural synthetic speech can be generated, has proved difficult, as humans are highly attuned to what speech should sound like. Two papers in the thesis therefore study nonparametric techniques suitable for improved acoustic modelling of speech for synthesis applications. Each of the two papers targets a known-incorrect assumption of established methods, based on the hypothesis that nonparametric techniques can better represent and recreate essential characteristics of natural speech. In the first paper of the pair, Gaussian process dynamical models (GPDMs), nonlinear, continuous state-space dynamical models based on Gaussian processes, are shown to better replicate voiced speech, without traditional dynamical features or assumptions that cepstral parameters follow linear autoregressive processes. Additional dimensions of the state-space are able to represent other salient signal aspects such as prosodic variation. The second paper, meanwhile, introduces KDE-HMMs, asymptotically-consistent Markov models for continuous-valued data based on kernel density estimation, that additionally have been extended with a fixed-cardinality discrete hidden state. This construction is shown to provide improved probabilistic descriptions of nonlinear time series, compared to reference models from different paradigms. The hidden state can be used to control process output, making KDE-HMMs compelling as a probabilistic alternative to hybrid speech-synthesis approaches. 
A final paper of the thesis discusses how models can be improved even when one is restricted to a fundamentally imperfect model class. Minimum entropy rate simplification (MERS), an information-theoretic scheme for postprocessing models for generative applications involving both speech and text, is introduced. MERS reduces the entropy rate of a model while remaining as close as possible to the starting model. This is shown to produce simplified models that concentrate on the most common and characteristic behaviours, and provides a continuum of simplifications between the original model and zero-entropy, completely predictable output. As the tails of fitted distributions may be inflated by noise or empirical variability that a model has failed to capture, MERS's ability to concentrate on high-probability output is also demonstrated to be useful for denoising models trained on disturbed data.
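As a hedged illustration of the kernel-density modelling idea behind KDE-HMMs, the sketch below implements only the kernel-density Markov backbone, i.e., a first-order conditional density estimate without the discrete hidden state; the toy data, bandwidth, and kernel choice are assumptions for the example, not the papers' setup.

```python
import numpy as np

def kde_markov_density(x_prev, x_next, train, h=0.3):
    """Predictive density p(x_next | x_prev) of a first-order KDE Markov model.

    Kernel regression over observed transition pairs (train[t-1], train[t]):
    each pair contributes a Gaussian kernel on x_next, weighted by how close
    its starting point is to x_prev. Bandwidth h is an assumption, not tuned.
    """
    prev, nxt = train[:-1], train[1:]
    w = np.exp(-0.5 * ((x_prev - prev) / h) ** 2)   # context weights
    w /= w.sum()
    kernels = np.exp(-0.5 * ((x_next - nxt) / h) ** 2) / (h * np.sqrt(2 * np.pi))
    return float(np.sum(w * kernels))

# Toy nonlinear time series that a linear-AR assumption would misrepresent
rng = np.random.default_rng(1)
x = np.zeros(500)
for t in range(1, 500):
    x[t] = np.sin(2.0 * x[t - 1]) + 0.2 * rng.normal()

print(kde_markov_density(x_prev=0.5, x_next=np.sin(1.0), train=x))
```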



Projects: ACORNS – Acquisition of Communication and Recognition Skills; LISTA – The Listening Talker.
APA, Harvard, Vancouver, ISO, and other styles
27

Backman, Emil, and David Petersson. "Evaluation of methods for quantifying returns within the premium pension." Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-288499.

Full text
Abstract:
Pensionsmyndigheten's (the Swedish Pensions Agency's) current calculation of the internal rate of return for 7.7 million premium pension savers is both time and resource consuming. This rate of return mirrors the overall performance of the funded part of the pension system; it is analyzed internally but also reported to the public monthly and yearly, based on data samples of different sizes. This thesis investigates the possibility of utilizing other approaches to improve the performance of these calculations. Further, the study aims to verify the results of said calculations and to investigate their robustness. To identify competitive matrix methods, a sample of approaches is compared to the more classical numerical methods in scenarios designed to mirror real practice. The robustness of the results is then analyzed with a stochastic model in which a small error term, mimicking possible errors arising in data management, is introduced. It is concluded that a combination of Halley's method and the Jacobi-Davidson algorithm is the most robust and best-performing method; the proposed combination unites the speed of numerical methods with the robustness of matrix methods. The result shows a 550% improvement in computation time while maintaining the accuracy of the current server computations. The analysis of error propagation suggests that, even with an introduced error term of large proportions, the output error is less than 0.12 percentage points in 99 percent of cases. In this extreme case, the modeled expected number of individuals with an error exceeding 1 percentage point is estimated to be 212 out of the whole population.
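As an illustration of one half of the proposed combination, the sketch below applies Halley's third-order root-finding method to the internal-rate-of-return equation NPV(r) = Σ_t c_t (1+r)^{-t} = 0. The cash flows are invented, and this is a minimal sketch rather than the agency's implementation.

```python
import numpy as np

def irr_halley(cashflows, r0=0.1, tol=1e-12, max_iter=50):
    """Internal rate of return via Halley's method.

    cashflows[t] is the net flow at time t (cashflows[0] typically negative).
    Solves NPV(r) = sum_t c_t (1+r)^(-t) = 0 with cubic local convergence.
    """
    c = np.asarray(cashflows, dtype=float)
    t = np.arange(len(c))
    r = r0
    for _ in range(max_iter):
        d = (1.0 + r) ** (-t)
        f = np.sum(c * d)                                   # NPV
        fp = np.sum(-t * c * d / (1.0 + r))                 # dNPV/dr
        fpp = np.sum(t * (t + 1) * c * d / (1.0 + r) ** 2)  # d2NPV/dr2
        step = 2.0 * f * fp / (2.0 * fp**2 - f * fpp)       # Halley update
        r -= step
        if abs(step) < tol:
            break
    return r

# Invented example: pay 100 today, receive 30 for five years
print(irr_halley([-100, 30, 30, 30, 30, 30]))  # ~0.1524
```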
APA, Harvard, Vancouver, ISO, and other styles
28

Boano, Danquah Jerry. "Stochastic Modelling of Daily Peak Electricity Demand Using Extreme Value Theory." Diss., 2018. http://hdl.handle.net/11602/1209.

Full text
Abstract:
MSc (Statistics)
Department of Statistics
Daily peak electricity demand data from Eskom, the South African power utility company, for the period January 1997 to December 2013, consisting of 6209 observations, were used in this dissertation. Since 1994, increased electricity demand has led to sustainability issues in South Africa, and demand continues to rise daily due to a variety of driving factors. If the electricity generating capacity in South Africa does not show signs of meeting the country's demand in the coming years, this may significantly impact the national grid, causing it to operate in a risky and vulnerable state and leading to disturbances such as the load shedding experienced during the past few years. In particular, sufficient information about the extreme values of the stochastic load process is needed in time for proper planning and for designing the generation, distribution, and storage systems, as these ensure efficient delivery of electrical energy and maintain discipline in the grid. More importantly, electricity is a key commodity used mainly as a source of energy in the industrial, residential, and commercial sectors. Effective monitoring of electricity demand is of great importance, because demand exceeding the maximum power generated leads to power outages and load shedding. In this light, the study assesses the frequency of occurrence of extreme peak electricity demand in order to arrive at a full electricity demand distribution capable of managing uncertainties in the grid system. To achieve stationarity in the daily peak electricity demand (DPED), we apply a penalized cubic smoothing spline regression to nonlinearly detrend the data. The R package "evmix" is used to estimate the thresholds using the boundary-corrected kernel density plot. The nonlinearly detrended data were divided into summer, spring, winter, and autumn according to the calendar dates in the Southern Hemisphere for frequency analysis. The data were declustered using Ferro and Segers' automatic declustering method, and the cluster maxima were extracted using the R package "evd". We fit a Poisson-GPD and a stationary point process to the cluster maxima, and the intensity function of the point process, which measures the frequency of occurrence of daily peak electricity demand per year, is calculated for each dataset. Formal goodness-of-fit tests based on the Cramér-von Mises and Anderson-Darling statistics supported the null hypothesis that each dataset follows a Poisson-GPD(σ, ξ) at the 5 percent level of significance. The modelling framework, which is easily extensible to other peak load parameters, is based on the assumption that peak power follows a Poisson process. The parameters of the developed models were estimated using maximum likelihood, and the usual asymptotic properties underlying the Poisson-GPD were satisfied by the model.
NRF
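As an illustrative Python analogue of the peaks-over-threshold workflow described above (the thesis works in R with "evmix" and "evd", and additionally declusters the exceedances, which this sketch omits), one can fit a Poisson-GPD to synthetic residuals:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Synthetic stand-in for detrended daily peak demand residuals (the thesis uses Eskom data)
residuals = rng.gumbel(loc=0.0, scale=1.0, size=6209)

u = np.quantile(residuals, 0.95)    # threshold; the thesis estimates it from kernel density plots
exceedances = residuals[residuals > u] - u

# Peaks-over-threshold: GPD for exceedance sizes, Poisson for exceedance counts
shape, _, scale = stats.genpareto.fit(exceedances, floc=0.0)
rate_per_year = len(exceedances) / (len(residuals) / 365.25)  # intensity of the point process

print(f"GPD shape (xi): {shape:.3f}, scale (sigma): {scale:.3f}")
print(f"Mean exceedances per year: {rate_per_year:.2f}")

# Model-based m-year return level: the demand level exceeded on average once every m years
m = 10
p = 1.0 / (m * rate_per_year)       # tail probability within the exceedance distribution
return_level = u + stats.genpareto.ppf(1.0 - p, shape, loc=0.0, scale=scale)
print(f"{m}-year return level: {return_level:.2f}")
```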
APA, Harvard, Vancouver, ISO, and other styles
29

"Modelling and analysis of system state estimation with communication constraints." 1996. http://library.cuhk.edu.hk/record=b6073065.

Full text
Abstract:
by Li Xia.
Thesis (Ph.D.)--Chinese University of Hong Kong, 1996.
Includes bibliographical references (p. 129-134).
APA, Harvard, Vancouver, ISO, and other styles
30

Webb, Ryan G. "Towards a Neural Measure of Value and the Modelling of Choice in Strategic Games." Thesis, 2011. http://hdl.handle.net/1974/6565.

Full text
Abstract:
Neuroeconomic models take economic theory literally, interpreting hypothesized quantities as observables in the brain in order to provide insight into choice behaviour. This thesis develops a model of the neural decision process in strategic games with a unique mixed strategy equilibrium. In such games, players face both an incentive to best-respond to valuations and to act unpredictably. Similarly, we model choice as the result of the interaction between action value and the noise inherent in networks of spiking neurons. Our neural model generates any ratio of choices through the specification of action value, including the equilibrium ratio, and provides an explanation for why we observe equilibrium behaviour in some contexts and not others. The model generalizes to a random-utility model which gives a structural specification to the error term and makes action value observable in the spike rates of neurons. Action value is measured in the spike activity of the Superior Colliculus (SC) while monkeys play a saccade version of matching pennies. We find SC activity predicts upcoming choices and is influenced by the history of events in the game, correlating with a behaviourally-established model of learning, and choice simulations based on neural measures of value exhibit similar biases to our behavioural data. A neural measure of value yields a glimpse at how valuations are updated in response to new information and compared stochastically, providing us with unique insight into modelling choice in strategic games.
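A minimal sketch of the random-utility view described above: choices arise from action values plus noise, with values updated by a simple reinforcement rule in matching pennies. The Gumbel (logit) noise, the learning rule, and all constants are assumptions for illustration, not the thesis's fitted model.

```python
import numpy as np

rng = np.random.default_rng(11)

# Matching pennies payoffs for the row player (the column player receives the negative)
payoff = np.array([[1.0, -1.0],
                   [-1.0, 1.0]])

def logit_choice(values, noise_scale=0.5):
    """Random-utility choice: argmax of value plus Gumbel noise,
    equivalent to a logit/softmax rule. Noise scale is an assumption."""
    utilities = values + rng.gumbel(0.0, noise_scale, size=values.shape)
    return int(np.argmax(utilities))

v_row, v_col = np.zeros(2), np.zeros(2)
eta = 0.1                              # learning rate, an assumption
choices = []
for t in range(5000):
    a = logit_choice(v_row)
    b = logit_choice(v_col)
    # Simple reinforcement update of the chosen action's value from realized payoffs
    v_row[a] += eta * (payoff[a, b] - v_row[a])
    v_col[b] += eta * (-payoff[a, b] - v_col[b])
    choices.append(a)

print(np.mean(choices))  # typically near 0.5: mixed-strategy-like play from value plus noise
```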
Thesis (Ph.D., Economics), Queen's University, 2011.
APA, Harvard, Vancouver, ISO, and other styles
31

Prasad, H. L. "Algorithms For Stochastic Games And Service Systems." Thesis, 2012. http://etd.iisc.ernet.in/handle/2005/2301.

Full text
Abstract:
This thesis is organized into two parts: one for my main area of research, stochastic games, and the other for my contributions in the area of service systems. We first summarize the work on stochastic games. The field of stochastic games has been actively pursued over the last seven decades because of several important applications in oligopolistic economics. In the past, zero-sum stochastic games have been modelled and solved for Nash equilibria using the standard techniques of Markov decision processes. General-sum stochastic games, on the contrary, have posed difficulty, as they cannot be reduced to Markov decision processes. Over the past few decades the quest for algorithms to compute Nash equilibria in general-sum stochastic games has intensified, and several important algorithms have been proposed, such as the stochastic tracing procedure [Herings and Peeters, 2004], NashQ [Hu and Wellman, 2003], and FFQ [Littman, 2001], along with generalised representations such as optimization problem formulations for various reward structures [Filar and Vrieze, 1997]. However, these either lack generality or are intractable for even medium-sized problems, or both. In our venture towards algorithms for stochastic games, we start with a non-linear optimization problem and design a simple gradient descent procedure for it. Though this procedure finds the Nash equilibrium for a sample terrain-exploration problem, we observe that, in general, it need not. We characterize the necessary conditions and define KKT-N points: those Karush-Kuhn-Tucker (KKT) points which correspond to Nash equilibria. Thus, for a simple gradient-based algorithm to guarantee convergence to a Nash equilibrium, all KKT points of the optimization problem need to be KKT-N points, which restricts the applicability of such algorithms. We then take a step back and look for a better characterization of those points of the optimization problem which correspond to Nash equilibria of the underlying game. This exploration yields two sets of necessary and sufficient conditions. The first set, the KKT-SP conditions, is inspired by the KKT conditions themselves and is obtained by breaking the main optimization problem into several sub-problems and applying the KKT conditions to each of them. The second set, the SG-SP conditions, characterizes those Nash points more compactly. Using the KKT-SP and SG-SP conditions, we propose three algorithms, OFF-SGSP, ON-SGSP and DON-SGSP, which we show provide Nash equilibrium strategies for general-sum discounted stochastic games. Here OFF-SGSP is an off-line algorithm, while ON-SGSP and DON-SGSP are on-line algorithms. In particular, we believe that DON-SGSP is the first decentralized on-line algorithm for general-sum discounted stochastic games. We show that both our on-line algorithms are computationally efficient; in fact, DON-SGSP is not only applicable to multi-agent scenarios but also directly applicable to the single-agent case, i.e., MDPs (Markov decision processes). The second part of the thesis focuses on formulating and solving the problem of minimizing labour cost in service systems. We define the setting of service systems and model the labour-cost problem as a constrained discrete-parameter Markov-cost process.
This Markov process is parametrized by the number of workers in various shifts and with various skill levels. With the number of workers as optimization variables, we provide a detailed formulation of a constrained optimization problem in which the objective is the expected long-run average of the single-stage labour costs, and the main constraints are the expected long-run averages of aggregate SLAs (Service Level Agreements). For this constrained optimization problem, we provide two stochastic optimization algorithms, SASOC-SF-N and SASOC-SF-C, which use smoothed functional approaches to estimate the gradient and perform gradient descent. SASOC-SF-N uses a Gaussian distribution for smoothing, while SASOC-SF-C uses a Cauchy distribution. SASOC-SF-C is the first Cauchy-based smoothing algorithm that requires a fixed number (two) of simulations, independent of the number of optimization variables. We show that these algorithms provide an order-of-magnitude better performance than the existing industry-standard tool, OptQuest, and that SASOC-SF-C gives better overall performance.
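As a toy illustration of the Gaussian smoothed-functional estimator that SASOC-SF-N builds on, the sketch below estimates a gradient from two noisy cost evaluations per step and performs projected gradient descent. The cost function, step sizes, and box constraints are invented; the actual algorithms handle the SLA constraints and discrete worker counts far more carefully.

```python
import numpy as np

rng = np.random.default_rng(3)

def noisy_cost(theta):
    """Stand-in for a simulated long-run average cost (e.g. staffing cost plus SLA penalty).
    Invented for the sketch; the thesis evaluates this by simulating the service system."""
    target = np.array([6.0, 4.0, 8.0])   # 'ideal' worker counts, made up
    return np.sum((theta - target) ** 2) + rng.normal(0.0, 0.5)

def sf_gradient(theta, beta=0.5):
    """Two-simulation Gaussian smoothed-functional gradient estimate:
    grad f(theta) ~ eta * (f(theta + beta*eta) - f(theta - beta*eta)) / (2*beta)."""
    eta = rng.normal(size=theta.shape)
    return eta * (noisy_cost(theta + beta * eta) - noisy_cost(theta - beta * eta)) / (2.0 * beta)

theta = np.array([10.0, 10.0, 10.0])     # initial (continuous) staffing levels
for k in range(1, 2001):
    theta -= (1.0 / k) * sf_gradient(theta)   # diminishing step sizes
    theta = np.clip(theta, 0.0, 50.0)         # crude projection onto a feasible box

print(np.round(theta))  # discrete staffing is recovered by rounding in this toy version
```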
APA, Harvard, Vancouver, ISO, and other styles
32

(6368468), Daesung Kim. "Stability for functional and geometric inequalities and a stochastic representation of fractional integrals and nonlocal operators." Thesis, 2019.

Find full text
Abstract:
The dissertation consists of two research topics.

The first research direction is the study of stability of functional and geometric inequalities. The stability problem is to estimate the deficit of a functional or geometric inequality in terms of the distance from the class of optimizers, or in terms of a functional that identifies the optimizers. In particular, we investigate the logarithmic Sobolev inequality, the Beckner-Hirschman inequality (the entropic uncertainty principle), and isoperimetric-type inequalities for the expected lifetime of Brownian motion.
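For orientation, the following is standard background rather than a result of the dissertation: the Gaussian logarithmic Sobolev inequality, whose deficit is the quantity that the stability question seeks to bound from below by a distance to the optimizers.

```latex
% For the standard Gaussian measure \gamma on R^n and smooth f with \int f^2 \, d\gamma = 1:
\int_{\mathbb{R}^n} f^2 \log f^2 \, d\gamma \;\le\; 2 \int_{\mathbb{R}^n} |\nabla f|^2 \, d\gamma,
\qquad
\delta(f) := 2 \int_{\mathbb{R}^n} |\nabla f|^2 \, d\gamma
           - \int_{\mathbb{R}^n} f^2 \log f^2 \, d\gamma \;\ge\; 0.
```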

The second topic of the thesis is a stochastic representation of fractional integrals and nonlocal operators. We extend the Hardy-Littlewood-Sobolev inequality to symmetric Markov semigroups. To this end, we construct a stochastic representation of the fractional integral using the background radiation process. The inequality follows from a new inequality for the fractional Littlewood-Paley square function. We also prove the Hardy-Stein identity for non-symmetric pure-jump Lévy processes and the L^p boundedness of a certain class of Fourier multiplier operators arising from non-symmetric pure-jump Lévy processes. The proof is based on Itô's formula for general jump processes and the symmetrization of Lévy processes.
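For reference, and again as classical background rather than the dissertation's extension, this is the Hardy-Littlewood-Sobolev inequality on Euclidean space that the thesis carries over to symmetric Markov semigroups.

```latex
% Classical Hardy-Littlewood-Sobolev inequality: for 0 < \lambda < n and exponents
% p, q > 1 with 1/p + 1/q + \lambda/n = 2, there exists C = C(n, \lambda, p) such that
\left| \int_{\mathbb{R}^n} \int_{\mathbb{R}^n}
       \frac{f(x)\, g(y)}{|x - y|^{\lambda}} \, dx \, dy \right|
\;\le\; C \, \|f\|_{L^p(\mathbb{R}^n)} \, \|g\|_{L^q(\mathbb{R}^n)} .
```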
APA, Harvard, Vancouver, ISO, and other styles
33

(6630578), Yellamraju Tarun. "n-TARP: A Random Projection based Method for Supervised and Unsupervised Machine Learning in High-dimensions with Application to Educational Data Analysis." Thesis, 2019.

Find full text
Abstract:
Analyzing the structure of a dataset is a challenging problem in high dimensions: the volume of the space increases at an exponential rate, and data typically become sparse. This poses a significant challenge to machine learning methods, which rely on exploiting structure underlying the data to make meaningful inferences. This dissertation proposes the n-TARP method as a building block for high-dimensional data analysis, in both supervised and unsupervised scenarios.

The basic element, n-TARP, consists of a random projection framework to transform high-dimensional data to one-dimensional data in a manner that yields point separations in the projected space. The point separation can be tuned to reflect classes in supervised scenarios and clusters in unsupervised scenarios. The n-TARP method finds linear separations in high-dimensional data. This basic unit can be used repeatedly to find a variety of structures. It can be arranged in a hierarchical structure like a tree, which increases the model complexity, flexibility and discriminating power. Feature space extensions combined with n-TARP can also be used to investigate non-linear separations in high-dimensional data.
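A hedged reconstruction of the basic n-TARP unit from the description above: draw random one-dimensional projections, score each by how well the projected points separate, and keep the best. The gap-based separation score used here is an assumption made for the sketch; the dissertation's exact criterion may differ.

```python
import numpy as np

rng = np.random.default_rng(5)

def n_tarp_split(X, n_projections=100):
    """One n-TARP building block (illustrative reconstruction from the abstract).

    Try n random 1-D projections of the data and keep the one whose projected
    values separate best, scored here by the largest gap between consecutive
    sorted projections relative to the projected range.
    """
    best = None
    for _ in range(n_projections):
        w = rng.normal(size=X.shape[1])
        w /= np.linalg.norm(w)                # random direction on the unit sphere
        p = X @ w                             # 1-D projection of all points
        order = np.argsort(p)
        gaps = np.diff(p[order])
        i = int(np.argmax(gaps))
        score = gaps[i] / (p.max() - p.min() + 1e-12)
        if best is None or score > best[0]:
            threshold = 0.5 * (p[order][i] + p[order][i + 1])
            best = (score, w, threshold)
    score, w, threshold = best
    return w, threshold, (X @ w > threshold)  # direction, cut point, cluster labels

# Small-sample, high-dimensional toy data: two clusters in 50 dimensions
X = np.vstack([rng.normal(0.0, 1.0, size=(20, 50)),
               rng.normal(2.5, 1.0, size=(20, 50))])
w, thr, labels = n_tarp_split(X)
print(labels.astype(int))
```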

The application of n-TARP to both supervised and unsupervised problems is investigated in this dissertation. In the supervised scenario, a sequence of n-TARP based classifiers with increasing complexity is considered. The point separations are measured by classification metrics like accuracy, Gini impurity or entropy. The performance of these classifiers on image classification tasks is studied. This study provides an interesting insight into the working of classification methods. The sequence of n-TARP classifiers yields benchmark curves that put in context the accuracy and complexity of other classification methods for a given dataset. The benchmark curves are parameterized by classification error and computational cost to define a benchmarking plane. This framework splits this plane into regions of "positive-gain" and "negative-gain" which provide context for the performance and effectiveness of other classification methods. The asymptotes of benchmark curves are shown to be optimal (i.e. at Bayes Error) in some cases (Theorem 2.5.2).

In the unsupervised scenario, the n-TARP method highlights the existence of many different clustering structures in a dataset. However, not all structures present are statistically meaningful. This issue is amplified when the dataset is small, as random events may yield sample sets that exhibit separations that are not present in the distribution of the data. Thus, statistical validation is an important step in data analysis, especially in high-dimensions. However, in order to statistically validate results, often an exponentially increasing number of data samples are required as the dimensions increase. The proposed n-TARP method circumvents this challenge by evaluating statistical significance in the one-dimensional space of data projections. The n-TARP framework also results in several different statistically valid instances of point separation into clusters, as opposed to a unique "best" separation, which leads to a distribution of clusters induced by the random projection process.

The distributions of clusters resulting from n-TARP are studied. This dissertation focuses on small sample high-dimensional problems. A large number of distinct clusters are found, which are statistically validated. The distribution of clusters is studied as the dimensionality of the problem evolves through the extension of the feature space using monomial terms of increasing degree in the original features, which corresponds to investigating non-linear point separations in the projection space.

A statistical framework is introduced to detect patterns of dependence between the clusters formed with the features (predictors) and a chosen outcome (response) in the data that is not used by the clustering method. This framework is designed to detect the existence of a relationship between the predictors and response. This framework can also serve as an alternative cluster validation tool.

The concepts and methods developed in this dissertation are applied to a real world data analysis problem in Engineering Education. Specifically, engineering students' Habits of Mind are analyzed. The data at hand is qualitative, in the form of text, equations and figures. To use the n-TARP based analysis method, the source data must be transformed into quantitative data (vectors). This is done by modeling it as a random process based on the theoretical framework defined by a rubric. Since the number of students is small, this problem falls into the small sample high-dimensions scenario. The n-TARP clustering method is used to find groups within this data in a statistically valid manner. The resulting clusters are analyzed in the context of education to determine what is represented by the identified clusters. The dependence of student performance indicators like the course grade on the clusters formed with n-TARP are studied in the pattern dependence framework, and the observed effect is statistically validated. The data obtained suggests the presence of a large variety of different patterns of Habits of Mind among students, many of which are associated with significant grade differences. In particular, the course grade is found to be dependent on at least two Habits of Mind: "computation and estimation" and "values and attitudes."
APA, Harvard, Vancouver, ISO, and other styles