Dissertations / Theses on the topic 'Data center'

Consult the top 50 dissertations / theses for your research on the topic 'Data center.'

1

Wiswell, Shane. "Data center migration." [Denver, Colo.] : Regis University, 2007. http://165.236.235.140/lib/SWiswell2007.pdf.

2

Sehery, Wile Ali. "OneSwitch Data Center Architecture." Diss., Virginia Tech, 2018. http://hdl.handle.net/10919/94376.

Abstract:
In the last two decades, data center networks have evolved to become a key element in improving levels of productivity and competitiveness for different types of organizations. Traditionally, data center networks have been constructed with three layers of switches: Edge, Aggregation, and Core. Although this Three-Tier architecture has worked well in the past, it poses a number of challenges for current and future data centers. Data centers today have evolved to support dynamic resources such as virtual machines and storage volumes from any physical location within the data center. This has led to highly volatile and unpredictable traffic patterns. Also, the emergence of "Big Data" applications that exchange large volumes of information has created large persistent flows that need to coexist with other traffic flows. The Three-Tier architecture and current routing schemes are no longer sufficient for achieving high bandwidth utilization. Data center networks should be built in a way where they can adequately support virtualization and cloud computing technologies. Data center networks should provide services such as simplified provisioning, workload mobility, dynamic routing and load balancing, and equidistant bandwidth and latency. As data center networks have evolved, the Three-Tier architecture has proven to be a challenge not only in terms of complexity and cost, but it also falls short of supporting many new data center applications. In this work we propose OneSwitch: a switch architecture for the data center. OneSwitch is backward compatible with current Ethernet standards and uses an OpenFlow central controller, a Location Database, a DHCP Server, and a Routing Service to build an Ethernet fabric that appears as one switch to end devices. This allows the data center to use switches in scale-out topologies to support hosts in a plug-and-play manner, as well as provide much-needed services such as dynamic load balancing, intelligent routing, seamless mobility, and equidistant bandwidth and latency.
PHD
3

Sergejev, Ivan. "Exposing the Data Center." Thesis, Virginia Tech, 2014. http://hdl.handle.net/10919/51838.

Abstract:
Given the rapid growth in the importance of the Internet, data centers - the buildings that store information on the web - are quickly becoming the most critical infrastructural objects in the world. However, so far they have received very little, if any, architectural attention. This thesis proclaims data centers to be the 'churches' of the digital society and proposes a new type of publicly accessible data center. The thesis starts with a brief overview of the history of data centers and the Internet in general, leading to a manifesto for making data centers into public facilities with an architecture of their own. The paper then proposes a roadmap for the possible future development of the building type, with suggestions for placing future data centers in urban environments, incorporating public programs as a part of the building program, and optimizing the inside workings of a typical data center. The final part of the work concentrates on a design for an exemplary new data center, buildable with currently available technologies. This thesis aims to: 1) change the public perception of the Internet as a non-physical thing, and of data centers as purely functional infrastructural objects without any deeper cultural significance, and 2) propose a new architectural language for the type.
Master of Architecture
4

Wang, Qinjin. "Multi Data center Transaction Chain : Achieving ACID for cross data center multi-key transactions." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-198664.

Abstract:
Transaction support for geo-replicated storage systems has been one of the most prominent challenges of the last few years. Some systems gave up on supporting transactions and left it to the upper application layer to handle them, while other systems tried different solutions for guaranteeing the correctness of transactions and put some effort into performance improvements. However, there are very few systems that claim to support ACID at global scale. In this thesis, we have studied various data consistency and transaction design theories such as Paxos, transaction chopping and transaction chains, and we have analyzed several recent distributed transactional systems. As a result, a geo-replicated transactional framework, namely Multi Data center Transaction Chain (MDTC), is designed and implemented. MDTC adopts a transaction chopping approach, which brings more concurrency by chopping transactions into pieces. A two-phase traversal mechanism is designed to validate and maintain dependencies. For cross-data-center consistency, a Paxos-like majority vote protocol is designed and implemented as a state machine. Moreover, tuning such as executing read-only transactions locally helps to improve the performance of MDTC in different scenarios. MDTC requires only one cross-data-center message round trip to execute a distributed transaction globally, and ACID properties are preserved. We have evaluated MDTC with an extended TPC-C benchmark on top of Cassandra. The results from various setups show that MDTC achieves good throughput and latency, has a very low abort rate, and scales well for transactions executed at global scale.
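For readers unfamiliar with majority-vote commit protocols, the following minimal Python sketch illustrates the quorum idea the abstract refers to. It is an illustration only, not code from the thesis, and the helper send_prepare is hypothetical.

# Illustrative sketch of a majority-vote commit decision across data centers.
# This is NOT the actual MDTC implementation; send_prepare() is a hypothetical
# helper returning True if a data center accepts the transaction.

def commit_with_majority(transaction, data_centers, send_prepare):
    """Commit iff a majority of data centers accept in one round trip."""
    votes = sum(1 for dc in data_centers if send_prepare(dc, transaction))
    quorum = len(data_centers) // 2 + 1
    return votes >= quorum

if __name__ == "__main__":
    # Example: 3 data centers, two of which accept -> the transaction commits.
    accepts = {"dc-eu": True, "dc-us": True, "dc-asia": False}
    decision = commit_with_majority(
        transaction={"id": 1},
        data_centers=list(accepts),
        send_prepare=lambda dc, txn: accepts[dc],
    )
    print("commit" if decision else "abort")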
5

Talarico, Gui. "Urban Data Center: An Architectural Celebration of Data." Thesis, Virginia Tech, 2011. http://hdl.handle.net/10919/42855.

Abstract:
Throughout the last century, the popularization of the automobile and the development of roads and highways have changed the way we live and how cities develop. Bridges, aqueducts, and power plants had comparable impact in the past. I consider each of these examples to be 'icons' of infrastructures that we humans build to improve our living environments and to fulfill our urge to become better. Fast forward to now. The last decades showed us the development of new sophisticated networks that connect people and continents. Communication grids, satellite communication, high-speed fiber optics and many other technologies have made possible the existence of the ultimate human network - the internet. A network created by us to satisfy our needs to connect, to share, to socialize and communicate over distances never before imagined. The data center is the icon of this network. Through modern digitalization methods, text, sounds, images, and knowledge can be converted into zeros and ones and distributed almost instantly to all corners of the world. The data center is the centerpiece in the storage, processing, and distribution of this data. The Urban Data Center hopes to bring this icon closer to its creators and users. Let us celebrate its existence and shed some light into the inner workings of the world's largest network. Let the users that inhabit this critical network come inside of it and understand where it lives. This thesis explores the expressive potential of networks and data through the design of a data center in Washington, DC.
Master of Architecture
6

Müller, Thomas. "Innovative Technologien im Data Center." Universitätsbibliothek Chemnitz, 2009. http://nbn-resolving.de/urn:nbn:de:bsz:ch1-200900947.

Abstract:
This talk presents the architecture of the TU Chemnitz data center, which is based on the new technologies Data Center Bridging (DCB) and Fibre Channel over Ethernet (FCoE). The corresponding standards are described and an overview of the currently available technology is given. The computing center of TU Chemnitz already uses these technologies successfully in the context of VMware virtualization and for operating I/O-intensive systems.
7

Bjarnadóttir, Margrét Vilborg. "Data-driven approach to health care : applications using claims data." Thesis, Massachusetts Institute of Technology, 2008. http://hdl.handle.net/1721.1/45946.

Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2008.
Includes bibliographical references (p. 123-130).
Large population health insurance claims databases together with operations research and data mining methods have the potential of significantly impacting health care management. In this thesis we research how claims data can be utilized in three important areas of health care and medicine and apply our methods to a real claims database containing information of over two million health plan members. First, we develop forecasting models for health care costs that outperform previous results. Secondly, through examples we demonstrate how large-scale databases and advanced clustering algorithms can lead to discovery of medical knowledge. Lastly, we build a mathematical framework for a real-time drug surveillance system, and demonstrate with real data that side effects can be discovered faster than with the current post-marketing surveillance system.
by Margrét Vilborg Bjarnadóttir.
Ph.D.
8

Le, Guen Thibault. "Data-driven pricing." Thesis, Massachusetts Institute of Technology, 2008. http://hdl.handle.net/1721.1/45627.

Abstract:
Thesis (S.M.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2008.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Includes bibliographical references (p. 143-146).
In this thesis, we develop a pricing strategy that enables a firm to learn the behavior of its customers as well as optimize its profit in a monopolistic setting. The single product case as well as the multi product case are considered under different parametric forms of demand, whose parameters are unknown to the manager. For the linear demand case in the single product setting, our main contribution is an algorithm that guarantees almost sure convergence of the estimated demand parameters to the true parameters. Moreover, the pricing strategy is also asymptotically optimal. Simulations are run to study the sensitivity to different parameters. Using our results on the single product case, we extend the approach to the multi product case with linear demand. The pricing strategy we introduce is easy to implement and guarantees not only learning of the demand parameters but also maximization of the profit. Finally, other parametric forms of the demand are considered. A heuristic that can be used for many parametric forms of the demand is introduced, and is shown to have good performance in practice.
by Thibault Le Guen.
S.M.
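As a rough illustration of the kind of learn-and-earn pricing loop described in the abstract (a generic sketch, not the thesis's algorithm; the demand parameters and noise level below are invented), one can alternate between least-squares estimation of a linear demand curve and pricing near the estimated optimum:

# Sketch: estimate linear demand d = a - b*p from noisy observations and
# price near the estimated profit-maximizing point p* = a / (2*b), with a
# small random perturbation so the estimates keep improving.
import numpy as np

rng = np.random.default_rng(0)
true_a, true_b = 100.0, 2.0                      # unknown to the seller
prices = [1.0, 20.0]                             # two initial exploratory prices
demands = [true_a - true_b * p + rng.normal(0, 1) for p in prices]

for t in range(50):
    X = np.column_stack([np.ones(len(prices)), -np.array(prices)])
    a_hat, b_hat = np.linalg.lstsq(X, np.array(demands), rcond=None)[0]
    p_next = a_hat / (2 * max(b_hat, 1e-6)) + rng.normal(0, 0.5)  # explore a little
    prices.append(p_next)
    demands.append(true_a - true_b * p_next + rng.normal(0, 1))

print(f"estimated (a, b) = ({a_hat:.1f}, {b_hat:.1f}), last posted price = {p_next:.2f}")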
9

Javanshir, Marjan. "DC distribution system for data center." Thesis, Click to view the E-thesis via HKUTO, 2007. http://sunzi.lib.hku.hk/hkuto/record/B39344952.

10

Bennion, Laird. "Identifying data center supply and demand." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/103457.

Abstract:
Thesis: S.M. in Real Estate Development, Massachusetts Institute of Technology, Program in Real Estate Development in conjunction with the Center for Real Estate, 2016.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 66-69).
This thesis documents new methods for gauging supply and demand of data center capacity and addresses issues surrounding potential threats to data center demand. The document begins with a primer on the composition and engineering of a current data center, moves to a discussion of issues surrounding data center demand, Moore's Law and cloud computing, and then transitions to a presentation of research on data center demand and supply.
by Laird Bennion.
S.M. in Real Estate Development
11

Mahood, Christian. "Data center design & enterprise networking /." Online version of thesis, 2009. http://hdl.handle.net/1850/8699.

12

Soares, Maria José. "Data center - a importância de uma arquitectura." Master's thesis, Universidade de Évora, 2011. http://hdl.handle.net/10174/11604.

Abstract:
This work presents an overview-style study addressing the importance of architecture in Data Centers. The main critical factors to consider in an architecture were identified, as well as the best practices to implement, in order to assess the value of certification by the certifying body Uptime Institute. The possible interest in extending this certification/qualification to human resources is also discussed, as a guarantee of service quality and as a marketing strategy. To support this work, a case study was carried out on a universe of seven Data Centers in Portugal, belonging to the public and private sectors, allowing the verification and comparison of good practices as well as of the less positive aspects to consider within this area. Finally, some reflections are offered on what the trend in the evolution of Data Centers may be from a quality perspective.
13

Pipkin, Everest R. "It Was Raining in the Data Center." Research Showcase @ CMU, 2018. http://repository.cmu.edu/theses/138.

Abstract:
Stemming from a 2011 incident inside of a Facebook data facility in which hyper-cooled air formed a literal (if somewhat transient) rain cloud in the stacks, It was raining in the data center examines ideas of non-places and supermodernity applied to contemporary network infrastructure. It was raining in the data center argues that the problem of the rain cloud is as much a problem of psychology as it is a problem of engineering. Although humidity-management is a predictable snag for any data center, the cloud was a surprise; a self-inflicted side-effect of a strategy of distance. The rain cloud was a result of the same rhetoric of ephemerality that makes it easy to imagine the inside of a data center to be both everywhere and nowhere. This conceit of internet data being placeless shares roots with Marc Augé’s idea of non-places (airports, highways, malls), which are predicated on the qualities of excess and movement. Without long-term inhabitants, these places fail to tether themselves to their locations, instead existing as markers of everywhere. Such a premise allows the internet to exist as an other-space that is not conceptually beholden to the demands of energy and landscape. It also liberates the idea of ‘the network’ from a similar history of industry. However, the network is deeply rooted in place, as well as in industry and transit. Examining the prevalence of network overlap in American fiber-optic cabling, it becomes easy to trace routes of cables along major US freight train lines and the US interstate highway system. The historical origin of this network technology is in weaponization and defense, from highways as a nuclear-readiness response to ARPANET’s Pentagon-based funding. Such a linkage with the military continues today, with data centers likely to be situated near military installations, sharing similar needs: electricity, network connectivity, fair climate, space, and invisibility. We see the repetition of militarized tropes across data structures. Fiber-optic network locations are kept secret; servers are housed in cold-war bunkers; data centers nest next to military black-sites. Similarly, Augé reminds us that non-places are a particular target of terrorism, populated as they are with cars, trains, drugs and planes that turn into weapons. When the network itself is at threat of weaponization, the effect is an ambient and ephemeral fear; a paranoia made of over-connection.
14

Johansson, Jennifer. "Cooling storage for 5G EDGE data center." Thesis, Luleå tekniska universitet, Institutionen för teknikvetenskap och matematik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-79126.

Abstract:
Data centers require a lot of energy, since in addition to the buildings themselves they contain servers, cooling equipment, IT equipment and power equipment. Compressor-based cooling, the solution used in many data centers around the world today, accounts for around 40% of the total energy consumption. A non-compressor-based alternative that is used in some data centers, and is still being researched, is free cooling, in which outside air is used to cool the data center; its two main technologies are airside free cooling and waterside free cooling. The purpose of this master thesis is to analyze two types of coils provided by Bensby Rostfria, one corrugated and one smooth, to investigate whether free cooling with one of these coils could be used in a 5G EDGE data center in Luleå. The investigation is carried out for the warmest day in summer, because according to weather data Luleå is a candidate location for this type of cooling system. The project was carried out at RISE ICE Datacenter, where two identical systems were built next to each other with two corrugated hoses of different diameters and two smooth tubes of different diameters. The measured variables were the ambient temperature in the data hall, the water temperature in both water tanks, the temperature out of the system, the temperature into the system, and the mass flow of the air passing through the system. First, fan curves were produced to make it easier to choose which fan input voltages were of interest for further analysis; three points were then taken where the fan curve increased the most. The tests were performed by placing the corrugated hoses and smooth tubes in each of the water tanks, which were filled with cold water. The coils were to warm the water from 4.75 °C to 9.75 °C, since the temperature in the data center was around 15 °C; this particular temperature rise was chosen because free cooling is considered to require a temperature difference of at least 5 °C. The tests were repeated three times to obtain more reliable results. All data were collected in Zabbix and analyzed further in Grafana; after each test the files were exported from Grafana to Excel for compilation and then to Matlab for further analysis. The first step of the analysis was to check whether the three tests with the same input voltage gave similar results for the water temperature in the tank and the temperature out of the system. Trendlines were then built to investigate the temperature difference across the system, the temperature difference relative to the water temperature in the tank, the mass flow and the cooling power. These trendlines were compared with each other using 2D plots of cooling power against the temperature difference between the inlet and the water, so that the two coils could be compared to see which one gave the largest cooling power and would be the most efficient to install in a future 5G data center module. The conclusion of this master thesis is that the corrugated hose gives a higher cooling power when the outside temperature difference is larger, but on the warmest summer day it was clearly the smooth tube that gave the largest cooling power and therefore the best result.
Hand calculations also gave, for each coil type, the length of coil needed to cool down the 5G module and the required water tank volume. It was also shown that, for the warmest summer day, a water tank temperature of 24 °C is best, compared to 20 °C and 18 °C. With a water tank temperature of 24 °C, the corrugated hose requires a length of 1.8 km and a water tank of 9.4 m3, while the smooth tube requires a length of 1.7 km and a water tank volume of 12 m3. As can be seen throughout this project, this type of cooling equipment is not the most efficient for the warmest summer day, but could easily be used during other seasons.
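For context, hand calculations of coil cooling power like those mentioned above typically rest on the sensible-heat relation Q = m_dot * cp * dT; the short Python sketch below shows the arithmetic with invented example numbers (not values from the thesis):

# Minimal sketch of the sensible-heat relation used in such hand calculations.
# The mass flow and temperatures are made-up illustration values.
cp_air = 1005.0          # J/(kg*K), specific heat of air near room temperature
m_dot = 0.4              # kg/s, assumed air mass flow through the coil
t_in, t_out = 15.0, 9.0  # degC, assumed air temperature into and out of the coil
q_watts = m_dot * cp_air * (t_in - t_out)
print(f"Cooling power: {q_watts / 1000:.2f} kW")   # 0.4 * 1005 * 6 = 2.41 kW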
15

Li, Yi. "Speaker Diarization System for Call-center data." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286677.

Abstract:
To answer the question "who spoke when", speaker diarization (SD) is a critical step for many practical speech applications. The task of our project is to build an MFCC-vector-based speaker diarization system on top of a speaker verification (SV) system, an existing call-center application that checks a customer's identity from a phone call. Our speaker diarization system uses 13-dimensional MFCCs as features and performs voice activity detection (VAD), segmentation, linear clustering, and hierarchical clustering based on GMMs and the BIC score. By applying it, we decrease the Equal Error Rate (EER) of the SV system from 18.1% in the baseline experiment to 3.26% on general call-center conversations. To better analyze and evaluate the system, we also simulated a set of call-center data based on the public ICSI corpus audio database.
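As background for the GMM/BIC clustering step mentioned in the abstract, the following Python sketch shows a generic delta-BIC merge test between two segments of MFCC frames (a textbook formulation, not the exact criterion implemented in the thesis):

# Illustrative delta-BIC test: merge two segments if modelling them with one
# Gaussian is cheaper (after the BIC penalty) than with two separate Gaussians.
import numpy as np

def delta_bic(x, y, lam=1.0):
    """Return delta-BIC for merging segments x and y (rows = MFCC frames).
    A negative value favours merging (likely the same speaker)."""
    n1, n2 = len(x), len(y)
    n, d = n1 + n2, x.shape[1]
    z = np.vstack([x, y])
    logdet = lambda s: np.linalg.slogdet(np.cov(s, rowvar=False))[1]
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    return 0.5 * (n * logdet(z) - n1 * logdet(x) - n2 * logdet(y)) - penalty

# Example with 13-dimensional MFCC-like vectors (synthetic data)
rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=(200, 13))
b = rng.normal(0.2, 1.0, size=(200, 13))
print("merge" if delta_bic(a, b) < 0 else "keep separate")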
16

LeBlanc, Robert-Lee Daniel. "Analysis of Data Center Network Convergence Technologies." BYU ScholarsArchive, 2014. https://scholarsarchive.byu.edu/etd/4150.

Abstract:
The networks in traditional data centers have remained unchanged for decades and have grown large, complex and costly. Many data centers have a general-purpose Ethernet network and one or more additional specialized networks for storage or high-performance, low-latency applications. Network convergence promises to lower the cost and complexity of the data center network by virtualizing the different networks onto a single wire. There is little evidence, aside from vendors' claims, that validates that network convergence actually achieves these goals. This work defines a framework for creating a series of unbiased tests to validate converged technologies and compare them to traditional configurations. A case study involving two different network convergence technologies was developed to validate the defined methodology and framework. The study also shows that these two technologies do indeed perform similarly to non-virtualized networks, reduce cost, cabling and power consumption, and are easy to operate.
17

Desmouceaux, Yoann. "Network-Layer Protocols for Data Center Scalability." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLX011/document.

Abstract:
With the growth of demand for computing resources, data center architectures are growing both in scale and in complexity. In this context, this thesis takes a step back compared to traditional network approaches, and shows that providing generic primitives directly within the network layer is an effective way to improve the efficiency of resource usage and to decrease network traffic and management overhead. Using two recently introduced network architectures, Segment Routing (SR) and Bit-Indexed Explicit Replication (BIER), network-layer protocols are designed and analyzed to provide three high-level functions: (1) task mobility, (2) reliable content distribution and (3) load balancing. First, task mobility is achieved by using SR to provide a zero-loss virtual machine migration service. This opens the opportunity to study how to orchestrate task placement and migration while aiming at (i) maximizing the inter-task throughput, while (ii) maximizing the number of newly placed tasks, but (iii) minimizing the number of tasks to be migrated. Second, reliable content distribution is achieved by using BIER to provide a reliable multicast protocol, in which retransmissions of lost packets are targeted towards the precise set of destinations having missed that packet, thus incurring minimal traffic overhead. To decrease the load on the source link, this is then extended to enable retransmissions by local peers from the same group, with SR as a helper to find a suitable retransmission candidate. Third, load balancing is achieved by using SR to distribute queries through several application candidates, each of which takes a local decision on whether to accept them, thus achieving better fairness compared to centralized approaches. The feasibility of a hardware implementation of this approach is investigated, and a solution using covert channels to transparently convey information to the load balancer is implemented for a state-of-the-art programmable network card. Finally, the possibility of providing autoscaling as a network service is investigated: by letting queries go through a fixed chain of applications using SR, autoscaling is triggered by the last instance, depending on its local state.
18

RUIU, PIETRO. "Energy Management in Large Data Center Networks." Doctoral thesis, Politecnico di Torino, 2018. http://hdl.handle.net/11583/2706336.

Abstract:
In the era of digitalization, one of the most challenging research topics concerns reducing the energy consumption of ICT equipment in order to counter global climate change. The ICT world is very sensitive to the problem of greenhouse gas (GHG) emissions and for several years has been implementing countermeasures to reduce wasted consumption and increase the efficiency of its infrastructure: the total embodied emissions of end-use devices have decreased significantly, networks have become more energy efficient, and trends such as virtualization and dematerialization will continue to make equipment more efficient. One of the main contributors to GHG emissions is the data center industry, which provisions end users with the computing and communication resources needed to access the vast majority of services online and on a pay-as-you-go basis. Data centers require a tremendous amount of energy to operate; since the efficiency of cooling systems is increasing, more research effort should be put into making the IT system green, as it is becoming the major contributor to energy consumption. Since the network is one of the non-negligible contributors to energy consumption in data centers, several architectures have been designed with the goal of improving the energy efficiency of data centers. These architectures are called Data Center Networks (DCNs) and provide interconnections among the computing servers and between the servers and the Internet, according to specific layouts. In my PhD I have extensively investigated the energy efficiency of data centers, working on different projects that tackle the problem from different angles. The research can be divided into two main parts, with energy proportionality as the connecting theme. The main focus of the work is the trade-off between size and energy efficiency of data centers, with the aim of finding a relationship between scalability and energy proportionality. In this regard, the energy consumption of different data center architectures has been analyzed while varying their size in terms of the number of servers and switches. Extensive simulation experiments, performed in small- and large-scale scenarios, unveil the ability of network-aware allocation policies to load the data center in an energy-proportional manner, and the robustness of classical two- and three-tier designs under network-oblivious allocation strategies. The concept of energy proportionality, applied to the whole DCN and used as an efficiency metric, is one of the main contributions of the work. Energy proportionality is a property defining the degree of proportionality between load and the energy spent to support that load; devices are energy proportional when any increase in load corresponds to a proportional increase in energy consumption. A peculiar feature of our analysis is that it considers the whole data center, i.e., both computing and communication devices are taken into account. Our methodology consists of an asymptotic analysis of data center consumption as its size (in terms of servers) becomes very large. In our analysis, we investigate the impact of three different allocation policies on the energy proportionality of computing and networking equipment for different DCNs, including 2-Tier, 3-Tier and Jupiter topologies. For the evaluation, the size of the DCNs is varied to accommodate up to several thousands of computing servers. Validation of the analysis is conducted through simulations.
We propose new metrics with the objective of characterizing the energy proportionality of data centers in a holistic manner. The experiments unveil that, when consolidation policies are in place and regardless of the type of architecture, the size of the DCN plays a key role, i.e., larger DCNs containing thousands of servers are more energy proportional than small DCNs.
19

Shioda, Romy 1977. "Integer optimization in data mining." Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/17579.

Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2003.
Includes bibliographical references (p. 103-107).
While continuous optimization methods have been widely used in statistics and data mining over the last thirty years, integer optimization has had very limited impact in statistical computation. Thus, our objective is to develop a methodology utilizing state of the art integer optimization methods to exploit the discrete character of data mining problems. The thesis consists of two parts: The first part illustrates a mixed-integer optimization method for classification and regression that we call Classification and Regression via Integer Optimization (CRIO). CRIO separates data points in different polyhedral regions. In classification each region is assigned a class, while in regression each region has its own distinct regression coefficients. Computational experimentation with real data sets shows that CRIO is comparable to and often outperforms the current leading methods in classification and regression. The second part describes our cardinality-constrained quadratic mixed-integer optimization algorithm, used to solve subset selection in regression and portfolio selection in asset allocation. We take advantage of the special structures of these problems by implementing a combination of implicit branch-and-bound, Lemke's pivoting method, variable deletion and problem reformulation. Testing against popular heuristic methods and CPLEX 8.0's quadratic mixed-integer solver, we see that our tailored approach to these quadratic variable selection problems have significant advantages over simple heuristics and generalized solvers.
by Romy Shioda.
Ph.D.
20

Ehret, Anna. "Entwicklung und Evaluation eines Förder-Assessment-Centers für Mitarbeiter der internationalen Jugendarbeit (FAIJU)." Berlin wvb, Wiss. Verl, 2006. http://www.wvberlin.de/data/inhalt/ehret.htm.

21

Green, George Michael. "Reducing Peak Power Consumption in Data Centers." The Ohio State University, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=osu1386068818.

22

Cheung, Wang Chi. "Data-driven algorithms for operational problems." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/108916.

Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2017.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 173-180).
In this thesis, we propose algorithms for solving revenue maximization and inventory control problems in data-driven settings. First, we study the choice-based network revenue management problem. We propose the Approximate Column Generation heuristic (ACG) and Potential Based algorithm (PB) for solving the Choice-based Deterministic Linear Program, an LP relaxation to the problem, to near-optimality. Both algorithms only assume the ability to approximate the underlying single period problem. ACG inherits the empirical efficiency from the Column Generation heuristic, while PB enjoys provable efficiency guarantee. Building on these tractability results, we design an earning-while-learning policy for the online problem under a Multinomial Logit choice model with unknown parameters. The policy is efficient, and achieves a regret sublinear in the length of the sales horizon. Next, we consider the online dynamic pricing problem, where the underlying demand function is not known to the monopolist. The monopolist is only allowed to make a limited number of price changes during the sales horizon, due to administrative constraints. For any integer m, we provide an information theoretic lower bound on the regret incurred by any pricing policy with at most m price changes. The bound is the best possible, as it matches the regret upper bound incurred by our proposed policy, up to a constant factor. Finally, we study the data-driven capacitated stochastic inventory control problem, where the demand distributions can only be accessed through sampling from offline data. We apply the Sample Average Approximation (SAA) method, and establish a polynomial size upper bound on the number of samples needed to achieve a near-optimal expected cost. Nevertheless, the underlying SAA problem is shown to be #P hard. Motivated by the SAA analysis, we propose a randomized polynomial time approximation scheme which also uses polynomially many samples. To complement our results, we establish an information theoretic lower bound on the number of samples needed to achieve near optimality.
by Wang Chi Cheung.
Ph. D.
23

Snyder, Ashley M. (Ashley Marie). "Data mining and visualization : real time predictions and pattern discovery in hospital emergency rooms and immigration data." Thesis, Massachusetts Institute of Technology, 2010. http://hdl.handle.net/1721.1/61199.

Abstract:
Thesis (S.M.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2010.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 163-166).
Data mining is a versatile and expanding field of study. We show the applications and uses of a variety of techniques in two very different realms: Emergency department (ED) length of stay prediction and visual analytics. For the ED, we investigate three data mining techniques to predict a patient's length of stay based solely on the information available at the patient's arrival. We achieve good predictive power using Decision Tree Analysis. Our results show that by using main characteristics about the patient, such as chief complaint, age, time of day of the arrival, and the condition of the ED, we can predict overall patient length of stay to specific hourly ranges with an accuracy of 80%. For visual analytics, we demonstrate how to mathematically determine the optimal number of clusters for a geospatial dataset containing both numeric and categorical data and then how to compare each cluster to the entire dataset as well as consider pairwise differences. We then incorporate our analytical methodology in visual display. Our results show that we can quickly and effectively measure differences between clusters and we can accurately find the optimal number of clusters in non-noisy datasets.
by Ashley M. Snyder.
S.M.
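To make the decision-tree idea concrete, here is a minimal sketch (assuming scikit-learn is available) of predicting an hourly length-of-stay range from arrival-time features; the feature set and toy data are invented for illustration and are not taken from the thesis:

# Sketch: classify a patient's length-of-stay range from information known at arrival.
from sklearn.tree import DecisionTreeClassifier

# columns: [age, hour_of_arrival, patients_currently_waiting, chief_complaint_code]
X = [[34, 14, 10, 3], [71, 2, 25, 1], [25, 9, 5, 2], [60, 23, 30, 1], [45, 11, 8, 3]]
y = ["0-2h", "4-6h", "0-2h", "6-8h", "2-4h"]   # length-of-stay ranges (toy labels)

model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(model.predict([[50, 13, 12, 1]]))        # predicted range for a new arrival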
24

König, Ralf. "HP UDC - Standardizing and Automizing Data Center Operations." Universitätsbibliothek Chemnitz, 2004. http://nbn-resolving.de/urn:nbn:de:swb:ch1-200400394.

Abstract:
This presentation covers some general facts about the Utility Data Center (UDC) by HP, as well as the results of my practical work at HP Labs in preparation for my diploma thesis.
Workshop "Netz- und Service-Infrastrukturen"
25

Zhuang, Hao. "Performance Evaluation of Virtualization in Cloud Data Center." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-104206.

Abstract:
Amazon Elastic Compute Cloud (EC2) has been adopted by a large number of small and medium enterprises (SMEs), e.g. foursquare, Monster World, and Netflix, to provide various kinds of services. There is existing work in the literature investigating the variation and unpredictability of cloud services. These works report interesting observations about cloud offerings, but they fail to reveal the underlying causes of the varying behavior of cloud services. In this thesis, we looked into the underlying scheduling mechanisms and hardware configurations of Amazon EC2, and investigated their impact on the performance of the virtual machine instances running on top of them. Specifically, several instance types from the standard and high-CPU instance families are covered to shed light on the hardware upgrades and replacements in Amazon EC2; the large instance type from the standard family is then selected for a focused analysis. To better understand the various behaviors of the instances, a local cluster environment was set up, consisting of two Intel Xeon servers, using different scheduling algorithms. Through a series of benchmark measurements, we made the following findings: (1) Amazon utilizes highly diversified hardware to provision different instances. This results in significant performance variation, which can reach up to 30%. (2) Two different scheduling mechanisms were observed: one is similar to the Simple Earliest Deadline First (SEDF) scheduler, whilst the other resembles the Credit scheduler in the Xen hypervisor. These two scheduling mechanisms also give rise to variations in performance. (3) By applying a simple "trial-and-failure" instance selection strategy, the cost saving is surprisingly significant. Given a certain distribution of fast and slow instances, the achievable cost saving can reach 30%, which is attractive to SMEs using the Amazon EC2 platform.
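A minimal Python sketch of the "trial-and-failure" instance selection idea described above; launch_instance, benchmark_score and terminate_instance are hypothetical helpers standing in for real cloud API calls, and the threshold is arbitrary:

# Sketch: benchmark a freshly launched instance and keep it only if it is fast
# enough, otherwise discard it and try again. All helpers are hypothetical.

def acquire_fast_instance(launch_instance, benchmark_score, terminate_instance,
                          threshold, max_trials=5):
    for _ in range(max_trials):
        instance = launch_instance("m1.large")
        if benchmark_score(instance) >= threshold:
            return instance            # fast hardware: keep it
        terminate_instance(instance)   # slow hardware: pay a little, try again
    return None

if __name__ == "__main__":
    # Toy demonstration with stubbed-out helpers: first trial slow, second fast.
    scores = iter([55, 93])
    inst = acquire_fast_instance(
        launch_instance=lambda itype: {"type": itype},
        benchmark_score=lambda i: next(scores),
        terminate_instance=lambda i: None,
        threshold=80,
    )
    print(inst)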
26

Mohammadnezhad, Mahdi. "Evaluating Stream Protocol for a Data Stream Center." Thesis, Linnéuniversitetet, Institutionen för datavetenskap (DV), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-55761.

Abstract:
Linnaeus University is aiming to implement a Data Stream Centre to provide streaming of data accumulated from newspaper and article websites, in order to give the University's scientists faster and easier access to these data. The project consists of multiple parts, and the part we are responsible for is to first nominate some text streaming protocols based on the criteria that are important for Linnaeus University, and then to evaluate them. These protocols are responsible for transferring the text stream from the robots (which read articles from the websites) to the data stream center, and from there to the scientists. Some KPIs (Key Performance Indicators) are defined and the protocols are evaluated based on those KPIs. In this study we address the evaluation of network streaming protocols by first studying the protocols' specifications and nominating four protocols: TCP, HTTP/1.1, Server-Sent Events and WebSocket. Then, a fake robot and server are implemented with each protocol to simulate the functionality of the real robots, servers and scientists in the LNU data stream center project. The evaluation is then carried out in this simulated environment using RawCAP, Wireshark and Message Analyzer. The results of this study indicate that the best-suited protocols for transferring text stream data from robot to data stream center, and from data stream center to scientist, are TCP and Server-Sent Events, respectively. In the concluding part, other protocols are also suggested in order of priority.
27

He, Chunzhi, and 何春志. "Load-balanced switch design and data center networking." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2014. http://hdl.handle.net/10722/198826.

Abstract:
High-speed routers and high-performance data centers share a common system-level architecture in which multiple processing nodes are connected by an interconnection network for high-speed communications. Load balancing is an important technique for maximizing throughput and minimizing delay of the interconnection network. In this thesis, efficient load balancing schemes are designed and analyzed for next-generation routers and data centers. In high-speed router design, two preferred switch architectures are input-queued switch and load-balanced switch. In an input-queued switch, time-domain load balancing can be carried out by an iterative algorithm that schedules packets for sending in different time slots. The complexity of an iterative algorithm increases rapidly with the number of scheduling iterations. To address this problem, a single-iteration scheduling algorithm called D-LQF is designed, in which exhaustive service policy is adopted for reusing the matched input-output pairs in the previous time slots to grow the match size. Unlike an input-queued switch, a load-balanced switch consists of two stages of crossbar switch fabrics, where load balancing is carried out in both time and space domains. Among various load-balanced switches, the feedback-based switch gives the best delay-throughput performance. In this thesis, the feedback-based switch is enhanced in three aspects. Firstly, we focus on reducing its switch fabric complexity. Instead of using crossbars, a dual-banyan network is proposed. The complexity of dual-banyan can be further reduced by merging the two banyans to form a Clos network, resulting in a Clos-banyan network. Secondly, we target at improving the delay performance of the feedback-based switch. A Clos-feedback switch architecture is devised where each switch module in the Clos network is a small feedback-based switch. With application-flow based load balancing, packet order is ensured and the average packet delay is reduced from O(N) to O(n), where N and n are the switch and switch module sizes, respectively. Thirdly, we extend the feedback-based switch to support multicast traffic. Based on the notion of pointer-based multicast VOQ, an efficient multicast scheduling algorithm with packet replication at the middle-stage ports only is proposed. In order to provide close-to-100% throughput for any admissible multicast traffic patterns, a three-stage implementation of feedback-based switch is also designed. In designing load balancing schemes for data centers, we focus on the most popular fat-tree based data centers. Notably, packet-based load balancing is widely considered infeasible for data centers. This is because the associated packet out-of-order problem will cause unnecessary TCP fast retransmits, and as a result, severely undermine TCP performance. In this thesis, we show that if packet-based load balancing is performed properly, the packet out-of-order problem can be easily addressed by slightly increasing the number of duplicate ACKs required for triggering fast retransmit. Admittedly, in case of a real packet loss, the loss recovery time will be increased. But our simulation results show that such an increase is far less than the reduction in the network queueing delay (due to a better load-balanced network). As compared to a flow-based load balancing scheme, our packet-based scheme consistently provides significantly higher goodput and noticeably smaller delay.
Electrical and Electronic Engineering
Doctoral
Doctor of Philosophy
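As a toy illustration of the difference between flow-based and packet-based load balancing discussed in the abstract (not the scheme evaluated in the thesis), the sketch below sprays each packet over equal-cost uplinks in round-robin order, ignoring which flow it belongs to:

# Sketch: per-packet spraying across equal-cost uplinks. A flow-based scheme
# would instead hash the flow identifier so all packets of a flow share one uplink.
import itertools

class PacketSprayer:
    def __init__(self, uplinks):
        self._next = itertools.cycle(uplinks)

    def pick_uplink(self, packet):
        return next(self._next)        # ignore the flow: one decision per packet

sprayer = PacketSprayer(["uplink-0", "uplink-1", "uplink-2", "uplink-3"])
for seq in range(6):
    print(seq, sprayer.pick_uplink({"flow": "A", "seq": seq}))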
28

Mitteff, Eric. "AUTOMATED ADAPTIVE DATA CENTER GENERATION FOR MESHLESS METHODS." Master's thesis, University of Central Florida, 2006. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/2635.

Abstract:
Meshless methods have recently received much attention but are yet to reach their full potential, as the required problem setup (i.e. collocation point distribution) is still significant and far from automated. The distribution of points still closely resembles the nodes of finite volume-type meshes, and the free parameter, c, of the radial-basis expansion functions (RBF) still must be tailored specifically to a problem. The localized meshless collocation method investigated requires a local influence region, or topology, used as the expansion medium to produce the required field derivatives. Tests have shown that a regular Cartesian point distribution produces optimal results; therefore, in order to maintain a locally Cartesian point distribution, a recursive quadtree scheme is herein proposed. The quadtree method allows modeling of irregular geometries and refinement of regions of interest, and it lends itself to full automation, thus reducing problem setup efforts. Furthermore, the construction of the localized expansion regions is closely tied to the point distribution process and is hence incorporated into the automated sequence. This also allows for the optimization of the RBF free parameter on a local basis to achieve a desired level of accuracy in the expansion. In addition, an optimized auto-segmentation process is adopted to distribute and balance the problem loads throughout a parallel computational environment while minimizing communication requirements.
M.S.M.E.
Department of Mechanical, Materials and Aerospace Engineering;
Engineering and Computer Science
Mechanical Engineering
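For illustration, a generic recursive quadtree refinement in Python, in the spirit of the point-distribution scheme described above (a toy version with an invented refinement criterion, not the thesis's generator):

# Sketch: split a square cell into four children wherever a user-supplied
# criterion asks for more resolution; place a collocation point at each leaf centre.

def refine(x, y, size, needs_refinement, depth=0, max_depth=6):
    """Return centre points of the leaf cells of a quadtree over a square cell."""
    if depth < max_depth and needs_refinement(x, y, size):
        half = size / 2.0
        points = []
        for dx in (0.0, half):
            for dy in (0.0, half):
                points += refine(x + dx, y + dy, half, needs_refinement, depth + 1, max_depth)
        return points
    return [(x + size / 2.0, y + size / 2.0)]

# Example criterion: refine cells close to the point (0.3, 0.7)
near_feature = lambda x, y, s: (x + s / 2 - 0.3) ** 2 + (y + s / 2 - 0.7) ** 2 < (2 * s) ** 2
pts = refine(0.0, 0.0, 1.0, near_feature)
print(len(pts), "collocation points")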
29

Humr, Scott A. "Understanding return on investment for data center consolidation." Thesis, Monterey, California: Naval Postgraduate School, 2013. http://hdl.handle.net/10945/37641.

Abstract:
Approved for public release; distribution is unlimited
The federal government has mandated that agencies consolidate data centers in order to gain efficiencies and cost savings. It is a well-established fact that both public and private organizations have reported considerable cost savings from consolidating data centers; however, in the case of federal agencies, no established methodology for valuing the benefits has been delineated. Nevertheless, numerous federal policies mandate that investments in IT demonstrate a positive return on investment (ROI). The problem is that the Department of Defense does not have clear instructions on how to measure ROI in order to evaluate an opportunity to consolidate data centers. While calculating ROI for IT can be very challenging, most private and public firms have methods for demonstrating a return ratio and not only cost savings. Therefore, choosing metrics and methodologies for calculating ROI is an important step in the decision-making process. This complexity complicates estimating a data center's utility and the true value generated by merging data centers. This thesis will explore the challenges that the Marine Corps faces in calculating ROI for data center consolidation.
30

Eriksson, Martin. "Monitoring, Modelling and Identification of Data Center Servers." Thesis, Luleå tekniska universitet, Institutionen för system- och rymdteknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-69342.

Abstract:
Energy-efficient control of server rooms in modern data centers can help reduce the energy usage of this fast-growing industry. Efficient control, however, cannot be achieved without: i) continuously monitoring, in real time, the behaviour of the basic thermal nodes within these infrastructures, i.e., the servers; ii) analyzing the acquired data to model the thermal dynamics within the data center. Accurate data and accurate models are indeed instrumental for implementing efficient data center cooling strategies. In this thesis we focus on Open Compute Servers, a class of servers designed in an open-source fashion and used by big players like Facebook. We thus propose a set of appropriate methods for collecting real-time data from these platforms and a dedicated thermal model describing the thermal dynamics of the CPUs and RAMs of these servers as a function of both controllable and non-controllable inputs (e.g., the CPU utilization levels and the air mass flow of the server's fans). We also identify this model from real data and provide the results so that they can be reused by other researchers.
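As an illustration of the kind of lumped thermal model and identification step described in the abstract (a minimal sketch on synthetic data, not the thesis's model or data), one can fit a first-order discrete-time model with least squares:

# Sketch: identify T[k+1] = a*T[k] + b*util[k] + c*flow[k] + d from logged data.
# Synthetic data stands in for real server logs; coefficients are invented.
import numpy as np

rng = np.random.default_rng(1)
n = 500
util = rng.uniform(0.0, 1.0, n)        # CPU utilization (non-controllable input)
flow = rng.uniform(0.2, 1.0, n)        # fan air mass flow (controllable input)
temp = np.empty(n)
temp[0] = 40.0
for k in range(n - 1):                 # "true" system used only to generate data
    temp[k + 1] = 0.9 * temp[k] + 12.0 * util[k] - 6.0 * flow[k] + 4.0 + rng.normal(0, 0.1)

X = np.column_stack([temp[:-1], util[:-1], flow[:-1], np.ones(n - 1)])
a, b, c, d = np.linalg.lstsq(X, temp[1:], rcond=None)[0]
print(f"a={a:.2f}, b={b:.2f}, c={c:.2f}, d={d:.2f}")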
APA, Harvard, Vancouver, ISO, and other styles
31

Hassen, Fadoua. "Multistage packet-switching fabrics for data center networks." Thesis, University of Leeds, 2017. http://etheses.whiterose.ac.uk/17620/.

Full text
Abstract:
Recent applications have imposed stringent requirements on Data Center Network (DCN) switches in terms of scalability, throughput and latency. In this thesis, the architectural design of the packet-switches is tackled in different ways to enable the expansion in both the number of connected endpoints and traffic volume. A cost-effective Clos-network switch with partially buffered units is proposed and two packet scheduling algorithms are described. The first algorithm adopts many simple and distributed arbiters, while the second approach relies on a central arbiter to guarantee an ordered packet delivery. For improved scalability, the Clos switch is built using a Network-on-Chip (NoC) fabric instead of the common crossbar units. The Clos-UDN architecture, made with Input-Queued (IQ) Uni-Directional NoC modules (UDNs), simplifies the input line cards and obviates the need for the costly Virtual Output Queues (VOQs). It also avoids the need for complex and synchronized scheduling processes, and offers speedup, load balancing, and good path diversity. Under skewed traffic, reliable micro load-balancing contributes to boosting the overall network performance. Taking advantage of the NoC paradigm, a wrapped-around multistage switch with fully interconnected Central Modules (CMs) is proposed. The architecture operates with a congestion-aware routing algorithm that proactively distributes the traffic load across the switching modules, and enhances the switch performance under critical packet arrivals. The implementation of small on-chip buffers has been made perfectly feasible using current technology. This motivated the implementation of a large switching architecture with an Output-Queued (OQ) NoC fabric. The design merges assets of output queuing and NoCs to provide high throughput and smooth latency variations. An approximate analytical model of the switch performance is also proposed. To further exploit the potential of the NoC fabrics and their modularity features, a high-capacity Clos switch with Multi-Directional NoC (MDN) modules is presented. The Clos-MDN switching architecture exhibits a more compact layout than the Clos-UDN switch. It scales better and faster in port count and traffic load. Results achieved in this thesis demonstrate the high performance, expandability and programmability features of the proposed packet-switches, which make them promising candidates for the next-generation data center networking infrastructure.
APA, Harvard, Vancouver, ISO, and other styles
32

Pfeiffer, Jessica. "Datascapes: Envisioning a New Kind of Data Center." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin158399900447231.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Rudén, Philip. "FLUID SIMULATIONS FOR A AIRRECIRULATED DATA CENTER-GREENHOUSE." Thesis, Luleå tekniska universitet, Institutionen för teknikvetenskap och matematik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-85514.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Lundin, Lowe. "Artificial Intelligence for Data Center Power Consumption Optimisation." Thesis, Uppsala universitet, Avdelningen för systemteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-447627.

Full text
Abstract:
The aim of the project was to implement a machine learning model to optimise the power consumption of Ericsson’s Kista data center. The approach taken was to use a Reinforcement Learning agent trained in a simulation environment based on data specific to the data center. In this manner, the machine learning model could find interactions between parameters, both general and site-specific, in ways that a sophisticated algorithm designed by a human never could. In this work it was found that a neural network can effectively mimic a real data center and that the Reinforcement Learning policy "TD3" could, within the simulated environment, consistently and convincingly outperform the control policy currently in use at Ericsson’s Kista data center.
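For readers unfamiliar with the setup, a minimal stand-in might couple a toy simulated environment with an off-the-shelf TD3 implementation; the environment dynamics, reward and library choices (gymnasium and stable-baselines3) below are assumptions for illustration, not Ericsson's simulator:

import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import TD3   # assumes stable-baselines3 >= 2.0

class ToyCoolingEnv(gym.Env):
    """Toy stand-in for a learned data-center simulator: the agent picks a
    cooling effort, and the reward penalises power use plus overheating."""
    def __init__(self):
        self.observation_space = spaces.Box(low=0.0, high=1.0, shape=(2,), dtype=np.float32)
        self.action_space = spaces.Box(low=0.0, high=1.0, shape=(1,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.state = self.np_random.uniform(0.2, 0.8, size=2).astype(np.float32)
        return self.state, {}

    def step(self, action):
        load, temp = self.state
        cooling = float(action[0])
        temp = np.clip(temp + 0.1 * load - 0.15 * cooling, 0.0, 1.0)   # simplistic dynamics
        load = float(self.np_random.uniform(0.2, 0.8))
        self.state = np.array([load, temp], dtype=np.float32)
        reward = -(cooling + 5.0 * max(0.0, temp - 0.7))               # power + overheating penalty
        return self.state, reward, False, False, {}

model = TD3("MlpPolicy", ToyCoolingEnv(), verbose=0)
model.learn(total_timesteps=5_000)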
APA, Harvard, Vancouver, ISO, and other styles
35

Pamboris, Andreas. "LDP location discovery protocol for data center networks /." Diss., [La Jolla] : University of California, San Diego, 2009. http://wwwlib.umi.com/cr/ucsd/fullcit?p1467934.

Full text
Abstract:
Thesis (M.S.)--University of California, San Diego, 2009.
Title from first page of PDF file (viewed September 17, 2009). Available via ProQuest Digital Dissertations. Includes bibliographical references (p. 47).
APA, Harvard, Vancouver, ISO, and other styles
36

Pulice, Alessandro. "Il problema del risparmio energetico nei data center." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2011. http://amslaurea.unibo.it/2388/.

Full text
Abstract:
With the growing spread of the web and of IT services offered over the internet, the use of data centers has increased in recent years and, consequently, so has their electricity consumption. The environmental problem posed by this high energy demand leads data center operators to adopt low-consumption techniques and efficient systems. Environmental organizations estimated that in 2011 data center consumption would reach 100 million kWh, at an overall cost of 7.4 million dollars in the United States alone, with a similar projection at the global level. This thesis aims to evaluate the techniques in use for reducing energy consumption in data centers, and to determine which techniques are most widely adopted for this purpose. It begins with an overview of data centers, to explain how they work and to show the fundamental components that make them up; it then identifies the parts that weigh most heavily on consumption, and how measurements should be taken to obtain reliable values through the PUE, the metric used to assess the efficiency of a data center. From the third chapter onwards, the various existing techniques in use for addressing the energy-efficiency problem are listed, concluding with a brief analysis of the methods that the major companies in the sector have adopted to address consumption in their data centers. The purpose of this work is to understand which techniques and strategies can reduce consumption and increase the energy efficiency of data centers.
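The PUE mentioned above is simply the ratio of total facility energy to IT-equipment energy, so a value near 1.0 indicates that almost all power reaches the IT load; with hypothetical numbers:

# PUE (Power Usage Effectiveness) = total facility energy / IT equipment energy.
# Illustrative numbers only.
it_energy_kwh = 8_000_000        # servers, storage, network over a year
total_energy_kwh = 12_400_000    # IT load + cooling, power distribution, lighting
pue = total_energy_kwh / it_energy_kwh
print(f"PUE = {pue:.2f}")        # 1.55; closer to 1.0 means a more efficient facility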
APA, Harvard, Vancouver, ISO, and other styles
37

Gupta, Vishal Ph D. Massachusetts Institute of Technology. "Data-driven models for uncertainty and behavior." Thesis, Massachusetts Institute of Technology, 2014. http://hdl.handle.net/1721.1/91301.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2014.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 173-180).
The last decade has seen an explosion in the availability of data. In this thesis, we propose new techniques to leverage these data to tractably model uncertainty and behavior. Specifically, this thesis consists of three parts: In the first part, we propose a novel schema for utilizing data to design uncertainty sets for robust optimization using hypothesis testing. The approach is flexible and widely applicable, and robust optimization problems built from our new data-driven sets are computationally tractable, both theoretically and practically. Optimal solutions to these problems enjoy a strong, finite-sample probabilistic guarantee. Computational evidence from classical applications of robust optimization (queuing and portfolio management) confirms that our new data-driven sets significantly outperform traditional robust optimization techniques whenever data is available. In the second part, we examine in detail an application of the above technique to the unit commitment problem. Unit commitment is a large-scale, multistage optimization problem under uncertainty that is critical to power system operations. Using real data from the New England market, we illustrate how our proposed data-driven uncertainty sets can be used to build high-fidelity models of the demand for electricity, and that the resulting large-scale, mixed-integer adaptive optimization problems can be solved efficiently. With respect to this second contribution, we propose new data-driven solution techniques for this class of problems inspired by ideas from machine learning. Extensive historical back-testing confirms that our proposed approach generates high quality solutions that compare with state-of-the-art methods. In the third part, we focus on behavioral modeling. Utility maximization (single agent case) and equilibrium modeling (multi-agent case) are by far the most common behavioral models in operations research. By combining ideas from inverse optimization with the theory of variational inequalities, we develop an efficient, data-driven technique for estimating the primitives of these models. Our approach supports both parametric and nonparametric estimation through kernel learning. We prove that our estimators enjoy a strong generalization guarantee even when the model is misspecified. Finally, we present computational evidence from applications in economics and transportation science illustrating the effectiveness of our approach and its scalability to large-scale instances.
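As a heavily simplified stand-in for the thesis's hypothesis-test-based construction, the mechanics can be seen with a box uncertainty set built from empirical quantiles, under which a worst-case (robust) long-only allocation reduces to a linear program; the data and set construction below are illustrative assumptions:

import numpy as np
from scipy.optimize import linprog

# Simplified stand-in for a data-driven uncertainty set (the thesis constructs
# its sets from hypothesis tests; here we just use empirical quantiles).
rng = np.random.default_rng(0)
returns = rng.normal([0.05, 0.08, 0.03], 0.02, size=(500, 3))   # synthetic return data
lo = np.quantile(returns, 0.05, axis=0)      # lower edge of a box uncertainty set

# Robust long-only portfolio: maximize the worst-case return over the box.
# For w >= 0 the worst case is attained at the lower bounds, so this is an LP:
# max lo @ w  s.t.  sum(w) = 1, w >= 0  (linprog minimizes, hence the sign flip).
res = linprog(c=-lo, A_eq=np.ones((1, 3)), b_eq=[1.0], bounds=[(0, 1)] * 3)
print(res.x, -res.fun)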
by Vishal Gupta.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
38

McCord, Christopher George. "Data-driven dynamic optimization with auxiliary covariates." Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/122098.

Full text
Abstract:
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2019
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 183-190).
Optimization under uncertainty forms the foundation for many of the fundamental problems the operations research community seeks to solve. In this thesis, we develop and analyze algorithms that incorporate ideas from machine learning to optimize uncertain objectives directly from data. In the first chapter, we consider problems in which the decision affects the observed outcome, such as in personalized medicine and pricing. We present a framework for using observational data to learn to optimize an uncertain objective over a continuous and multi-dimensional decision space. Our approach accounts for the uncertainty in predictions, and we provide theoretical results that show this adds value. In addition, we test our approach on a Warfarin dosing example, and it outperforms the leading alternative methods.
In the second chapter, we develop an approach for solving dynamic optimization problems with covariates that uses machine learning to approximate the unknown stochastic process of the uncertainty. We provide theoretical guarantees on the effectiveness of our method and validate the guarantees with computational experiments. In the third chapter, we introduce a distributionally robust approach for incorporating covariates in large-scale, data-driven dynamic optimization. We prove that it is asymptotically optimal and provide a tractable general-purpose approximation scheme that scales to problems with many temporal stages. Across examples in shipment planning, inventory management, and finance, our method achieves improvements of up to 15% over alternatives. In the final chapter, we apply the techniques developed in previous chapters to the problem of optimizing the operating room schedule at a major US hospital.
Our partner institution faces significant census variability throughout the week, which limits the number of patients it can accept due to resource constraints at peak times. We introduce a data-driven approach for this problem that combines machine learning with mixed integer optimization and demonstrate that it can reliably reduce the maximal weekly census.
by Christopher George McCord.
Ph. D.
Ph.D. Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center
APA, Harvard, Vancouver, ISO, and other styles
39

Blanks, Zachary D. "A generalized hierarchical approach for data labeling." Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/122386.

Full text
Abstract:
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Thesis: S.M., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2019
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 85-90).
The goal of this thesis was to develop a data-type-agnostic classification algorithm best suited for problems where there are a large number of similar labels (e.g., classifying a port versus a shipyard). The most common approach to this issue is to simply ignore it, and attempt to fit a classifier against all targets at once (a "flat" classifier). The problem with this technique is that it tends to do poorly due to label similarity. Conversely, there are other existing approaches, known as hierarchical classifiers (HCs), which propose clustering heuristics to group the labels. However, the most common HCs require that a "flat" model be trained a priori before the label hierarchy can be learned. The primary issue with this approach is that if the initial estimator performs poorly then the resulting HC will have a similar rate of error.
To solve these challenges, we propose three new approaches which learn the label hierarchy without training a model beforehand and one which generalizes the standard HC. The first technique employs a k-means clustering heuristic which groups classes into a specified number of partitions. The second method takes the previously developed heuristic and formulates it as a mixed integer program (MIP). Employing a MIP allows the user to have greater control over the resulting label hierarchy by imposing meaningful constraints. The third approach learns meta-classes by using community detection algorithms on graphs which simplifies the hyper-parameter space when training an HC. Finally, the standard HC methodology is generalized by relaxing the requirement that the original model must be a "flat" classifier; instead, one can provide any of the HC approaches detailed previously as the initializer.
By giving the model a better starting point, the final estimator has a greater chance of yielding a lower error rate. To evaluate the performance of our methods, we tested them on a variety of data sets which contain a large number of similar labels. We observed that the k-means clustering heuristic or community detection algorithm gave statistically significant improvements in out-of-sample performance over both a flat and a standard hierarchical classifier. Consequently, our approach offers a way to overcome the problems of labeling data with similar classes.
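A hedged sketch of the k-means label-grouping idea described above (a simplification, not the thesis's code) could cluster per-class centroids into meta-classes and then train one classifier to route among groups and one per group; the estimator choices below are assumptions:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def build_meta_classes(X, y, n_groups):
    # Cluster the mean feature vector of each class into n_groups meta-classes.
    classes = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    groups = KMeans(n_clusters=n_groups, n_init=10, random_state=0).fit_predict(centroids)
    return dict(zip(classes, groups))

def train_hierarchy(X, y, n_groups):
    meta = build_meta_classes(X, y, n_groups)
    y_meta = np.array([meta[c] for c in y])
    top = LogisticRegression(max_iter=1000).fit(X, y_meta)   # routes a sample to a group
    leaves = {}
    for g in np.unique(y_meta):
        mask = y_meta == g
        if len(np.unique(y[mask])) > 1:
            leaves[g] = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])
        else:
            leaves[g] = y[mask][0]   # group contains a single class; no model needed
    return top, leaves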
by Zachary D. Blanks.
S.M.
S.M. Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center
APA, Harvard, Vancouver, ISO, and other styles
40

Ng, Yee Sian. "Advances in data-driven models for transportation." Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/122100.

Full text
Abstract:
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2019
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 163-176).
With the rising popularity of ride-sharing and alternative modes of transportation, there has been a renewed interest in transit planning to improve service quality and stem declining ridership. However, it often takes months of manual planning for operators to redesign and reschedule services in response to changing needs. To this end, we provide four models of transportation planning that are based on data and driven by optimization. A key aspect is the ability to provide certificates of optimality, while being practical in generating high-quality solutions in a short amount of time. We provide approaches to combinatorial problems in transit planning that scale up to city-sized networks. In transit network design, current tractable approaches only consider edges that exist, resulting in proposals that are closely tethered to the original network. We allow new transit links to be proposed and account for commuters transferring between different services. In integrated transit scheduling, we provide a way for transit providers to synchronize the timing of services in multimodal networks while ensuring regularity in the timetables of the individual services. This is made possible by taking the characteristics of transit demand patterns into account when designing tractable formulations. We also advance the state of the art in demand models for transportation optimization. In emergency medical services, we provide data-driven formulations that outperform their probabilistic counterparts in ensuring coverage. This is achieved by replacing independence assumptions in probabilistic models and capturing the interactions of services in overlapping regions. In transit planning, we provide a unified framework that allows us to optimize frequencies and prices jointly in transit networks for minimizing total waiting time.
by Yee Sian Ng.
Ph. D.
Ph.D. Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center
APA, Harvard, Vancouver, ISO, and other styles
41

Lindberg, Therese. "Modelling and Evaluation of Distributed Airflow Control in Data Centers." Thesis, Karlstads universitet, Institutionen för ingenjörsvetenskap och fysik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-36479.

Full text
Abstract:
In this work, a suggested method to reduce the energy consumption of the cooling system in a data center is modelled and evaluated. Different approaches to distributed airflow control are introduced, in which different amounts of airflow can be supplied in different parts of the data center (instead of an even airflow distribution). Two different kinds of distributed airflow control are compared to a traditional approach without airflow control, the difference between the two control approaches being the type of server rack used: either traditional racks or a new kind of rack with vertically placed servers. A model capable of describing the power consumption of the data center cooling system for these different approaches to airflow control was constructed. Based on the model, MATLAB simulations of three different server workload scenarios were then carried out. It was found that introducing distributed airflow control reduced the power consumption for all scenarios, and that the control approach with the new kind of rack gave the largest reduction. For this case, the power consumption of the cooling system could be reduced to 60% - 69% of the initial consumption, depending on the workload scenario. Also examined were the effects of different parameters and process variables (parameters held fixed with the help of feedback loops) on the data center, as well as optimal set point values.
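One intuition for why distributed airflow control helps, stated with the caveat that this is not the thesis's model: by the fan affinity laws, fan power grows roughly with the cube of volume flow, so supplying every rack with the peak flow is far costlier than matching flow to each rack's need. A tiny illustration with hypothetical per-rack flow requirements:

# Fan power ~ (volume flow)^3 by the fan affinity laws (illustrative comparison only).
rack_flows_needed = [0.4, 0.6, 1.0, 0.5]        # required flow per rack (normalised)

uniform = len(rack_flows_needed) * max(rack_flows_needed) ** 3   # one shared setpoint
distributed = sum(f ** 3 for f in rack_flows_needed)             # per-rack setpoints
print(f"fan power ratio (distributed / uniform): {distributed / uniform:.2f}")   # about 0.35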
APA, Harvard, Vancouver, ISO, and other styles
42

Ma, Wei (Will Wei). "Dynamic, data-driven decision-making in revenue management." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/120224.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2018.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 233-241).
Motivated by applications in Revenue Management (RM), this thesis studies various problems in sequential decision-making and demand learning. In the first module, we consider a personalized RM setting, where items with limited inventories are recommended to heterogeneous customers sequentially visiting an e-commerce platform. We take the perspective of worst-case competitive ratio analysis, and aim to develop algorithms whose performance guarantees do not depend on the customer arrival process. We provide the first solution to this problem when there are both multiple items and multiple prices at which they could be sold, framing it as a general online resource allocation problem and developing a system of forecast-independent bid prices (Chapter 2). Second, we study a related assortment planning problem faced by Walmart Online Grocery, where before checkout, customers are recommended "add-on" items that are complementary to their current shopping cart (Chapter 3). Third, we derive inventory-dependent price-skimming policies for the single-leg RM problem, which extends existing competitive ratio results to non-independent demand (Chapter 4). In this module, we test our algorithms using a publicly available data set from a major hotel chain. In the second module, we study bundling, which is the practice of selling different items together, and show how to learn and price using bundles. First, we introduce bundling as a new, alternate method for learning the price elasticities of items, which does not require any changing of prices; we validate our method on data from a large online retailer (Chapter 5). Second, we show how to sell bundles of goods profitably even when the goods have high production costs, and derive both distribution-dependent and distribution-free guarantees on the profitability (Chapter 6). In the final module, we study the Markovian multi-armed bandit problem under an undiscounted finite time horizon (Chapter 7). We improve existing approximation algorithms using LP rounding and random sampling techniques, which result in a (1/2 - ε)-approximation for the correlated stochastic knapsack problem that is tight relative to the LP. In this work, we introduce a framework for designing self-sampling algorithms, which is also used in our chronologically-later-to-appear work on add-on recommendation and single-leg RM.
by Will (Wei) Ma.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
43

Papush, Anna. "Data-driven methods for personalized product recommendation systems." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/115655.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2018.
Cataloged from PDF version of thesis.
Includes bibliographical references.
The online market has expanded tremendously over the past two decades across all industries ranging from retail to travel. This trend has resulted in the growing availability of information regarding consumer preferences and purchase behavior, sparking the development of increasingly more sophisticated product recommendation systems. Thus, a competitive edge in this rapidly growing sector could be worth up to millions of dollars in revenue for an online seller. Motivated by this increasingly prevalent problem, we propose an innovative model that selects, prices and recommends a personalized bundle of products to an online consumer. This model captures the trade-off between myopic profit maximization and inventory management, while selecting relevant products from consumer preferences. We develop two classes of approximation algorithms that run efficiently in real-time and provide analytical guarantees on their performance. We present practical applications through two case studies using: (i) point-of-sale transaction data from a large U.S. e-tailer, and (ii) ticket transaction data from a premier global airline. The results demonstrate that our approaches result in significant improvements on the order of 3-7% lifts in expected revenue over current industry practices. We then extend this model to the setting in which consumer demand is subject to uncertainty. We address this challenge using dynamic learning and then improve upon it with robust optimization. We first frame our learning model as a contextual nonlinear multi-armed bandit problem and develop an approximation algorithm to solve it in real-time. We provide analytical guarantees on the asymptotic behavior of this algorithm's regret, showing that with high probability it is on the order of O(√T). Our computational studies demonstrate this algorithm's tractability across various numbers of products, consumer features, and demand functions, and illustrate how it significantly outperforms benchmark strategies. Given that demand estimates inherently contain error, we next consider a robust optimization approach under row-wise demand uncertainty. We define the robust counterparts under both polynomial and ellipsoidal uncertainty sets. Computational analysis shows that robust optimization is critical in highly constrained inventory settings; however, the price of robustness drastically grows as a result of pricing strategies if the level of conservatism is too high.
by Anna Papush.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
44

Sturt, Bradley Eli. "Dynamic optimization in the age of big data." Thesis, Massachusetts Institute of Technology, 2020. https://hdl.handle.net/1721.1/127292.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, May, 2020
Cataloged from the official PDF of thesis.
Includes bibliographical references (pages 241-249).
This thesis revisits a fundamental class of dynamic optimization problems introduced by Dantzig (1955). These decision problems remain widely studied in many application domains (e.g., inventory management, finance, energy planning) but require access to probability distributions that are rarely known in practice. First, we propose a new data-driven approach for addressing multi-stage stochastic linear optimization problems with unknown probability distributions. The approach consists of solving a robust optimization problem that is constructed from sample paths of the underlying stochastic process. As more sample paths are obtained, we prove that the optimal cost of the robust problem converges to that of the underlying stochastic problem. To the best of our knowledge, this is the first data-driven approach for multi-stage stochastic linear optimization problems which is asymptotically optimal when uncertainty is arbitrarily correlated across time.
Next, we develop approximation algorithms for the proposed data-driven approach by extending techniques from the field of robust optimization. In particular, we present a simple approximation algorithm, based on overlapping linear decision rules, which can be reformulated as a tractable linear optimization problem with size that scales linearly in the number of data points. For two-stage problems, we show the approximation algorithm is also asymptotically optimal, meaning that the optimal cost of the approximation algorithm converges to that of the underlying stochastic problem as the number of data points tends to infinity. Finally, we extend the proposed data-driven approach to address multi-stage stochastic linear optimization problems with side information. The approach combines predictive machine learning methods (such as K-nearest neighbors, kernel regression, and random forests) with the proposed robust optimization framework.
We prove that this machine learning-based approach is asymptotically optimal, and demonstrate the value of the proposed methodology in numerical experiments in the context of inventory management, scheduling, and finance.
by Bradley Eli Sturt.
Ph. D.
Ph.D. Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center
APA, Harvard, Vancouver, ISO, and other styles
45

Deselaers, Johannes. "Deep Learning Pupil Center Localization." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-287538.

Full text
Abstract:
This project strives to achieve high-performance object localization with Convolutional Neural Networks (CNNs), in particular for pupil centers in the context of remote eye tracking systems. Three different network architectures suited to the task are developed, evaluated and compared: one based on regression using fully connected layers, one Fully Convolutional Network and one Deconvolutional Network. The best performing model achieves a mean error of only 0.52 pixels and a median error of 0.42 pixels compared to the ground truth annotations. The 95th percentile lies at 1.12 pixels of error. This exceeds the performance of current state-of-the-art pupil center detection algorithms by an order of magnitude, a result that can be credited both to the algorithm and to the dataset, which exceeds the datasets used for this purpose in prior publications in suitability, quality and size. Opportunities for further reducing the computational cost based on recent model compression research are suggested.
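A minimal sketch of the regression-style variant (convolutional features followed by fully connected layers emitting an (x, y) estimate) is shown below; the layer sizes, input resolution and use of PyTorch are assumptions for illustration and not the thesis's exact architecture:

import torch
import torch.nn as nn

# Minimal sketch of a regression-style pupil-centre network.
class PupilCenterNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
        )
        self.head = nn.Sequential(
            nn.Flatten(), nn.Linear(64 * 4 * 4, 128), nn.ReLU(), nn.Linear(128, 2),
        )

    def forward(self, x):                         # x: (batch, 1, H, W) grayscale eye crops
        return self.head(self.features(x))       # (batch, 2) -> predicted (x, y)

model = PupilCenterNet()
loss = nn.MSELoss()(model(torch.randn(8, 1, 64, 64)), torch.rand(8, 2))
loss.backward()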
APA, Harvard, Vancouver, ISO, and other styles
46

Tudoran, Radu-Marius. "High-Performance Big Data Management Across Cloud Data Centers." Electronic Thesis or Diss., Rennes, École normale supérieure, 2014. http://www.theses.fr/2014ENSR0004.

Full text
Abstract:
The easily accessible computing power offered by cloud infrastructures, coupled with the "Big Data" revolution, is increasing the scale and speed at which data analysis is performed. Cloud computing resources for compute and storage are spread across multiple data centers around the world. Enabling fast data transfers becomes especially important in scientific applications where moving the processing close to data is expensive or even impossible. The main objectives of this thesis are to analyze how clouds can become "Big Data - friendly", and what are the best options to provide data management services able to meet the needs of applications. In this thesis, we present our contributions to improve the performance of data management for applications running on several geographically distributed data centers. We start with aspects concerning the scale of data processing on a site, and continue with the development of MapReduce-type solutions allowing the distribution of calculations between several centers. Then, we present a transfer service architecture that optimizes the cost-performance ratio of transfers. This service is operated in the context of real-time data streaming between cloud data centers. Finally, we study the viability, for a cloud provider, of the solution consisting of integrating this architecture as a service based on a flexible pricing paradigm, qualified as "Transfer-as-a-Service".
APA, Harvard, Vancouver, ISO, and other styles
47

Anderson, Ross Michael. "Stochastic models and data driven simulations for healthcare operations." Thesis, Massachusetts Institute of Technology, 2014. http://hdl.handle.net/1721.1/92055.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2014.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 251-257).
This thesis considers problems in two areas of healthcare operations: Kidney Paired Donation (KPD) and scheduling medical residents in hospitals. In both areas, we explore the implications of policy change through high fidelity simulations. We then build stochastic models to provide strategic insight into how policy decisions affect the operations of these healthcare systems. KPD programs enable patients with living but incompatible donors (referred to as patient-donor pairs) to exchange kidneys with other such pairs in a centrally organized clearing house. Exchanges involving two or more pairs are performed by arranging the pairs in a cycle, where the donor from each pair gives to the patient from the next pair. Alternatively, a so-called altruistic donor can be used to initiate a chain of transplants through many pairs, ending on a patient without a willing donor. In recent years, the use of chains has become pervasive in KPD, with chains now accounting for the majority of KPD transplants performed in the United States. A major focus of our work is to understand why long chains have become the dominant method of exchange in KPD, and how to best integrate their use into exchange programs. In particular, we are interested in policies that KPD programs use to determine which exchanges to perform, which we refer to as matching policies. First, we devise a new algorithm using integer programming to maximize the number of transplants performed on a fixed pool of patients, demonstrating that matching policies which must solve this problem are implementable. Second, we evaluate the long run implications of various matching policies, both through high fidelity simulations and analytic models. Most importantly, we find that: (1) using long chains results in more transplants and reduced waiting time, and (2) the policy of maximizing the number of transplants performed each day is as good as any batching policy. Our theoretical results are based on introducing a novel model of a dynamically evolving random graph. The analysis of this model uses classical techniques from Erdos-Renyi random graph theory as well as tools from queueing theory including Lyapunov functions and Little's Law. In the second half of this thesis, we consider the problem of how hospitals should design schedules for their medical residents. These schedules must have capacity to treat all incoming patients, provide quality care, and comply with regulations restricting shift lengths. In 2011, the Accreditation Council for Graduate Medical Education (ACGME) instituted a new set of regulations on duty hours that restrict shift lengths for medical residents. We consider two operational questions for hospitals in light of these new regulations: will there be sufficient staff to admit all incoming patients, and how will the continuity of patient care be affected, particularly on the first day of a patient's hospital stay, when such continuity is critical? To address these questions, we built a discrete event simulation tool using historical data from a major academic hospital, and compared several policies relying on both long and short shifts. The simulation tool was used to inform staffing level decisions at the hospital, which was transitioning away from long shifts. Use of the tool led to the following strategic insights. We found that schedules based on shorter, more frequent shifts actually led to a larger admitting capacity.
At the same time, such schedules generally reduce the continuity of care by most metrics when the departments operate at normal loads. However, in departments that operate in the critical capacity regime, we found that even the continuity of care improved in some metrics for schedules based on shorter shifts, due to a reduction in the use of overtime doctors. We develop an analytically tractable queueing model to capture these insights. The analysis of this model requires analyzing the steady-state behavior of the fluid limit of a queueing system, and proving a so-called "interchange of limits" result.
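To make the matching-policy idea concrete with a much-reduced example (not the thesis's integer program): if only two-way exchanges are allowed, maximizing the number of transplants is a maximum-cardinality matching on a graph whose nodes are patient-donor pairs and whose edges denote mutual compatibility; the compatibility data below are hypothetical:

import networkx as nx

# Two-way exchanges only: maximum matching on the pair-compatibility graph.
G = nx.Graph()
G.add_nodes_from(["P1", "P2", "P3", "P4"])
G.add_edges_from([("P1", "P2"), ("P2", "P3"), ("P3", "P4")])   # hypothetical mutual compatibility

matching = nx.max_weight_matching(G, maxcardinality=True)
print(matching)   # e.g. {("P1", "P2"), ("P3", "P4")} -> 4 transplants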
by Ross Michael Anderson.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
48

Harris, William Ray. "Anomaly detection methods for unmanned underwater vehicle performance data." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/98718.

Full text
Abstract:
Thesis: S.M., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2015.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 101-102).
This thesis considers the problem of detecting anomalies in performance data for unmanned underwater vehicles (UUVs). UUVs collect a tremendous amount of data, which operators are required to analyze between missions to determine if vehicle systems are functioning properly. Operators are typically under heavy time constraints when performing this data analysis. The goal of this research is to provide operators with a post-mission data analysis tool that automatically identifies anomalous features of performance data. Such anomalies are of interest because they are often the result of an abnormal condition that may prevent the vehicle from performing its programmed mission. In this thesis, we consider existing one-class classification anomaly detection techniques, since labeled training data from the anomalous class is not readily available. Specifically, we focus on two anomaly detection techniques: (1) Kernel Density Estimation (KDE) Anomaly Detection and (2) Local Outlier Factor. Results are presented for selected UUV systems and data features, and initial findings provide insight into the effectiveness of these algorithms. Lastly, we explore ways to extend our KDE anomaly detection algorithm for various tasks, such as finding anomalies in discrete data and identifying anomalous trends in time-series data.
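Both techniques named in the abstract are available off the shelf; a hedged sketch on synthetic data (the feature dimensions, thresholds and parameters below are assumptions, not the thesis's settings) might look like:

import numpy as np
from sklearn.neighbors import KernelDensity, LocalOutlierFactor

# Minimal sketch of the two one-class techniques named above, on synthetic
# "vehicle telemetry" (the thesis applies them to real UUV performance data).
rng = np.random.default_rng(1)
train = rng.normal(0.0, 1.0, size=(1000, 3))                        # nominal mission data
test = np.vstack([rng.normal(0, 1, (50, 3)), [[6.0, 6.0, 6.0]]])    # last row anomalous

# (1) KDE: flag points whose estimated log-density falls below a threshold.
kde = KernelDensity(bandwidth=0.5).fit(train)
threshold = np.quantile(kde.score_samples(train), 0.01)
kde_flags = kde.score_samples(test) < threshold

# (2) Local Outlier Factor in "novelty" mode: fit on nominal data, score new data.
lof = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(train)
lof_flags = lof.predict(test) == -1                                 # -1 marks predicted outliers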
by William Ray Harris.
S.M.
APA, Harvard, Vancouver, ISO, and other styles
49

Uichanco, Joline Ann Villaranda. "Data-driven optimization and analytics for operations management applications." Thesis, Massachusetts Institute of Technology, 2013. http://hdl.handle.net/1721.1/85695.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2013.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 163-166).
In this thesis, we study data-driven decision making in operations management contexts, with a focus on both theoretical and practical aspects. The first part of the thesis analyzes the well-known newsvendor model but under the assumption that, even though demand is stochastic, its probability distribution is not part of the input. Instead, the only information available is a set of independent samples drawn from the demand distribution. We analyze the well-known sample average approximation (SAA) approach, and obtain new tight analytical bounds on the accuracy of the SAA solution. Unlike previous work, these bounds match the empirical performance of SAA observed in extensive computational experiments. Our analysis reveals that a distribution's weighted mean spread (WMS) impacts SAA accuracy. Furthermore, we are able to derive a distribution-parameter-free bound on SAA accuracy for log-concave distributions through an innovative optimization-based analysis which minimizes WMS over the distribution family. In the second part of the thesis, we use spread information to introduce new families of demand distributions under the minimax regret framework. We propose order policies that require only a distribution's mean and spread information. These policies have several attractive properties. First, they take the form of simple closed-form expressions. Second, we can quantify an upper bound on the resulting regret. Third, under an environment of high profit margins, they are provably near-optimal under mild technical assumptions on the failure rate of the demand distribution. And finally, the information that they require is easy to estimate with data. We show in extensive numerical simulations that when profit margins are high, even if the information in our policy is estimated from (sometimes few) samples, they often manage to capture at least 99% of the optimal expected profit. The third part of the thesis describes both applied and analytical work in collaboration with a large multi-state gas utility. We address a major operational resource allocation problem in which some of the jobs are scheduled and known in advance, and some are unpredictable and have to be addressed as they appear. We employ a novel decomposition approach that solves the problem in two phases. The first is a job scheduling phase, where regular jobs are scheduled over a time horizon. The second is a crew assignment phase, which assigns jobs to maintenance crews under a stochastic number of future emergencies. We propose heuristics for both phases using linear programming relaxation and list scheduling. Using our models, we develop a decision support tool for the utility, which is currently being piloted in one of the company's sites. Based on the utility's data, we project that the tool will result in a 55% reduction in overtime hours.
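For context, the SAA solution analyzed in the first part is simply the empirical quantile of the demand samples at the newsvendor critical ratio; a toy example with hypothetical prices and synthetic demand:

import numpy as np

# SAA newsvendor: order the empirical quantile of the demand samples at the
# critical ratio (illustrative numbers, not from the thesis).
price, cost = 10.0, 6.0
critical_ratio = (price - cost) / price          # underage / (underage + overage)
demand_samples = np.random.default_rng(0).gamma(shape=4.0, scale=25.0, size=200)
q_saa = np.quantile(demand_samples, critical_ratio)
print(f"order quantity from SAA: {q_saa:.1f}")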
by Joline Ann Villaranda Uichanco.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
50

Menjoge, Rajiv (Rajiv Shailendra). "New procedures for visualizing data and diagnosing regression models." Thesis, Massachusetts Institute of Technology, 2010. http://hdl.handle.net/1721.1/61190.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2010.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 97-103).
This thesis presents new methods for exploring data using visualization techniques. The first part of the thesis develops a procedure for visualizing the sampling variability of a plot. The motivation behind this development is that reporting a single plot of a sample of data without a description of its sampling variability can be uninformative and misleading in the same way that reporting a sample mean without a confidence interval can be. Next, the thesis develops a method for simplifying large scatter plot matrices, using techniques similar to those of the above procedure. The second part of the thesis introduces a new diagnostic method for regression called backward selection search. Backward selection search identifies a relevant feature set and a set of influential observations with good accuracy, given the difficulty of the problem, and additionally provides a description, in the form of a set of plots, of how the regression inferences would be affected by other model choices that are close to optimal. This description is useful because an observation that one analyst identifies as an outlier could be identified as the most important observation in the data set by another analyst. The key idea behind backward selection search has implications for methodology improvements beyond the realm of visualization. This is described following the presentation of backward selection search. Real and simulated examples, provided throughout the thesis, demonstrate that the methods developed in the first part of the thesis will improve the effectiveness and validity of data visualization, while the methods developed in the second half of the thesis will improve analysts' abilities to select robust models.
by Rajiv Menjoge.
Ph.D.
APA, Harvard, Vancouver, ISO, and other styles
