To see other types of publications on this topic, follow the link: Boosting window algorithm.

Journal articles on the topic "Boosting window algorithm"

Format your source in the APA, MLA, Chicago, Harvard, and other styles

Consult the top 49 journal articles for your research on the topic "Boosting window algorithm."

Next to every source in the list of references, there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scientific publication in .pdf format and read its online abstract whenever these are available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Deng, Shangkun, Chenguang Wang, Jie Li, Haoran Yu, Hongyu Tian, Yu Zhang, Yong Cui, Fangjie Ma, and Tianxiang Yang. "Identification of Insider Trading Using Extreme Gradient Boosting and Multi-Objective Optimization." Information 10, no. 12 (November 25, 2019): 367. http://dx.doi.org/10.3390/info10120367.

Abstract:
Illegal insider trading identification is a challenging task that attracts great interest from researchers because of the serious harm insider trading does to investors' confidence and to the sustainable development of securities markets. In this study, we propose an identification approach that integrates XGBoost (eXtreme Gradient Boosting) and NSGA-II (Non-dominated Sorting Genetic Algorithm II) for insider trading regulation. First, the insider trading cases that occurred in the Chinese securities market were automatically retrieved, and their relevant indicators were calculated. Then, the proposed method trained the XGBoost model, employing NSGA-II to optimize the parameters of XGBoost under multiple objective functions. Finally, the testing samples were identified using the XGBoost model with optimized parameters. Performance was empirically measured by both identification accuracy and efficiency over multiple time window lengths. The experiments showed that the proposed approach achieved the best accuracy with a 90-day time window, demonstrating that features calculated within a 90-day window can be extremely beneficial for insider trading regulation. Additionally, the proposed approach outperformed all benchmark methods in both identification accuracy and efficiency, indicating that it could serve as an alternative approach for insider trading regulation in the Chinese securities market. The proposed approach and results are of great significance for market regulators seeking to improve the efficiency and accuracy of their supervision of illegal insider trading.
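
As a rough illustration of the tuning strategy described in this abstract, the sketch below scores XGBoost parameter settings on two competing objectives (cross-validated accuracy and training time) and keeps the Pareto-optimal settings; a full NSGA-II implementation would replace the exhaustive scan, and the data names and parameter grid are hypothetical.

```python
import itertools
import time

from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

def pareto_front(results):
    """Keep settings whose (accuracy, -runtime) pair is not dominated."""
    return [p for p in results
            if not any(q[0] >= p[0] and q[1] >= p[1] and q[:2] != p[:2]
                       for q in results)]

def tune(X, y):
    """Score a small XGBoost grid on two competing objectives."""
    results = []
    for depth, eta in itertools.product([3, 5, 7], [0.05, 0.1, 0.3]):
        model = XGBClassifier(max_depth=depth, learning_rate=eta,
                              n_estimators=200, eval_metric="logloss")
        t0 = time.perf_counter()
        acc = cross_val_score(model, X, y, cv=5).mean()
        results.append((acc, -(time.perf_counter() - t0),
                        {"max_depth": depth, "learning_rate": eta}))
    return pareto_front(results)
```
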
2

Li, Yongfeng, Hang Shu, Jérôme Bindelle, Beibei Xu, Wenju Zhang, Zhongming Jin, Leifeng Guo, and Wensheng Wang. "Classification and Analysis of Multiple Cattle Unitary Behaviors and Movements Based on Machine Learning Methods." Animals 12, no. 9 (April 20, 2022): 1060. http://dx.doi.org/10.3390/ani12091060.

Abstract:
The behavior of livestock on farms is a primary indicator of animal welfare, health condition, and social interaction. The objective of this study was to propose a framework, based on inertial measurement unit (IMU) data from 10 dairy cows, to classify unitary behaviors such as feeding, standing, lying, ruminating-standing, ruminating-lying, and walking, and to identify movements during these unitary behaviors. Classification performance was investigated for three machine learning algorithms (K-nearest neighbors (KNN), random forest (RF), and extreme gradient boosting (XGBoost)) over four time windows (5, 10, 30, and 60 s). Furthermore, feed tossing, rolling biting, and chewing within the correctly classified feeding segments were analyzed by the magnitude of the acceleration. The results revealed that XGBoost performed best in the 60 s time window, with an average F1 score of 94% over the six unitary behavior classes. The F1 scores for the movements were 78% (feed tossing), 87% (rolling biting), and 87% (chewing). This framework offers the possibility of exploring more detailed movements based on the unitary behavior classification.
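
A minimal sketch of the fixed-window feature extraction such a pipeline typically uses, assuming a hypothetical 25 Hz IMU stream with accelerometer columns ax, ay, az and a per-sample behavior label:

```python
import pandas as pd
from xgboost import XGBClassifier

FS = 25  # assumed IMU sampling rate, Hz

def window_features(imu: pd.DataFrame, win_s: int = 60) -> pd.DataFrame:
    """Summarize each non-overlapping window of the accelerometer columns."""
    n = FS * win_s
    rows = []
    for start in range(0, len(imu) - n + 1, n):
        w = imu.iloc[start:start + n]
        feats = {f"{ax}_{stat}": getattr(w[ax], stat)()
                 for ax in ("ax", "ay", "az")
                 for stat in ("mean", "std", "min", "max")}
        feats["label"] = w["behavior"].mode().iloc[0]  # majority label
        rows.append(feats)
    return pd.DataFrame(rows)

# feats = window_features(imu_df, win_s=60)
# clf = XGBClassifier().fit(feats.drop(columns="label"), feats["label"])
```
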
3

Song, Moon Kyou, and Md Mostafa Kamal Sarker. "Modeling and Implementing Two-Stage AdaBoost for Real-Time Vehicle License Plate Detection." Journal of Applied Mathematics 2014 (2014): 1–8. http://dx.doi.org/10.1155/2014/697658.

Abstract:
License plate (LP) detection is the most important part of an automatic LP recognition system. Over the years, different methods, techniques, and algorithms have been developed for LP detection (LPD) systems. This paper proposes the automatic detection of car LPs via image processing techniques based on classifiers, i.e., machine learning algorithms. We propose a real-time and robust method for LPD systems that uses a two-stage adaptive boosting (AdaBoost) algorithm combined with different image-preprocessing techniques. Haar-like features are used to compute and select features from LP images. The AdaBoost algorithm classifies each part of an image within a search window, using a trained strong classifier, as either LP or non-LP. Adaptive thresholding is applied as a preprocessing step to those images whose quality is insufficient for LPD. The method is faster and more accurate than most existing LPD methods. Experimental results demonstrate an average LPD rate of 98.38% and a computational time of approximately 49 ms.
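
The search-window classification step can be sketched as follows; the feature extractor and the trained classifier are placeholders, and the window size, stride, and label convention are assumptions:

```python
# Sketch: scan an image with a fixed-size search window and classify each
# patch as LP / non-LP. `clf` is a trained classifier (e.g., an AdaBoost
# model fitted on Haar-like features) and `extract` computes those features.
def scan_for_plates(image, clf, extract, win=(24, 72), stride=8):
    h, w = win
    hits = []
    for y in range(0, image.shape[0] - h + 1, stride):
        for x in range(0, image.shape[1] - w + 1, stride):
            patch = image[y:y + h, x:x + w]
            if clf.predict(extract(patch).reshape(1, -1))[0] == 1:
                hits.append((x, y))  # top-left corners flagged as LP
    return hits
```
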
4

WANG, CHI-CHEN RAXLE, JIN-YI WU, and JENN-JIER JAMES LIEN. "PEDESTRIAN DETECTION SYSTEM USING CASCADED BOOSTING WITH INVARIANCE OF ORIENTED GRADIENTS." International Journal of Pattern Recognition and Artificial Intelligence 23, no. 04 (June 2009): 801–23. http://dx.doi.org/10.1142/s0218001409007363.

Abstract:
This study presents a novel learning-based pedestrian detection system capable of automatically detecting individuals of different sizes and orientations against a wide variety of backgrounds, including crowds, even when the individual is partially occluded. To render the detection performance robust toward the effects of geometric and rotational variations in the original image, the feature extraction process is performed using both rectangular- and circular-type blocks of various sizes and aspect ratios. The extracted blocks are rotated in accordance with their dominant orientation(s) such that all the blocks extracted from the input images are rotationally invariant. The pixels within the cells in each block are then voted into rectangular- and circular-type 9-bin histograms of oriented gradients (HOGs) in accordance with their gradient magnitudes and corresponding multivariate Gaussian-weighted windows. Finally, four cell-based histograms are concatenated using a tri-linear interpolation technique to form one 36-dimensional normalized HOG feature vector for each block. The experimental results show that the use of the Gaussian-weighted window approach and tri-linear interpolation technique in constructing the HOG feature vectors improves the detection performance from 91% to 94.5%. In the proposed scheme, the detection process is performed using a cascaded detector structure in which the weak classifiers and corresponding weights of each stage are established using the AdaBoost self-learning algorithm. The experimental results reveal that the cascaded structure not only provides a better detection performance than many of the schemes presented in the literature, but also achieves a significant reduction in the computational time required to classify each input image.
5

Chen, Zhiwei, Wei Zheng, Wenjie Yin, Xiaoping Li, Gangqiang Zhang, and Jing Zhang. "Improving the Spatial Resolution of GRACE-Derived Terrestrial Water Storage Changes in Small Areas Using the Machine Learning Spatial Downscaling Method." Remote Sensing 13, no. 23 (November 24, 2021): 4760. http://dx.doi.org/10.3390/rs13234760.

Abstract:
Gravity Recovery and Climate Experiment (GRACE) satellites can effectively monitor terrestrial water storage (TWS) changes over large areas. However, owing to the coarse resolution of GRACE products, many deficiencies must still be considered when investigating TWS changes in small areas, making it necessary to downscale the coarse-resolution GRACE products. To solve this problem, the present study first employs modeling windows of different sizes (window size, WS) combined with multiple machine learning algorithms to develop a new machine learning spatial downscaling method (MLSDM) in the spatial dimension. Second, the MLSDM is used to improve the spatial resolution of GRACE observations from 0.5° to 0.25°, and it is applied to Guantao County. The study verified the downscaling accuracy of models built with the WS3, WS5, WS7, and WS9 windows, each combined with the Random Forest (RF), Extra Tree Regressor (ETR), Adaptive Boosting Regressor (ABR), and Gradient Boosting Regressor (GBR) algorithms. The analysis shows that the accuracy of each combined model improves after the residuals are added to the high-resolution downscaled results. In each modeling window, the accuracy of RF is better than that of ETR, ABR, and GBR. Additionally, a comparison of the TWS time-series changes derived by the models before and after downscaling indicates that the downscaling accuracy of WS5 is slightly superior to that of WS3, WS7, and WS9. Third, the spatial resolution of the GRACE data was increased from 0.5° to 0.05° by integrating the WS5 window with the RF algorithm. The results are as follows: (1) The TWS (GWS) changes before and after downscaling are consistent, decreasing at −20.86 mm/yr and −21.79 mm/yr (−14.53 mm/yr and −15.46 mm/yr), respectively, and the Nash–Sutcliffe efficiency coefficient (NSE) and correlation coefficient (CC) values of both are above 0.99 (0.98). (2) The CC between 80% of the deep groundwater well data and the downscaled GWS changes is above 0.70. Overall, the MLSDM can not only effectively improve the spatial resolution of GRACE products but also preserve the spatial distribution of the original signal, providing a reference scheme for research focusing on the downscaling of GRACE products.
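
A minimal sketch of the modeling-window idea, assuming a hypothetical coarse TWS grid: each WS × WS neighborhood is flattened into one feature row for a regressor such as RF.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def window_samples(coarse, ws=5):
    """Flatten each ws x ws neighborhood of a coarse grid into a feature row."""
    r = ws // 2
    X, centers = [], []
    for i in range(r, coarse.shape[0] - r):
        for j in range(r, coarse.shape[1] - r):
            X.append(coarse[i - r:i + r + 1, j - r:j + r + 1].ravel())
            centers.append((i, j))
    return np.array(X), centers

# X, centers = window_samples(tws_05deg, ws=5)          # hypothetical grid
# rf = RandomForestRegressor().fit(X, fine_targets)     # hypothetical targets
```
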
6

Kim, Yeongmin, Minsu Chae, Namjun Cho, Hyowook Gil, and Hwamin Lee. "Machine Learning-Based Prediction Models of Acute Respiratory Failure in Patients with Acute Pesticide Poisoning." Mathematics 10, no. 24 (December 7, 2022): 4633. http://dx.doi.org/10.3390/math10244633.

Abstract:
The prognosis of patients with acute pesticide poisoning depends on their acute respiratory condition. Here, we propose machine learning models to predict acute respiratory failure in patients with acute pesticide poisoning using a decision tree, logistic regression, random forest, support vector machine, adaptive boosting, gradient boosting, multi-layer boosting, a recurrent neural network, long short-term memory, and a gated recurrent unit. We collected the medical records of patients with acute pesticide poisoning at Soonchunhyang University Cheonan Hospital from 1 January 2016 to 31 December 2020. We applied the k-Nearest Neighbor imputer algorithm, the MissForest imputer, and the average imputation method to handle missing values and outliers in the electronic medical records, and we used min–max scaling for feature scaling. Using the most recent medical research, p-values, tree-based feature selection, and recursive feature elimination, we selected 17 of 81 features. We applied a sliding window of 3 h to every patient's medical record within 24 h. As the prevalence of acute respiratory failure in our dataset was 8%, we employed oversampling. We assessed the performance of our models in predicting acute respiratory failure. The proposed long short-term memory model demonstrated a positive predictive value of 98.42%, a sensitivity of 97.91%, and an F1 score of 0.9816.
7

Wei, Guanhao, Li Zhou, Lynn L. Lu, and Robert G. Steen. "Advanced tumor progression detection by leveraging EHR based convolutional neural network boosting approaches." Journal of Clinical Oncology 39, no. 15_suppl (May 20, 2021): e13581-e13581. http://dx.doi.org/10.1200/jco.2021.39.15_suppl.e13581.

Abstract:
e13581 Background: Medical research in the areas of immunotherapies and targeted therapies has been drawing heavy attention and investment as the number of approved therapies continues to rise. Advanced findings on cancer treatment based on rich healthcare records help doctors understand patients' characteristics. The more accurately a mathematical model can predict the likelihood of disease appearance, progression, or treatment initiation, the more proactive and effective a targeting solution can be. Methods: We create positive/negative labels for patients according to whether they start to receive treatment within a time-sensitive window. Based on EHR records of Procedures, Prescriptions, and Diagnoses (PPD), the interrelationships among a patient's features can be determined by a conditional probability matrix. The PPD data are mapped to well-designed patient-image profiles using a modified genetic algorithm to optimize and measure closeness. A convolutional neural network (CNN) model is then used to extract characteristics from the patients' feature images and learn the local patterns. The CNN model is trained together with other upgraded AI/ML models to enhance overall prediction precision. Results: 144 different aggregated PPD features for patients with Chronic Lymphocytic Leukemia (CLL) were selected from the databases. Overall, the proportion of positive patients, who received treatment due to disease progression, is around 3% among the 60K cases in each cohort, an extremely unbalanced dataset that poses a challenging task for model training. In practice, we care about the model's precision at k, the percentage of truly identified patients among the k highest prediction scores. Compared to common machine learning models such as Random Forest, XGBoost, and CatBoost, our proposed CNN model boosts the ensemble of baseline model performance in terms of average prediction precision at 1000 from 14.9% to 17.1%, a relative increase of about 15%. Using this approach, many more truly identified patients could potentially receive targeted treatment on time. Patients' top features and key feature interactions can also be identified as important references. Conclusions: The novel CNN boosting algorithm considers aggregated PPD feature pairs using the graphical structure, significantly improves prediction model performance, and increases model interpretability. In summary, the proposed model can help identify more potential patient candidates and determine precise treatment options.
8

Xiang, Tao, Tao Li, Mao Ye, and Zijian Liu. "Random Forest with Adaptive Local Template for Pedestrian Detection." Mathematical Problems in Engineering 2015 (2015): 1–11. http://dx.doi.org/10.1155/2015/767423.

Abstract:
Pedestrian detection with large intraclass variations remains a challenging task in computer vision. In this paper, we propose a novel pedestrian detection method based on Random Forest. First, we generate a few local templates with different sizes and locations from positive exemplars. Then, a Random Forest is built whose splitting functions are optimized by maximizing the class purity of matching the local templates to the training samples. To improve classification accuracy, we adopt a boosting-like algorithm that updates the weights of the training samples in a layer-wise fashion. During detection, the trained Random Forest votes on the category of each input sliding window. Our contributions are splitting functions based on local template matching with adaptive size and location, and an iterative weight-updating method. We evaluate the proposed method on two well-known challenging datasets: TUD pedestrians and INRIA pedestrians. The experimental results demonstrate that our method achieves state-of-the-art or competitive performance.
9

Li, Lihua, Mengzui Di, Hao Xue, Zixuan Zhou, and Ziqi Wang. "Feature Selection Model Based on IWOA for Behavior Identification of Chicken." Sensors 22, no. 16 (August 17, 2022): 6147. http://dx.doi.org/10.3390/s22166147.

Abstract:
To reduce the influence of redundant features on model performance in accelerometer-based behavior recognition, and to improve recognition accuracy, this paper proposes an improved Whale Optimization algorithm with a mixed strategy (IWOA), combined with the extreme gradient boosting algorithm (XGBoost), as a feature-selection method for chicken behavior identification. A nine-axis inertial sensor was used to obtain the chicken behavior data. After noise reduction, a sliding window was used to extract 44 features in the time and frequency domains. To improve the ability of the Whale Optimization algorithm to search for optimal solutions, a good point set is introduced to improve population diversity and expand the search range; adaptive weights are introduced to balance the search ability in the early and late stages; and dimension-by-dimension lens-imaging learning based on the adaptive weight factor is introduced to perturb the optimal solution and enhance the ability to jump out of local optima. The method's effectiveness was verified by recognizing the feeding and drinking behaviors of caged breeders. The results show that the number of feature dimensions is reduced by 72.73%, while the behavior recognition accuracy increases by 2.41% relative to the original behavior feature dataset, reaching 95.58%. Compared with other dimensionality-reduction methods, the IWOA–XGBoost model proposed in this paper has the highest recognition accuracy, and the dimension-reduction results generalize to a certain degree across different classification algorithms. This provides a method for behavior recognition based on acceleration sensor data.
10

Dave, Pritul, Arjun Chandarana, Parth Goel, and Amit Ganatra. "An amalgamation of YOLOv4 and XGBoost for next-gen smart traffic management system." PeerJ Computer Science 7 (June 18, 2021): e586. http://dx.doi.org/10.7717/peerj-cs.586.

Abstract:
Traffic congestion and the rise in the number of vehicles have become a grievous issue worldwide. One problem with traffic management is that traffic-light timers are not dynamic; as a result, vehicles must wait even when there are few or no vehicles on a crossing roadway, causing unnecessary waiting time, fuel consumption, and pollution. Prior work on smart traffic management systems draws on the Internet of Things, time-series forecasting, and digital image processing. Computer-vision-based smart traffic management is an emerging area of research. We therefore present a real-time traffic-light optimization algorithm that uses machine learning and deep learning techniques to predict the optimal time required by the vehicles to clear the lane. This article adopts a two-step approach. The first step obtains counts for each category of vehicle using the You Only Look Once version 4 (YOLOv4) object detection technique. In the second step, an ensemble technique, eXtreme Gradient Boosting (XGBoost), predicts the optimal duration of the green-light window. Furthermore, different versions of YOLO and different prediction algorithms are compared with the proposed approach. The experimental analysis shows that YOLOv4 with the XGBoost algorithm produces the most precise outcomes, balancing accuracy and inference time. The proposed approach reduces waiting time by an average of 32.3% under usual road traffic.
11

Purde, Vedud, Elena Kudryashova, David B. Heisler, Reena Shakya, and Dmitri S. Kudryashov. "Intein-mediated cytoplasmic reconstitution of a split toxin enables selective cell ablation in mixed populations and tumor xenografts." Proceedings of the National Academy of Sciences 117, no. 36 (August 24, 2020): 22090–100. http://dx.doi.org/10.1073/pnas.2006603117.

Abstract:
The application of proteinaceous toxins for cell ablation is limited by their high on- and off-target toxicity, severe side effects, and a narrow therapeutic window. The selectivity of targeting can be improved by intein-based toxin reconstitution from two dysfunctional fragments, provided that they are delivered to the cytoplasm via independent, selective pathways. While the reconstitution of proteins from genetically encoded elements has been explored, exploiting cell-surface receptors for boosting selectivity has not been attained. We designed a robust splitting algorithm and achieved reliable cytoplasmic reconstitution of functional diphtheria toxin from engineered intein-flanked fragments upon receptor-mediated delivery of one fragment to cells expressing the counterpart. Retargeting the delivery machinery toward different receptors overexpressed in cancer cells enables the selective ablation of specific subpopulations in mixed cell cultures. In a mouse model, the transmembrane delivery of a split-toxin construct potently inhibits the growth of xenograft tumors expressing the split counterpart. Receptor-mediated delivery of engineered split proteins provides a platform for the precise therapeutic and experimental ablation of tumors or desired cell populations, while also greatly expanding the applicability of intein-based protein trans-splicing.
12

Shang, Wentian, Lijun Deng, and Jian Liu. "A Novel Air-Door Opening and Closing Identification Algorithm Using a Single Wind-Velocity Sensor." Sensors 22, no. 18 (September 9, 2022): 6837. http://dx.doi.org/10.3390/s22186837.

Abstract:
The air-door is an important device for adjusting the airflow in a mine. It opens and closes within a short time owing to transportation and other factors. Although a switching sensor alone can identify air-door opening and closing, it cannot relate these events to abnormal fluctuations in wind speed, and large fluctuations in the wind-velocity sensor data during this time can lead to false alarms. To overcome this problem, we propose a method for identifying air-door opening and closing using a single wind-velocity sensor. A multi-scale sliding window (MSSW) is employed to divide the samples. Then, global and fluctuation features of the data are extracted using statistics and the discrete wavelet transform (DWT), and a machine learning model is adopted to classify each sample. The identification results are then refined by merging the classification results using the non-maximum suppression method. Finally, considering the safety accidents caused by air-door opening and closing in an actual production mine, a large number of experiments were carried out to verify the effectiveness of the algorithm using a simulated tunnel model. The results show that the proposed algorithm performs best when the gradient boosting decision tree (GBDT) is selected for classification. On the data set composed of air-door opening and closing experimental data, the accuracy, precision, and recall of air-door opening and closing identification are 91.89%, 93.07%, and 91.07%, respectively. On the data set that also includes other mine production activities, the accuracy, precision, and recall are 89.61%, 90.31%, and 88.39%, respectively.
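
A minimal sketch of the feature extraction described above, combining global statistics with DWT detail energies for one wind-velocity window; the wavelet choice, decomposition level, and window handling are assumptions:

```python
import numpy as np
import pywt
from sklearn.ensemble import GradientBoostingClassifier

def features(window: np.ndarray) -> np.ndarray:
    """Global stats plus the energy of each DWT detail level for one window."""
    feats = [window.mean(), window.std(), window.min(), window.max()]
    coeffs = pywt.wavedec(window, "db4", level=3)
    feats += [float(np.sum(c ** 2)) for c in coeffs[1:]]  # detail energies
    return np.array(feats)

# X = np.vstack([features(w) for w in windows])   # windows: hypothetical list
# clf = GradientBoostingClassifier().fit(X, labels)
```
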
13

Ding, Luyu, Yang Lv, Ruixiang Jiang, Wenjie Zhao, Qifeng Li, Baozhu Yang, Ligen Yu, Weihong Ma, Ronghua Gao, and Qinyang Yu. "Predicting the Feed Intake of Cattle Based on Jaw Movement Using a Triaxial Accelerometer." Agriculture 12, no. 7 (June 21, 2022): 899. http://dx.doi.org/10.3390/agriculture12070899.

Abstract:
The use of an accelerometer is considered a promising method for the automatic measurement of the feeding behavior or feed intake of cattle, with great significance for facilitating daily management. To meet the further needs of commercial use, an efficient classification algorithm at a low sampling frequency is needed to reduce the amount of recorded data and increase the battery life of the monitoring device, and a high-precision model must be developed to predict feed intake on the basis of feeding behavior. Accelerograms of the jaw movement and feed intake of 13 mid-lactation cows were collected during feeding at a sampling frequency of 1 Hz at three different positions: the nasolabial levator muscle (P1), the right masseter muscle (P2), and the left lower lip muscle (P3). A behavior identification framework was developed to recognize jaw movements, including ingesting, chewing, and ingesting–chewing, through extreme gradient boosting (XGB) integrated with a hidden Markov model solved by the Viterbi algorithm (HMM–Viterbi). Fourteen machine learning models were established and compared for predicting the feed intake rate from the accelerometer signals of the recognized jaw movement activities. The developed framework could effectively recognize different jaw movement activities with a precision of 99% at a window size of 10 s. The measured feed intake rate was 190 ± 89 g/min and could be predicted efficiently using the extra trees regressor (ETR), whose R2, RMSE, and NME were 0.97, 0.36, and 0.05, respectively. The three investigated monitoring sites may have affected the accuracy of feed intake prediction, but not behavior identification. P1 is recommended as the proper monitoring site, and the results of this study provide a reference for the further development of a wearable device equipped with accelerometers to measure feeding behavior and predict feed intake.
14

Limingoja, Leevi, Kari Antila, Vesa Jormanainen, Joel Röntynen, Vilma Jägerroos, Leena Soininen, Hanna Nordlund, Kristian Vepsäläinen, Risto Kaikkonen, and Tea Lallukka. "Impact of a Conformité Européenne (CE) Certification–Marked Medical Software Sensor on COVID-19 Pandemic Progression Prediction: Register-Based Study Using Machine Learning Methods." JMIR Formative Research 6, no. 3 (March 17, 2022): e35181. http://dx.doi.org/10.2196/35181.

Abstract:
Background To address the current COVID-19 and any future pandemic, we need robust, real-time, and population-scale collection and analysis of data. Rapid and comprehensive knowledge on the trends in reported symptoms in populations provides an earlier window into the progression of viral spread, and helps to predict the needs and timing of professional health care. Objective The objective of this study was to use a Conformité Européenne (CE)-marked medical online symptom checker service, Omaolo, and validate the data against the national demand for COVID-19–related care to predict the pandemic progression in Finland. Methods Our data comprised real-time Omaolo COVID-19 symptom checker responses (414,477 in total) and daily admission counts in nationwide inpatient and outpatient registers provided by the Finnish Institute for Health and Welfare from March 16 to June 15, 2020 (the first wave of the pandemic in Finland). The symptom checker responses provide self-triage information input to a medically qualified algorithm that produces a personalized probability of having COVID-19, and provides graded recommendations for further actions. We trained linear regression and extreme gradient boosting (XGBoost) models together with F-score and mutual information feature preselectors to predict the admissions once a week, 1 week in advance. Results Our models reached a mean absolute percentage error between 24.2% and 36.4% in predicting the national daily patient admissions. The best result was achieved by combining both Omaolo and historical patient admission counts. Our best predictor was linear regression with mutual information as the feature preselector. Conclusions Accurate short-term predictions of COVID-19 patient admissions can be made, and both symptom check questionnaires and daily admissions data contribute to the accuracy of the predictions. Thus, symptom checkers can be used to estimate the progression of the pandemic, which can be considered when predicting the health care burden in a future pandemic.
15

Akinje, Ayorinde O., and Abdulgalee Fuad. "Fraudulent Detection Model Using Machine Learning Techniques for Unstructured Supplementary Service Data." International Journal of Innovative Computing 11, no. 2 (October 31, 2021): 51–60. http://dx.doi.org/10.11113/ijic.v11n2.299.

Abstract:
The increase in mobile phone accessibility and technological advancement in almost every corner of the world has shaped how banks offer financial services. Such services were extended to low-end customers without smartphones through Alternative Banking Channels (ABCs), which render the same regular financial services as those available on smartphones. One of these ABC services is Unstructured Supplementary Service Data (USSD), a two-way communication between mobile phones and applications that is used to render financial services from the linked bank accounts. Fraudsters have taken advantage of innocent customers on this channel to carry out fraudulent activities with high impact, yet no fraud detection model has been implemented to detect such activities. This paper investigates a fraud detection model for Unstructured Supplementary Service Data based on short-term memory, using machine learning techniques. Statistical features were derived by aggregating customer activities over a short window size, and the best set of features was employed to improve model performance on the selected machine learning classifiers. Based on the results obtained, the proposed fraud detection model demonstrated that, with the appropriate machine learning techniques for USSD, the best performance was achieved by Random Forest, with a result of 100% across all performance measures; KNeighbors was second, averaging 99% across all performance measures; and Gradient Boosting was third, with an accuracy of 91.94%, a precision of 86%, a recall of 100%, and an F1 score of 92.54%. The results show that two of the selected machine learning techniques, Random Forest and Decision Tree, are the best fit for fraud detection in this model. With the right features derived and an appropriate machine learning algorithm, the proposed model offers the best fraud detection accuracy.
16

Li, Xilin, Frank H. F. Leung, Steven Su, and Sai Ho Ling. "Sleep Apnea Detection Using Multi-Error-Reduction Classification System with Multiple Bio-Signals." Sensors 22, no. 15 (July 25, 2022): 5560. http://dx.doi.org/10.3390/s22155560.

Abstract:
Introduction: Obstructive sleep apnea (OSA) can cause serious health problems such as hypertension or cardiovascular disease. The manual detection of apnea is a time-consuming task, and automatic diagnosis is much more desirable. The contribution of this work is the detection of OSA using a multi-error-reduction (MER) classification system with multi-domain features from bio-signals. Methods: Time-domain, frequency-domain, and non-linear analysis features are extracted from oxygen saturation (SaO2), ECG, airflow, thoracic, and abdominal signals. To analyse the significance of each feature, we design a two-stage feature selection: Stage 1 is the statistical analysis stage, and Stage 2 is the final feature subset selection stage using machine learning methods. In Stage 1, two statistical analyses (the one-way analysis of variance (ANOVA) and the rank-sum test) provide a list of the significance level of each kind of feature. In Stage 2, the support vector machine (SVM) algorithm is used to select a final feature subset based on the significance list. Next, an MER classification system is constructed, which applies stacking with a structure that consists of base learners and an artificial neural network (ANN) meta-learner. Results: The Sleep Heart Health Study (SHHS) database is used to provide bio-signals, and a total of 66 features are extracted. In the experiment involving a duration parameter, 19 features are selected as the final feature subset because they provide better and more stable performance. The SVM model shows good performance (accuracy = 81.68%, sensitivity = 97.05%, and specificity = 66.54%). It is also found that classifiers perform poorly when predicting normal events shorter than 60 s. In the next experimental stage, the time-window segmentation method with a length of 60 s is used. After the two-stage feature selection procedure, 48 features are selected as the final feature subset, giving good performance (accuracy = 90.80%, sensitivity = 93.95%, and specificity = 83.82%). For the classification, Gradient Boosting, CatBoost, LightGBM, and XGBoost are used as base learners, and the ANN is used as the meta-learner. The MER classification system achieves an accuracy of 94.66%, a sensitivity of 96.37%, and a specificity of 90.83%.
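
The stacking structure can be sketched with scikit-learn; here sklearn-native gradient boosting classifiers stand in for the CatBoost/LightGBM/XGBoost base learners named in the abstract, and the meta-learner size is an assumption.

```python
from sklearn.ensemble import (GradientBoostingClassifier,
                              HistGradientBoostingClassifier,
                              StackingClassifier)
from sklearn.neural_network import MLPClassifier

stack = StackingClassifier(
    estimators=[
        ("gb", GradientBoostingClassifier()),
        ("hgb", HistGradientBoostingClassifier()),
    ],
    final_estimator=MLPClassifier(hidden_layer_sizes=(32,), max_iter=500),
    cv=5,  # out-of-fold base-learner predictions feed the meta-learner
)
# stack.fit(X_train, y_train); stack.score(X_test, y_test)
```
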
17

Abedi, Vida, Venkatesh Avula, Durgesh Chaudhary, Shima Shahjouei, Ayesha Khan, Christoph J. Griessenauer, Jiang Li, and Ramin Zand. "Prediction of Long-Term Stroke Recurrence Using Machine Learning Models." Journal of Clinical Medicine 10, no. 6 (March 20, 2021): 1286. http://dx.doi.org/10.3390/jcm10061286.

Abstract:
Background: The long-term risk of recurrent ischemic stroke, estimated to be between 17% and 30%, cannot be reliably assessed at an individual level. Our goal was to study whether machine-learning can be trained to predict stroke recurrence and identify key clinical variables and assess whether performance metrics can be optimized. Methods: We used patient-level data from electronic health records, six interpretable algorithms (Logistic Regression, Extreme Gradient Boosting, Gradient Boosting Machine, Random Forest, Support Vector Machine, Decision Tree), four feature selection strategies, five prediction windows, and two sampling strategies to develop 288 models for up to 5-year stroke recurrence prediction. We further identified important clinical features and different optimization strategies. Results: We included 2091 ischemic stroke patients. Model area under the receiver operating characteristic (AUROC) curve was stable for prediction windows of 1, 2, 3, 4, and 5 years, with the highest score for the 1-year (0.79) and the lowest score for the 5-year prediction window (0.69). A total of 21 (7%) models reached an AUROC above 0.73 while 110 (38%) models reached an AUROC greater than 0.7. Among the 53 features analyzed, age, body mass index, and laboratory-based features (such as high-density lipoprotein, hemoglobin A1c, and creatinine) had the highest overall importance scores. The balance between specificity and sensitivity improved through sampling strategies. Conclusion: All of the selected six algorithms could be trained to predict the long-term stroke recurrence and laboratory-based variables were highly associated with stroke recurrence. The latter could be targeted for personalized interventions. Model performance metrics could be optimized, and models can be implemented in the same healthcare system as intelligent decision support for targeted intervention.
18

Schonlau, Matthias. "Boosted Regression (Boosting): An Introductory Tutorial and a Stata Plugin." Stata Journal: Promoting communications on statistics and Stata 5, no. 3 (September 2005): 330–54. http://dx.doi.org/10.1177/1536867x0500500304.

Abstract:
Boosting, or boosted regression, is a recent data-mining technique that has shown considerable success in predictive accuracy. This article gives an overview of boosting and introduces a new Stata command, boost, that implements the boosting algorithm described in Hastie, Tibshirani, and Friedman (2001, 322). The plugin is illustrated with a Gaussian and a logistic regression example. In the Gaussian regression example, the R2 value computed on a test dataset is R2 = 21.3% for linear regression and R2 = 93.8% for boosting. In the logistic regression example, stepwise logistic regression correctly classifies 54.1% of the observations in a test dataset versus 76.0% for boosted logistic regression. Currently, boost accommodates Gaussian (normal), logistic, and Poisson boosted regression. boost is implemented as a Windows C++ plugin.
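
The Gaussian example in the abstract uses the authors' Stata command; a Python analogue of the same kind of comparison, measuring test-set R2 for linear and boosted regression on a synthetic nonlinear dataset, looks like this:

```python
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = make_friedman1(n_samples=2000, noise=1.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Boosting typically wins by a wide margin on such nonlinear targets.
for model in (LinearRegression(), GradientBoostingRegressor(random_state=0)):
    r2 = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(type(model).__name__, round(r2, 3))
```
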
19

Azeez, Nureni Ayofe, Oluwanifise Ebunoluwa Odufuwa, Sanjay Misra, Jonathan Oluranti, and Robertas Damaševičius. "Windows PE Malware Detection Using Ensemble Learning." Informatics 8, no. 1 (February 10, 2021): 10. http://dx.doi.org/10.3390/informatics8010010.

Abstract:
In this Internet age, there are increasingly many threats to the security and safety of users daily. One such threat is malicious software, otherwise known as malware (ransomware, Trojans, viruses, etc.). The effect of this threat can be the loss or malicious replacement of important information (such as bank account details). Malware creators have been able to bypass traditional methods of malware detection, which can be time-consuming and unreliable for unknown malware. This motivates the need for intelligent ways to detect malware, especially new malware that has not been evaluated or studied before. Machine learning provides an intelligent way to detect malware and comprises two stages: feature extraction and classification. This study suggests an ensemble learning-based method for malware detection. The base-stage classification is done by a stacked ensemble of fully connected and one-dimensional convolutional neural networks (CNNs), whereas the end-stage classification is done by a machine learning algorithm. For the meta-learner, we analyzed and compared 15 machine learning classifiers. For comparison, five machine learning algorithms were used: naïve Bayes, decision tree, random forest, gradient boosting, and AdaBoost. The results of experiments on the Windows Portable Executable (PE) malware dataset are presented. The best results were obtained by an ensemble of seven neural networks with the ExtraTrees classifier as the final-stage classifier.
20

Aurangzeb, Sana, Rao Naveed Bin Rais, Muhammad Aleem, Muhammad Arshad Islam, and Muhammad Azhar Iqbal. "On the classification of Microsoft-Windows ransomware using hardware profile." PeerJ Computer Science 7 (February 2, 2021): e361. http://dx.doi.org/10.7717/peerj-cs.361.

Abstract:
Due to the expeditious growth of online service usage, reported incidents of ransomware proliferation are on the rise. Ransomware is a more hazardous threat than other malware, as the victim cannot regain access to the hijacked device until some form of compensation is paid. In the literature, several dynamic analysis techniques have been employed for the detection of malware, including ransomware; however, to the best of our knowledge, the hardware execution profile has not, as of today, been investigated for ransomware analysis. In this study, we show that the true execution picture obtained via a hardware execution profile is beneficial for identifying obfuscated ransomware as well. We evaluate features obtained from hardware performance counters to classify malicious applications into ransomware and non-ransomware categories using several machine learning algorithms, such as Random Forest, Decision Tree, Gradient Boosting, and Extreme Gradient Boosting. The employed data set comprises 80 ransomware and 80 non-ransomware applications collected using the VirusShare platform. The results revealed that the extracted hardware features play a substantial part in the identification and detection of ransomware, with an F-measure score of 0.97 achieved by Random Forest and Extreme Gradient Boosting.
21

Zhou, Ke, Hailei Liu, Xiaobo Deng, Hao Wang, and Shenglan Zhang. "Comparison of Machine-Learning Algorithms for Near-Surface Air-Temperature Estimation from FY-4A AGRI Data." Advances in Meteorology 2020 (October 6, 2020): 1–14. http://dx.doi.org/10.1155/2020/8887364.

Abstract:
Six machine-learning approaches, including multivariate linear regression (MLR), gradient boosting decision tree, k-nearest neighbors, random forest, extreme gradient boosting (XGB), and deep neural network (DNN), were compared for near-surface air-temperature (Tair) estimation from the new generation of Chinese geostationary meteorological satellite Fengyun-4A (FY-4A) observations. The brightness temperatures in split-window channels from the Advanced Geostationary Radiation Imager (AGRI) of FY-4A and numerical weather prediction data from the global forecast system were used as the predictor variables for Tair estimation. The performance of each model and the temporal and spatial distribution of the estimated Tair errors were analyzed. The results showed that the XGB model had better overall performance, with R2 of 0.902, bias of −0.087°C, and root-mean-square error of 1.946°C. The spatial variation characteristics of the Tair error of the XGB method were less obvious than those of the other methods. The XGB model can provide more stable and high-precision Tair for a large-scale Tair estimation over China and can serve as a reference for Tair estimation based on machine-learning models.
22

Balasso, Paolo, Giorgio Marchesini, Nicola Ughelini, Lorenzo Serva, and Igino Andrighetto. "Machine Learning to Detect Posture and Behavior in Dairy Cows: Information from an Accelerometer on the Animal’s Left Flank." Animals 11, no. 10 (October 15, 2021): 2972. http://dx.doi.org/10.3390/ani11102972.

Abstract:
The aim of the present study was to develop a model to identify posture and behavior from data collected by a triaxial accelerometer located on the left flank of dairy cows and evaluate its accuracy and precision. Twelve Italian Red-and-White lactating cows were equipped with an accelerometer and observed on average for 136 ± 29 min per cow by two trained operators as a reference. The acceleration data were grouped in time windows of 8 s overlapping by 33.0%, for a total of 35,133 rows. For each row, 32 different features were extracted and used by machine learning algorithms for the classification of posture and behavior. To build up a predictive model, the dataset was split in training and testing datasets, characterized by 75.0 and 25.0% of the observations, respectively. Four algorithms were tested: Random Forest, K Nearest Neighbors, Extreme Boosting Algorithm (XGB), and Support Vector Machine. The XGB model showed the best accuracy (0.99) and Cohen’s kappa (0.99) in predicting posture, whereas the Random Forest model had the highest overall accuracy in predicting behaviors (0.76), showing a balanced accuracy from 0.96 for resting to 0.77 for moving. Overall, very accurate detection of the posture and resting behavior were achieved.
23

Kurek, Jarosław, Artur Krupa, Izabella Antoniuk, Arlan Akhmet, Ulan Abdiomar, Michał Bukowski, and Karol Szymanowski. "Improved Drill State Recognition during Milling Process Using Artificial Intelligence." Sensors 23, no. 1 (January 1, 2023): 448. http://dx.doi.org/10.3390/s23010448.

Abstract:
In this article, an automated method for tool condition monitoring is presented. When producing items in large quantities, pinpointing the exact time when the tool needs to be exchanged is crucial. If it is exchanged too early, the operator discards a still-usable drill, and production downtime increases if this operation is repeated too often. On the other hand, continuing production with a worn tool may result in a poor-quality product and financial loss for the manufacturer. In the presented approach, drill wear is classified using three states of decreasing quality: green, yellow, and red. A series of signals was collected as training data for the classification algorithms, and the measurements were saved in separate data sets with corresponding time windows. A total of ten methods were evaluated in terms of overall accuracy and the number of misclassification errors. Three solutions obtained an acceptable accuracy rate above 85% and were able to assign states without the most undesirable red-green and green-red errors. The best results were achieved by the Extreme Gradient Boosting algorithm, with an overall accuracy of 93.33%; its only misclassification was a yellow sample assigned as green. The presented solution achieves good results and can be applied in industrial applications related to tool condition monitoring.
24

Chi, Yufeng, Zhifeng Wu, Kuo Liao, and Yin Ren. "Handling Missing Data in Large-Scale MODIS AOD Products Using a Two-Step Model." Remote Sensing 12, no. 22 (November 18, 2020): 3786. http://dx.doi.org/10.3390/rs12223786.

Abstract:
Aerosol optical depth (AOD) is a key parameter that reflects the characteristics of aerosols and is of great help in predicting the concentration of pollutants in the atmosphere. At present, remote sensing inversion has become an important method for obtaining AOD on a large scale. However, AOD data acquired by satellites are often missing, and recovering them has gradually become a popular research topic. In recent years, a large number of AOD recovery algorithms have been proposed, but many are not application-oriented: they focus mainly on the accuracy of AOD recovery and neglect the recovery ratio, so the two cannot be balanced. To solve these problems, a two-step model (TWS) that combines multisource AOD data and AOD spatiotemporal relationships is proposed. We used the light gradient boosting machine (LightGBM) model, under the gradient boosting machine (GBM) framework, to fit the multisource AOD data and fill in the missing AOD between data sources. Because spatial and spatiotemporal interpolation methods are limited by buffer factors, we recovered the missing AOD in a moving window. We used TWS to recover AOD from Terra Satellite's 2018 AOD product (MOD AOD). The results show that the MOD AOD, after a 3 × 3 moving-window TWS recovery, was closely related to the AOD of the Aerosol Robotic Network (AERONET) (R = 0.87, RMSE = 0.23). In addition, the MOD AOD missing rate after a 3 × 3 window TWS recovery was greatly reduced (from 0.88 to 0.1), and the spatial distribution characteristics of the monthly and annual averages of the recovered MOD AOD were consistent with the original MOD AOD. The results show that TWS is reliable. This study provides a new method for the restoration of MOD AOD and is of great significance for studying the spatial distribution of atmospheric pollutants.
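
The first (gap-filling) step of such a two-step model can be sketched with LightGBM; the DataFrame and column names below are hypothetical:

```python
import lightgbm as lgb

FEATURES = ["aux_aod1", "aux_aod2", "lat", "lon", "doy"]  # hypothetical columns

def fill_missing_aod(df):
    """Fit LightGBM where MOD AOD is observed; predict where it is missing."""
    observed = df[df["mod_aod"].notna()]
    missing = df[df["mod_aod"].isna()]
    model = lgb.LGBMRegressor(n_estimators=500, learning_rate=0.05)
    model.fit(observed[FEATURES], observed["mod_aod"])
    df.loc[missing.index, "mod_aod"] = model.predict(missing[FEATURES])
    return df
```
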
25

Li, Chunhua, Lizhi Zhou, and Wenbin Xu. "Estimating Aboveground Biomass Using Sentinel-2 MSI Data and Ensemble Algorithms for Grassland in the Shengjin Lake Wetland, China." Remote Sensing 13, no. 8 (April 20, 2021): 1595. http://dx.doi.org/10.3390/rs13081595.

Abstract:
Wetland vegetation aboveground biomass (AGB) directly indicates wetland ecosystem health and is critical for water purification, the carbon cycle, and biodiversity conservation. Accurate AGB estimation is essential for the monitoring and supervision of ecosystems, especially in seasonal floodplain wetlands. This paper explored the capability of spectral and texture features from the Sentinel-2 Multispectral Instrument (MSI) for modeling grassland AGB using random forest (RF) and extreme gradient boosting (XGBoost) algorithms in the Shengjin Lake wetland (a Ramsar site). We used five-fold cross-validation to verify model effectiveness. The results indicated that the RF and XGBoost models had robust and efficient performance (root mean square error (RMSE) of 126.571 g·m−2 and R2 of 0.844 for RF; RMSE of 112.425 g·m−2 and R2 of 0.869 for XGBoost), with the XGBoost models performing better. Both traditional and red-edge vegetation indices (VIs) yielded satisfactory AGB estimates (RMSE = 127.936 g·m−2 and RMSE = 125.879 g·m−2, respectively, in the XGBoost models), with the red-edge VIs contributing more to the AGB models. Moreover, we selected eight gray-level co-occurrence matrix (GLCM) textures, calculated over four processing window sizes using the mean value of four offsets, and further analyzed the results of three analysis sets. Textures derived from the traditional and red-edge bands using a 7 × 7 window size performed better in biomass estimation, suggesting that textures derived from the traditional bands are as important as those from the red-edge bands. The introduction of textures moderately improved the accuracy of the AGB models, whereas the use of textures alone was not satisfactory. This research demonstrated that using the Sentinel-2 MSI with these two ensemble algorithms is an effective method for the long-term dynamic monitoring and assessment of grass AGB in seasonal floodplain wetlands, which can support sustainable management and carbon accounting of wetland ecosystems.
26

Xu, Haoran, Wei Yan, Ke Lan, Chenbin Ma, Di Wu, Anshuo Wu, Zhicheng Yang, et al. "Assessing Electrocardiogram and Respiratory Signal Quality of a Wearable Device (SensEcho): Semisupervised Machine Learning-Based Validation Study." JMIR mHealth and uHealth 9, no. 8 (August 12, 2021): e25415. http://dx.doi.org/10.2196/25415.

Abstract:
Background With the development and promotion of wearable devices and their mobile health (mHealth) apps, physiological signals have become a research hotspot. However, noise is complex in signals obtained from daily lives, making it difficult to analyze the signals automatically and resulting in a high false alarm rate. At present, screening out the high-quality segments of the signals from huge-volume data with few labels remains a problem. Signal quality assessment (SQA) is essential and is able to advance the valuable information mining of signals. Objective The aims of this study were to design an SQA algorithm based on the unsupervised isolation forest model to classify the signal quality into 3 grades: good, acceptable, and unacceptable; validate the algorithm on labeled data sets; and apply the algorithm on real-world data to evaluate its efficacy. Methods Data used in this study were collected by a wearable device (SensEcho) from healthy individuals and patients. The observation windows for electrocardiogram (ECG) and respiratory signals were 10 and 30 seconds, respectively. In the experimental procedure, the unlabeled training set was used to train the models. The validation and test sets were labeled according to preset criteria and used to evaluate the classification performance quantitatively. The validation set consisted of 3460 and 2086 windows of ECG and respiratory signals, respectively, whereas the test set was made up of 4686 and 3341 windows of signals, respectively. The algorithm was also compared with self-organizing maps (SOMs) and 4 classic supervised models (logistic regression, random forest, support vector machine, and extreme gradient boosting). One case validation was illustrated to show the application effect. The algorithm was then applied to 1144 cases of ECG signals collected from patients and the detected arrhythmia false alarms were calculated. Results The quantitative results showed that the ECG SQA model achieved 94.97% and 95.58% accuracy on the validation and test sets, respectively, whereas the respiratory SQA model achieved 81.06% and 86.20% accuracy on the validation and test sets, respectively. The algorithm was superior to SOM and achieved moderate performance when compared with the supervised models. The example case showed that the algorithm was able to correctly classify the signal quality even when there were complex pathological changes in the signals. The algorithm application results indicated that some specific types of arrhythmia false alarms such as tachycardia, atrial premature beat, and ventricular premature beat could be significantly reduced with the help of the algorithm. Conclusions This study verified the feasibility of applying the anomaly detection unsupervised model to SQA. The application scenarios include reducing the false alarm rate of the device and selecting signal segments that can be used for further research.
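
A minimal sketch of isolation-forest signal quality grading as described, with hypothetical window-feature matrices and score thresholds:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def grade_windows(train_windows, new_windows,
                  good_cut=-0.45, acceptable_cut=-0.55):
    """Train on unlabeled window features; map anomaly scores to 3 grades."""
    iso = IsolationForest(n_estimators=200, random_state=0).fit(train_windows)
    score = iso.score_samples(new_windows)  # higher = more "normal"
    return np.select([score > good_cut, score > acceptable_cut],
                     ["good", "acceptable"], default="unacceptable")
```
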
27

Tsiklidis, Evan J., Talid Sinno, and Scott L. Diamond. "Predicting risk for trauma patients using static and dynamic information from the MIMIC III database." PLOS ONE 17, no. 1 (January 19, 2022): e0262523. http://dx.doi.org/10.1371/journal.pone.0262523.

Abstract:
Risk quantification algorithms in the ICU can provide (1) an early alert to the clinician that a patient is at extreme risk and (2) help manage limited resources efficiently or remotely. With electronic health records, large data sets allow the training of predictive models to quantify patient risk. A gradient boosting classifier was trained to predict high-risk and low-risk trauma patients, where patients were labeled high-risk if they expired within the next 10 hours or within the last 10% of their ICU stay duration. The MIMIC-III database was filtered to extract 5,400 trauma patient records (526 non-survivors) each of which contained 5 static variables (age, gender, etc.) and 28 dynamic variables (e.g., vital signs and metabolic panel). Training data was also extracted from the dynamic variables using a 3-hour moving time window whereby each window was treated as a unique patient-time fragment. We extracted the mean, standard deviation, and skew from each of these 3-hour fragments and included them as inputs for training. Additionally, a survival metric upon admission was calculated for each patient using a previously developed National Trauma Data Bank (NTDB)-trained gradient booster model. The final model was able to distinguish between high-risk and low-risk patients to an AUROC of 92.9%, defined as the area under the receiver operator characteristic curve. Importantly, the dynamic survival probability plots for patients who die appear considerably different from those who survive, an example of reducing the high dimensionality of the patient record to a single trauma trajectory.
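
The 3-hour fragment summaries can be sketched with pandas, assuming a DataFrame of dynamic variables indexed by charting time:

```python
import pandas as pd

def fragment_features(vitals: pd.DataFrame) -> pd.DataFrame:
    """Mean, standard deviation, and skew over a trailing 3-hour window."""
    rolled = vitals.rolling("3h").agg(["mean", "std", "skew"])
    rolled.columns = ["_".join(c) for c in rolled.columns]  # flatten names
    return rolled.dropna()
```
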
28

Soler, Santiago R., and Leonardo Uieda. "Gradient-boosted equivalent sources." Geophysical Journal International 227, no. 3 (August 24, 2021): 1768–83. http://dx.doi.org/10.1093/gji/ggab297.

Abstract:
SUMMARY The equivalent source technique is a powerful and widely used method for processing gravity and magnetic data. Nevertheless, its major drawback is the large computational cost in terms of processing time and computer memory. We present two techniques for reducing the computational cost of equivalent source processing: block-averaging source locations and the gradient-boosted equivalent source algorithm. Through block-averaging, we reduce the number of source coefficients that must be estimated while retaining the minimum desired resolution in the final processed data. With the gradient-boosting method, we estimate the source coefficients in small batches along overlapping windows, allowing us to reduce the computer memory requirements arbitrarily to conform to the constraints of the available hardware. We show that the combination of block-averaging and gradient-boosted equivalent sources is capable of producing accurate interpolations through tests against synthetic data. Moreover, we demonstrate the feasibility of our method by gridding a gravity data set covering Australia with over 1.7 million observations using a modest personal computer.
29

Wang, Ying, Zhicheng Du, Wayne R. Lawrence, Yun Huang, Yu Deng, and Yuantao Hao. "Predicting Hepatitis B Virus Infection Based on Health Examination Data of Community Population." International Journal of Environmental Research and Public Health 16, no. 23 (December 2, 2019): 4842. http://dx.doi.org/10.3390/ijerph16234842.

Abstract:
Despite a decline in the prevalence of hepatitis B in China, the disease burden remains high. Large populations unaware of infection risk often fail to meet the ideal treatment window, resulting in poor prognosis. The purpose of this study was to develop and evaluate models identifying high-risk populations who should be tested for hepatitis B surface antigen. Data came from a large community-based health screening, including 97,173 individuals with an average age of 54.94 years. A total of 33 indicators were collected as model predictors, including demographic characteristics, routine blood indicators, and liver function. The Borderline-Synthetic Minority Oversampling Technique (Borderline-SMOTE) was conducted to preprocess the data, and then four predictive models, namely the extreme gradient boosting (XGBoost), random forest (RF), decision tree (DT), and logistic regression (LR) algorithms, were developed. The positive rate of hepatitis B surface antigen (HBsAg) was 8.27%. The areas under the receiver operating characteristic curves for the XGBoost, RF, DT, and LR models were 0.779, 0.752, 0.619, and 0.742, respectively. The Borderline-SMOTE XGBoost combined model outperformed the other models, correctly predicting 13,637/19,435 cases (sensitivity 70.8%, specificity 70.1%), and the variable importance plot of the XGBoost model indicated that age was of high importance. The prediction model can be used to accurately identify populations at high risk of hepatitis B infection that should adopt timely, appropriate medical treatment measures.
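A minimal sketch of the Borderline-SMOTE-plus-XGBoost combination, built from imbalanced-learn and xgboost on synthetic data; the class balance, feature count, and hyperparameters are assumptions that only mimic the setting, not the study's configuration.

```python
from imblearn.over_sampling import BorderlineSMOTE
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Imbalanced stand-in for the screening data (~8% positives, 33 indicators).
X, y = make_classification(n_samples=5000, n_features=33, weights=[0.92], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Oversample only the training fold, then fit the booster.
X_res, y_res = BorderlineSMOTE(random_state=0).fit_resample(X_tr, y_tr)
model = XGBClassifier(n_estimators=300, max_depth=4, eval_metric="logloss")
model.fit(X_res, y_res)
print("AUROC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```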
30

Xiong, Pan, Cheng Long, Huiyu Zhou, Roberto Battiston, Xuemin Zhang, and Xuhui Shen. "Identification of Electromagnetic Pre-Earthquake Perturbations from the DEMETER Data by Machine Learning." Remote Sensing 12, no. 21 (November 6, 2020): 3643. http://dx.doi.org/10.3390/rs12213643.

Abstract:
The low-altitude satellite DEMETER recorded many cases of ionospheric perturbations observed on occasion of large seismic events. In this paper, we explore 16 spot-checking classification algorithms, among which the top-performing classifier, applied to low-frequency power spectra of electric and magnetic fields, was used for ionospheric perturbation analysis. This study included the analysis of satellite data spanning over six years, during which about 8760 earthquakes with magnitude greater than or equal to 5.0 occurred in the world. We discover that among these methods, a gradient boosting-based method called LightGBM outperforms the others and achieves superior performance in a five-fold cross-validation test on the benchmarking datasets, showing a strong capability in discriminating electromagnetic pre-earthquake perturbations. The results show that electromagnetic pre-earthquake data restricted to a circular region centered on the epicenter, with radius given by Dobrovolsky’s formula, and to a time window of about a few hours before the shocks discriminate electromagnetic pre-earthquake perturbations much better. Moreover, by investigating different earthquake databases, we confirm that some low-frequency electric and magnetic field frequency bands are the dominant features for the identification of electromagnetic pre-earthquake perturbations. We have also found that the choice of the geographical region used to simulate the training set of non-seismic data influences, to a certain extent, the performance of the LightGBM model, by reducing its capability in discriminating electromagnetic pre-earthquake perturbations.
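A five-fold cross-validated spot-check of LightGBM, in outline, might look like the sketch below; the synthetic matrix stands in for the low-frequency power-spectra features, and the scoring metric is an assumption.

```python
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Stand-in for power-spectra features of the electric and magnetic field bands.
X, y = make_classification(n_samples=2000, n_features=60, random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LGBMClassifier(n_estimators=400), X, y, cv=cv, scoring="roc_auc")
print(f"5-fold AUC: {scores.mean():.3f} +/- {scores.std():.3f}")
```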
31

Tekin, Ahmet Tezcan, and Ferhan Çebi. "Click and sales prediction for OTAs’ digital advertisements: Fuzzy clustering based approach." Journal of Intelligent & Fuzzy Systems 39, no. 5 (November 19, 2020): 6619–27. http://dx.doi.org/10.3233/jifs-189123.

Abstract:
Online travel agencies (OTAs) aim to use digital media advertising as the most productive route to expanding their market share, and metasearch engine platforms are among the digital media environments OTAs use most consistently. Most OTAs run daily pay-per-click campaigns for each hotel on metasearch platforms to obtain reservations, so managing bidding strategies is critical for reducing costs and increasing revenue. In this study, we predict the number of impressions, the daily Click-Through Rate (CTR) level of each hotel's advertising, and the daily sales amount. A significant contribution of our research is an extended dataset generated by integrating the most informative features used in related studies, such as rolling averages over different numbers of days and shifted (lagged) values, into the proposed prediction stages for CTR, impressions, and sales. The data were provided by one of Turkey's largest OTAs, so we present a genuine application for OTAs. The results at each prediction stage show that enriching the training data with the most informative OTA-specific additional features and sliding-window techniques improves the prediction models' generalization capability, and that tree-based boosting algorithms achieve the best results on this problem. Clustering the dataset according to its characteristics further improves the predictions.
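The rolling-average and shifted-value features mentioned above follow a standard pandas pattern; the sketch assumes a daily clicks series and arbitrary lag and window choices, and shifts each rolling window by one day so it only sees past values (no target leakage).

```python
import pandas as pd

def add_time_features(df, col, lags=(1, 7), windows=(3, 7)):
    """Shifted values and rolling means of a daily series, the kind of
    OTA-specific enrichment described in the abstract."""
    out = df.copy()
    for k in lags:
        out[f"{col}_lag{k}"] = out[col].shift(k)
    for w in windows:
        # shift(1) keeps the current day's value out of its own window
        out[f"{col}_rollmean{w}"] = out[col].shift(1).rolling(w).mean()
    return out.dropna()

daily = pd.DataFrame({"clicks": [12, 15, 11, 18, 20, 17, 22, 25, 19, 23]})
print(add_time_features(daily, "clicks"))
```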
32

Thapa, Rahul, Anurag Garikipati, Sepideh Shokouhi, Myrna Hurtado, Gina Barnes, Jana Hoffman, Jacob Calvert, Lynne Katzmann, Qingqing Mao, and Ritankar Das. "Predicting Falls in Long-term Care Facilities: Machine Learning Study." JMIR Aging 5, no. 2 (April 1, 2022): e35373. http://dx.doi.org/10.2196/35373.

Abstract:
Background: Short-term fall prediction models that use electronic health records (EHRs) may enable the implementation of dynamic care practices that specifically address changes in individualized fall risk within senior care facilities.
Objective: The aim of this study is to implement machine learning (ML) algorithms that use EHR data to predict a 3-month fall risk in residents from a variety of senior care facilities providing different levels of care.
Methods: This retrospective study obtained EHR data (2007-2021) from Juniper Communities’ proprietary database of 2785 individuals primarily residing in skilled nursing facilities, independent living facilities, and assisted living facilities across the United States. We assessed the performance of 3 ML-based fall prediction models and the Juniper Communities’ fall risk assessment. Additional analyses were conducted to examine how changes in the input features, training data sets, and prediction windows affected the performance of these models.
Results: The Extreme Gradient Boosting model exhibited the highest performance, with an area under the receiver operating characteristic curve of 0.846 (95% CI 0.794-0.894), specificity of 0.848, diagnostic odds ratio of 13.40, and sensitivity of 0.706, while achieving the best trade-off in balancing true positive and negative rates. The number of active medications was the most significant feature associated with fall risk, followed by a resident’s number of active diseases and several variables associated with vital signs, including diastolic blood pressure and changes in weight and respiratory rates. The combination of vital signs with traditional risk factors as input features achieved higher prediction accuracy than using either group of features alone.
Conclusions: This study shows that the Extreme Gradient Boosting technique can use a large number of features from EHR data to make short-term fall predictions with a better performance than that of conventional fall risk assessments and other ML models. The integration of routinely collected EHR data, particularly vital signs, into fall prediction models may generate more accurate fall risk surveillance than models without vital signs. Our data support the use of ML models for dynamic, cost-effective, and automated fall predictions in different types of senior care facilities.
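For readers less familiar with the diagnostic odds ratio reported above, the helper below derives sensitivity, specificity, and DOR from a confusion matrix; the toy counts are chosen only to roughly reproduce the reported operating point, not taken from the study.

```python
from sklearn.metrics import confusion_matrix

def diagnostic_summary(y_true, y_pred):
    """Sensitivity, specificity, and diagnostic odds ratio from a 2x2 confusion matrix."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    dor = (tp * tn) / (fp * fn)   # equivalently (sens/(1-sens)) / ((1-spec)/spec)
    return sensitivity, specificity, dor

# Counts picked to land near the reported sensitivity 0.706 / specificity 0.848
y_true = [1] * 100 + [0] * 500
y_pred = [1] * 71 + [0] * 29 + [1] * 76 + [0] * 424
print(diagnostic_summary(y_true, y_pred))   # ~ (0.71, 0.848, 13.7)
```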
33

Gašparović, Mateo, and Dino Dobrinić. "Green Infrastructure Mapping in Urban Areas Using Sentinel-1 Imagery." Croatian journal of forest engineering 42, no. 2 (March 12, 2021): 337–56. http://dx.doi.org/10.5552/crojfe.2021.859.

Abstract:
High temporal resolution of synthetic aperture radar (SAR) imagery (e.g., Sentinel-1 (S1) imagery) creates new possibilities for monitoring green vegetation in urban areas and generating land-cover classification (LCC) maps. This research evaluates how different pre-processing steps of SAR imagery affect classification accuracy. Machine learning (ML) methods were applied in three different study areas: random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB). Since the presence of speckle noise in radar imagery is inevitable, different adaptive filters were examined. Using the backscattering values of the S1 imagery, the SVM classifier achieved a mean overall accuracy (OA) of 63.14% and a Kappa coefficient (Kappa) of 0.50. Using the SVM classifier with a Lee filter with a window size of 5×5 (Lee5) for speckle reduction, mean values of 73.86% and 0.64 for OA and Kappa were achieved, respectively. An additional increase in LCC accuracy was obtained with texture features calculated from a grey-level co-occurrence matrix (GLCM). The highest classification accuracy, obtained for the extracted GLCM texture features using the SVM classifier and the Lee5 filter, was 78.32% and 0.69 for the mean OA and Kappa values, respectively. This study improved LCC with an evaluation of various radiometric and texture features and confirmed the ability to apply an SVM classifier. For the supervised classification, the SVM method outperformed the RF and XGB methods, although the SVM needed the highest computational time, whereas XGB performed the fastest. These results suggest pre-processing steps of SAR imagery for green infrastructure mapping in urban areas. Future research should address the use of multitemporal SAR data along with the pre-processing steps and ML algorithms described in this research.
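GLCM texture features of the kind evaluated here can be extracted with scikit-image, as sketched below; the quantization to 32 grey levels, the four offsets, and the chosen properties are assumptions (note that scikit-image versions before 0.19 spell the functions greycomatrix/greycoprops).

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(patch, levels=32):
    """Contrast, homogeneity, energy, and correlation from a grey-level
    co-occurrence matrix, averaged over four offset directions."""
    edges = np.linspace(patch.min(), patch.max(), levels)
    q = (np.digitize(patch, edges) - 1).astype(np.uint8)   # quantize to 0..levels-1
    glcm = graycomatrix(q, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=levels, symmetric=True, normed=True)
    return {p: graycoprops(glcm, p).mean()
            for p in ("contrast", "homogeneity", "energy", "correlation")}

patch = np.random.default_rng(0).uniform(0, 1, (64, 64))   # stand-in for a SAR window
print(glcm_features(patch))
```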
34

Sun, Haoyang, Joel Koo, Borame L. Dickens, Hannah E. Clapham, and Alex R. Cook. "Short-term and long-term epidemiological impacts of sustained vector control in various dengue endemic settings: A modelling study." PLOS Computational Biology 18, no. 4 (April 1, 2022): e1009979. http://dx.doi.org/10.1371/journal.pcbi.1009979.

Abstract:
As the most widespread viral infection transmitted by the Aedes mosquitoes, dengue has been estimated to cause 51 million febrile disease cases globally each year. Although sustained vector control remains key to reducing the burden of dengue, current understanding of the key factors that explain the observed variation in the short- and long-term vector control effectiveness across different transmission settings remains limited. We used a detailed individual-based model to simulate dengue transmission with and without sustained vector control over a 30-year time frame, under different transmission scenarios. Vector control effectiveness was derived for different time windows within the 30-year intervention period. We then used the extreme gradient boosting algorithm to predict the effectiveness of vector control given the simulation parameters, and the resulting machine learning model was interpreted using Shapley Additive Explanations. According to our simulation outputs, dengue transmission would be nearly eliminated during the early stage of sustained and intensive vector control, but over time incidence would gradually bounce back to the pre-intervention level unless the intervention is implemented at a very high level of intensity. The time point at which intervention ceases to be effective is strongly influenced not only by the intensity of vector control, but also by the pre-intervention transmission intensity and the individual-level heterogeneity in biting risk. Moreover, the impact of many transmission model parameters on the intervention effectiveness is shown to be modified by the intensity of vector control, as well as to vary over time. Our study has identified some of the critical drivers for the difference in the time-varying effectiveness of sustained vector control across different dengue endemic settings, and the insights obtained will be useful to inform future model-based studies that seek to predict the impact of dengue vector control in their local contexts.
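The surrogate-model step (fit a booster to the simulator's outputs, then explain it) follows a common XGBoost-plus-SHAP pattern, sketched below on synthetic regression data; the stand-in features and settings are assumptions, not the simulation study's setup.

```python
import numpy as np
import shap
import xgboost
from sklearn.datasets import make_regression

# Stand-in for (simulation parameters -> vector-control effectiveness) pairs.
X, y = make_regression(n_samples=1000, n_features=10, noise=5.0, random_state=0)
model = xgboost.XGBRegressor(n_estimators=200).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
# Mean |SHAP| ranks each parameter's influence on predicted effectiveness.
importance = np.abs(shap_values).mean(axis=0)
print(np.argsort(importance)[::-1])
```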
35

Qinghe, Zhao, Xiang Wen, Huang Boyan, Wang Jong, and Fang Junlong. "Optimised extreme gradient boosting model for short term electric load demand forecasting of regional grid system." Scientific Reports 12, no. 1 (November 11, 2022). http://dx.doi.org/10.1038/s41598-022-22024-3.

Abstract:
Load forecasting provides effective and reliable guidance for power construction and grid operation, so it is essential for a power utility to forecast coming energy demand accurately. Advanced machine learning methods can support load forecasting competently, and extreme gradient boosting is an algorithm with great research potential. However, there is comparatively little research that treats the energy time series itself as the only input variable, especially on feature engineering for univariate time series. Hyperparameter tuning is a further issue when applying boosting methods to energy demand, and it often has a larger effect than improving the core of the model. We take the extreme gradient boosting algorithm as the base model and combine it with the Tree-structured Parzen Estimator method to design the TPE-XGBoost model for the high-performance single-lag power load forecasting task. We resample the daily power load data of the Île-de-France Region Grid provided by Réseau de Transport d’Électricité, train and optimise the TPE-XGBoost model on samples from 2016 to 2018, and test and evaluate it on samples from 2019. The optimal window width of the time series data is determined in this study through Discrete Fourier Transform and Pearson Correlation Coefficient methods, and five additional date features are introduced to complete the feature engineering. Over 500 iterations, TPE optimisation selects the values of nine XGBoost hyperparameters and clearly improves the model. On the 2019 dataset, the TPE-XGBoost model we designed achieves an excellent performance of MAE = 166.020 and MAPE = 2.61%. Compared with the base model, the two metrics improve by 14.23% and 14.14%, respectively; compared with eight other machine learning algorithms, the model also achieves the best metrics.
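The paper does not say which TPE implementation it uses; Optuna's default sampler happens to be a Tree-structured Parzen Estimator, so a hedged sketch of TPE-tuned XGBoost could look like this. The search space, trial budget, and data are illustrative (the paper reports 500 iterations over nine hyperparameters).

```python
import optuna
import xgboost as xgb
from sklearn.datasets import make_regression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, n_features=12, noise=10.0, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)

def objective(trial):
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 1000),
        "max_depth": trial.suggest_int("max_depth", 3, 10),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
    }
    model = xgb.XGBRegressor(**params).fit(X_tr, y_tr)
    return mean_absolute_error(y_va, model.predict(X_va))

study = optuna.create_study(direction="minimize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=50)
print(study.best_params)
```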
36

Liu, Lijun, Lan Wang, and Zhen Yu. "Remaining Useful Life Estimation of Aircraft Engines Based on Deep Convolution Neural Network and LightGBM Combination Model." International Journal of Computational Intelligence Systems 14, no. 1 (September 24, 2021). http://dx.doi.org/10.1007/s44196-021-00020-1.

Abstract:
Accurately predicting the remaining useful life (RUL) of aero-engines is of great significance for improving the reliability and safety of aero-engine systems. Because of the high dimensionality and complex features of sensor data in RUL prediction, this paper proposes a model combining deep convolution neural networks (DCNN) and the light gradient boosting machine (LightGBM) algorithm to estimate the RUL. Compared with traditional prognostics and health management (PHM) techniques, neither signal processing of raw sensor data nor prior expertise is required. The procedure is as follows. First, a time window of the raw aero-engine data is used as the input of the DCNN after normalization; the role of the DCNN is to extract information from the input data. Second, considering the limitations of the fully connected layer of the DCNN, we replace it with a stronger classifier, LightGBM, to improve the accuracy of prediction. Finally, to prove the effectiveness of the proposed method, we conducted experiments on the C-MAPSS data set provided by NASA and obtained good accuracy. Compared with other commonly used algorithms on the same data set, the proposed algorithm shows clear advantages.
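Swapping a network's fully connected head for a boosted-tree model usually means training the network first, then fitting the booster on activations from an intermediate layer. The Keras/LightGBM sketch below illustrates that idea on toy data with an assumed architecture; it is not the paper's DCNN.

```python
import numpy as np
from lightgbm import LGBMRegressor
from tensorflow import keras

rng = np.random.default_rng(0)
# Toy stand-in for C-MAPSS: 1000 windows of 30 time steps x 14 sensors.
X = rng.standard_normal((1000, 30, 14)).astype("float32")
y = rng.uniform(0, 125, 1000)                 # remaining-useful-life targets

# 1-D convolutional feature extractor (the DCNN part), trained end to end first.
cnn = keras.Sequential([
    keras.layers.Input((30, 14)),
    keras.layers.Conv1D(32, 5, activation="relu"),
    keras.layers.Conv1D(32, 5, activation="relu"),
    keras.layers.GlobalAveragePooling1D(name="features"),
    keras.layers.Dense(1),
])
cnn.compile(optimizer="adam", loss="mse")
cnn.fit(X, y, epochs=3, verbose=0)

# Replace the dense head with LightGBM trained on the pooled features.
extractor = keras.Model(cnn.input, cnn.get_layer("features").output)
gbm = LGBMRegressor(n_estimators=300).fit(extractor.predict(X, verbose=0), y)
```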
37

Zheng, Ping, Ze Yu, Liren Li, Shiting Liu, Yan Lou, Xin Hao, Peng Yu, et al. "Predicting Blood Concentration of Tacrolimus in Patients With Autoimmune Diseases Using Machine Learning Techniques Based on Real-World Evidence." Frontiers in Pharmacology 12 (September 24, 2021). http://dx.doi.org/10.3389/fphar.2021.727245.

Abstract:
Tacrolimus is a widely used immunosuppressive drug in patients with autoimmune diseases. It has a narrow therapeutic window, thus requiring therapeutic drug monitoring (TDM) to guide the clinical regimen. This study included 193 cases of tacrolimus TDM data in patients with autoimmune diseases at Southern Medical University Nanfang Hospital from June 7, 2018, to December 31, 2020. The study identified nine important variables for tacrolimus concentration using sequential forward selection, including height, tacrolimus daily dose, other immunosuppressants, low-density lipoprotein cholesterol, mean corpuscular volume, mean corpuscular hemoglobin, white blood cell count, direct bilirubin, and hematocrit. The prediction abilities of 14 models based on regression analysis or machine learning algorithms were compared. Ultimately, a prediction model of tacrolimus concentration was established through the eXtreme Gradient Boosting (XGBoost) algorithm, which had the best predictive ability (R2 = 0.54, mean absolute error = 0.25, and root mean square error = 0.33). Then, SHapley Additive exPlanations was used to visually interpret each variable’s impact on tacrolimus concentration. In conclusion, the XGBoost model for predicting the blood concentration of tacrolimus on the basis of real-world evidence has good predictive performance, providing guidance for the adjustment of regimens in clinical practice.
38

Mpanya, Dineo, Turgay Celik, Eric Klug, and Hopewell Ntsinjana. "Predicting in-hospital all-cause mortality in heart failure using machine learning." Frontiers in Cardiovascular Medicine 9 (January 11, 2023). http://dx.doi.org/10.3389/fcvm.2022.1032524.

Abstract:
Background: The age of onset and causes of heart failure differ between high-income and low-and-middle-income countries (LMIC). Heart failure patients in LMIC also experience a higher mortality rate. Innovative ways that can risk stratify heart failure patients in this region are needed. The aim of this study was to demonstrate the utility of machine learning in predicting all-cause mortality in heart failure patients hospitalised in a tertiary academic centre.
Methods: Six supervised machine learning algorithms were trained to predict in-hospital all-cause mortality using data from 500 consecutive heart failure patients with a left ventricular ejection fraction (LVEF) less than 50%.
Results: The mean age was 55.2 ± 16.8 years. There were 271 (54.2%) males, and the mean LVEF was 29 ± 9.2%. The median duration of hospitalisation was 7 days (interquartile range: 4–11), and it did not differ between patients discharged alive and those who died. After a prediction window of 4 years (interquartile range: 2–6), 84 (16.8%) patients died before discharge from the hospital. The area under the receiver operating characteristic curve was 0.82, 0.78, 0.77, 0.76, 0.75, and 0.62 for random forest, logistic regression, support vector machines (SVM), extreme gradient boosting, multilayer perceptron (MLP), and decision trees, and the accuracy during the test phase was 88, 87, 86, 82, 78, and 76% for random forest, MLP, SVM, extreme gradient boosting, decision trees, and logistic regression. The support vector machines were the best performing algorithm, and furosemide, beta-blockers, spironolactone, early diastolic murmur, and a parasternal heave had a positive coefficient with the target feature, whereas coronary artery disease, potassium, oedema grade, ischaemic cardiomyopathy, and right bundle branch block on electrocardiogram had negative coefficients.
Conclusion: Despite a small sample size, supervised machine learning algorithms successfully predicted all-cause mortality with modest accuracy. The SVM model will be externally validated using data from multiple cardiology centres in South Africa before developing a uniquely African risk prediction tool that can potentially transform heart failure management through precision medicine.
39

Michalski, B. W., S. Skonieczka, M. Strzelecki, M. Simiera, E. Szymczyk, P. Wejner-Mik, P. Lipiec, K. Wierzbowska-Drabik, and J. D. Kasprzak. "P4383 Machine learning approach for prediction of postinfarction myocardial recovery using echocardiographic myocardial texture." European Heart Journal 40, Supplement_1 (October 1, 2019). http://dx.doi.org/10.1093/eurheartj/ehz745.0788.

Abstract:
Background: Recovery of left ventricular (LV) function has significant prognostic value after myocardial infarction (MI) but is challenging to predict. We applied machine learning (ML) algorithms to analyze echocardiographic myocardial texture for predicting long-term recovery of the left ventricular myocardium.
Methods: We used native and contrast-enhanced (SonoVue imaged with Contrast Perfusion Sequence, CPS) myocardial images acquired from the apical window 7 days after reperfused ST-elevation MI in 61 patients (19 females, age 59.7±11.9) with first ST-elevation MI treated with successful PCI. A custom software package (MaZda 4.6) was used for texture analysis. For each image, 299 texture features were calculated for defined regions of interest (9 from the histogram, 6 from the gradient matrix, 20 from the run length matrix, 220 from the co-occurrence matrix, and 44 from the wavelet transform). Up to 10 of the most reproducible parameters were selected, first on the basis of low variation and then by the Fisher criterion, minimizing classification error together with the average correlation coefficient. The ML methods used to analyze textures included a multilayer perceptron neural network (MLP), support vector machines (SVM), the Adaptive Boosting algorithm (AdaBoost), and a library support vector machine. We defined recovery as improvement of the LV wall motion score index (WMSI), absence of remodeling (remodeling defined as a >8% increase in LV end-diastolic volume, LVEDV), and >5% improvement of LV ejection fraction after 1 year.
Results: The effectiveness of the tested methods was similar for predicting regional and local LV function evolution after one year. Results for native grayscale and for the red component of CPS myocardial perfusion images were comparable. The percent accuracy of prediction is shown in the table below, with the best result for WMSI.

Accuracy of 1-yr predictions of LV function change (native grey / contrast red / contrast grey):
                                  ΔWMSI             ΔEF               ΔLVEDV
Adaptive Boosting                 78% / 79% / 78%   64% / 64% / 59%   61% / 60% / 60%
Neural network                    77% / 79% / 78%   63% / 63% / 58%   62% / 59% / 60%
Library support vector machine    77% / 79% / 78%   63% / 65% / 64%   61% / 60% / 58%
Support vector machine            77% / 79% / 78%   64% / 65% / 59%   61% / 60% / 58%

Conclusions: Echocardiographic myocardial texture can be analyzed using machine learning approaches to predict global or regional recovery of myocardial function. Accuracy for predicting regional WMSI improvement was superior to that for prognosing LVEDV or EF change. The performance of the different tools did not differ between native and contrast-enhanced images.
40

Francos, Nicolas, Gila Notesco, and Eyal Ben-Dor. "Estimation of the Relative Abundance of Quartz to Clay Minerals Using the Visible–Near-Infrared–Shortwave-Infrared Spectral Region." Applied Spectroscopy, March 9, 2021, 000370282199830. http://dx.doi.org/10.1177/0003702821998302.

Abstract:
Quartz is the most abundant mineral on the earth’s surface. It is spectrally active in the longwave infrared (LWIR) region with no significant spectral features in the optical domain, i.e., the visible–near-infrared–shortwave-infrared (Vis–NIR–SWIR) region. Several space agencies are planning to mount optical image spectrometers in space, with one of their missions being to map raw materials. However, these sensors are active across the optical region, making the spectral identification of the quartz mineral problematic. This study demonstrates that indirect relationships between the optical and LWIR regions (where quartz is spectrally dominant) can be used to assess quartz content spectrally using solely the optical region. To achieve this, we made use of the legacy Israeli soil spectral library, which characterizes arid and semiarid soils through comprehensive chemical and mineral analyses along with spectral measurements across the Vis–NIR–SWIR region (reflectance) and LWIR region (emissivity). Recently, a Soil Quartz Clay Mineral Index (SQCMI) was developed using mineral-related emissivity features to determine the content of quartz, relative to clay minerals, in the soil. The SQCMI was highly and significantly correlated with the Vis–NIR–SWIR spectral region (R2 = 0.82, root mean square error (RMSE) = 0.01, ratio of performance to deviation (RPD) = 2.34), whereas direct estimation of the quartz content using a gradient-boosting algorithm against the Vis–NIR–SWIR region provided poor results (R2 = 0.45, RMSE = 15.63, RPD = 1.32). Moreover, estimation of the SQCMI value was even more accurate when only the 2000–2450 nm spectral range (atmospheric window) was used (R2 = 0.9, RMSE = 0.005, RPD = 1.95). These results suggest that reflectance data across the 2000–2450 nm spectral region can be used to estimate quartz content, relative to clay minerals in the soil, satisfactorily by hyperspectral remote sensing means.
41

Liao, Lauren D., Assiamira Ferrara, Mara B. Greenberg, Amanda L. Ngo, Juanran Feng, Zhenhua Zhang, Patrick T. Bradshaw, Alan E. Hubbard, and Yeyi Zhu. "Development and validation of prediction models for gestational diabetes treatment modality using supervised machine learning: a population-based cohort study." BMC Medicine 20, no. 1 (September 15, 2022). http://dx.doi.org/10.1186/s12916-022-02499-7.

Abstract:
Background: Gestational diabetes (GDM) is prevalent and benefits from timely and effective treatment, given the short window to impact glycemic control. Clinicians face major barriers to choosing effectively among treatment modalities [medical nutrition therapy (MNT) with or without pharmacologic treatment (antidiabetic oral agents and/or insulin)]. We investigated whether clinical data at varied stages of pregnancy can predict GDM treatment modality.
Methods: Among a population-based cohort of 30,474 pregnancies with GDM delivered at Kaiser Permanente Northern California in 2007–2017, we selected those in 2007–2016 as the discovery set and 2017 as the temporal/future validation set. Potential predictors were extracted from electronic health records at different timepoints (levels 1–4): (1) 1-year preconception to the last menstrual period, (2) the last menstrual period to GDM diagnosis, (3) at GDM diagnosis, and (4) 1 week after GDM diagnosis. We compared transparent and ensemble machine learning prediction methods, including least absolute shrinkage and selection operator (LASSO) regression and super learner, containing classification and regression tree, LASSO regression, random forest, and extreme gradient boosting algorithms, to predict risks for pharmacologic treatment beyond MNT.
Results: The super learner using levels 1–4 predictors had higher predictability [tenfold cross-validated C-statistic in discovery/validation set: 0.934 (95% CI: 0.931–0.936)/0.815 (0.800–0.829)], compared to levels 1, 1–2, and 1–3 (discovery/validation set C-statistic: 0.683–0.869/0.634–0.754). A simpler, more interpretable model, including timing of GDM diagnosis, diagnostic fasting glucose value, and the status and frequency of glycemic control at fasting during one-week post diagnosis, was developed using tenfold cross-validated logistic regression based on super learner-selected predictors. This model compared to the super learner had only a modest reduction in predictability [discovery/validation set C-statistic: 0.825 (0.820–0.830)/0.798 (95% CI: 0.783–0.813)].
Conclusions: Clinical data demonstrated reasonably high predictability for GDM treatment modality at the time of GDM diagnosis and high predictability at 1-week post GDM diagnosis. These population-based, clinically oriented models may support algorithm-based risk-stratification for treatment modality, inform timely treatment, and catalyze more effective management of GDM.
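The super learner is closely related to stacked generalization, so a rough scikit-learn analogue can be built with StackingClassifier over a CART tree, an L1-penalized logistic regression (standing in for LASSO), a random forest, and gradient boosting; everything below is an illustrative assumption, not the study's implementation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=30, weights=[0.7], random_state=0)

stack = StackingClassifier(
    estimators=[
        ("cart", DecisionTreeClassifier(max_depth=5)),
        ("lasso_lr", LogisticRegressionCV(penalty="l1", solver="saga", max_iter=5000)),
        ("rf", RandomForestClassifier(n_estimators=200)),
        ("gbm", GradientBoostingClassifier()),
    ],
    final_estimator=LogisticRegressionCV(max_iter=5000),   # logistic meta-learner
    cv=5, stack_method="predict_proba",
)
print(cross_val_score(stack, X, y, cv=5, scoring="roc_auc").mean())
```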
42

Abedi, Vida, Venkatesh Avula, Durgesh Chaudhary, Shima Shahjouei, Ayesha Khan, Christoph J. Griessenauer, Jiang Li, and Ramin Zand. "Abstract P261: Machine Learning-Enabled Prediction of Long-Term Stroke Recurrence Using Data From Electronic Health Records." Stroke 52, Suppl_1 (March 2021). http://dx.doi.org/10.1161/str.52.suppl_1.p261.

Abstract:
Objective: The long-term risk of recurrent ischemic stroke, estimated to be between 17% and 30%, cannot be reliably assessed. Our goal was to study whether machine learning can be trained to predict stroke recurrence, to identify key clinical variables, and to assess whether performance metrics can be optimized. Methods: We used patient-level data from electronic health records, 6 algorithms (Logistic Regression, Extreme Gradient Boosting, Gradient Boosting Machine, Random Forest, Support Vector Machine, Decision Tree), 4 feature selection strategies, 5 prediction windows, and 2 sampling strategies to develop 288 models for up to 5-year stroke recurrence prediction. We further identified important clinical features and different optimization strategies. Results: We included 2,091 ischemic stroke patients in this study. Model AUROC was stable for prediction windows of 1, 2, 3, 4, and 5 years, with the highest score for the 1-year (0.79) and the lowest score for the 5-year prediction window (0.69). A total of 21 (7%) models reached an AUROC above 0.73, while 110 (38%) models reached an AUROC greater than 0.7. Among the 53 features analyzed, age, body mass index, and laboratory-based features (such as high-density lipoprotein, hemoglobin A1C, and creatinine) had the highest overall importance scores. The balance between specificity and sensitivity improved through sampling strategies. Conclusion: All six selected modeling algorithms could be trained to predict long-term stroke recurrence, and laboratory-based variables were highly associated with stroke recurrence. The latter could be targeted for personalized interventions. Model performance metrics could be optimized, and models can be implemented in the same healthcare system as intelligent decision support to improve outcomes.
43

Liu, Jun, Lingxiao Xu, Enzhao Zhu, Chunxia Han, and Zisheng Ai. "Prediction of acute kidney injury in patients with femoral neck fracture utilizing machine learning." Frontiers in Surgery 9 (July 26, 2022). http://dx.doi.org/10.3389/fsurg.2022.928750.

Abstract:
Background: Acute kidney injury (AKI) is a common complication associated with significant morbidity and mortality in high-energy trauma patients. Given the poor efficacy of interventions after AKI development, it is important to predict AKI before its diagnosis. Therefore, this study aimed to develop models using machine learning algorithms to predict the risk of AKI in patients with femoral neck fractures.
Methods: We developed machine-learning models using the Medical Information Mart for Intensive Care (MIMIC)-IV database. AKI was predicted using 10 predictive models in three time windows: 24, 48, and 72 h. Three optimal models were selected according to the accuracy and area under the receiver operating characteristic curve (AUROC), and the hyperparameters were adjusted using a random search algorithm. The Shapley additive explanation (SHAP) analysis was used to determine the impact and importance of each feature on the prediction. Compact models were developed using important features chosen based on their SHAP values and clinical availability. Finally, we evaluated the models using metrics such as accuracy, precision, AUROC, recall, F1 scores, and kappa values on the test set after hyperparameter tuning.
Results: A total of 1,596 patients in MIMIC-IV were included in the final cohort, and 402 (25%) patients developed AKI after surgery. The light gradient boosting machine (LightGBM) model showed the best overall performance for predicting AKI 24, 48, and 72 h in advance, with AUROCs of 0.929, 0.862, and 0.904. SHAP values were used to interpret the prediction models. Renal function markers and perioperative blood transfusions are the most critical features for predicting AKI. Among the compact models, LightGBM still performed best, with AUROCs of 0.930, 0.859, and 0.901.
Conclusions: In our analysis, we discovered that LightGBM had the best metrics among all algorithms used. Our study identified LightGBM as a solid first-choice algorithm for early AKI prediction in patients after femoral neck fracture surgery.
44

Singh, Pradeep, Aditya Nagori, Rakesh Lodha, and Tavpritesh Sethi. "Early prediction of hypothermia in pediatric intensive care units using machine learning." Frontiers in Physiology 13 (September 2, 2022). http://dx.doi.org/10.3389/fphys.2022.921884.

Abstract:
Hypothermia is a life-threatening condition in which body temperature drops below 35°C, and it is a key source of concern in intensive care units (ICUs). Early identification can help nudge clinical management toward early interventions. Despite its importance, very few studies have focused on the early prediction of hypothermia. In this study, we aim to monitor and predict hypothermia 30 min to 4 h ahead of its onset using machine learning (ML) models developed on physiological vitals, and to prospectively validate the best-performing model in the pediatric ICU. We developed and evaluated ML algorithms for the early prediction of hypothermia in a pediatric ICU. The Sepsis Advanced Forecasting Engine ICU Database (SafeICU) is an in-house ICU data resource built in the pediatric ICU at the All India Institute of Medical Sciences (AIIMS), New Delhi. Each time stamp at 1-min resolution was labeled for the presence of hypothermia to construct a retrospective cohort of pediatric patients in the SafeICU data resource. The training set consisted of windows of 4.2 h in length with a lead time of 30 min to 4 h from the onset of hypothermia. A set of 3,835 hand-engineered time-series features was calculated to capture physiological information from the time series. Feature selection using the Boruta algorithm was performed to select the most important predictors of hypothermia. A battery of models, including gradient boosting machine, random forest, AdaBoost, and support vector machine (SVM), was evaluated utilizing five-fold test sets. The best-performing model was prospectively validated. A total of 148 patients with 193 ICU stays were eligible for the model development cohort. Of 3,939 features, 726 were statistically significant in the Boruta analysis for the prediction of hypothermia. The gradient boosting model performed best, with an Area Under the Receiver Operating Characteristic curve (AUROC) of 85% (SD = 1.6) and a precision of 59.2% (SD = 8.8) for a 30-min lead time before the onset of hypothermia. As expected, model performance declined at longer lead times, for example an AUROC of 77.2% (SD = 2.3) and a precision of 41.34% (SD = 4.8) at 4 h ahead of hypothermia onset. In prospective validation, our GBM (gradient boosting machine) model produced comparable or superior results: an AUROC of 79.8% and a precision of 53% for a 30-min lead time, and an AUROC of 69.6% and a precision of 38.52% for the 30 min to 4 h lead-time window. This work therefore establishes a pipeline, termed ThermoGnose, for predicting hypothermia, a major complication in pediatric ICUs.
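Boruta-style feature selection, as used above, is available in the Python boruta package; this sketch wraps BorutaPy around a random forest on synthetic data, with the feature counts and estimator settings chosen arbitrarily.

```python
import numpy as np
from boruta import BorutaPy
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Stand-in for the thousands of hand-engineered time-series features.
X, y = make_classification(n_samples=1000, n_features=50, n_informative=8, random_state=0)

rf = RandomForestClassifier(n_jobs=-1, max_depth=5)
selector = BorutaPy(rf, n_estimators="auto", random_state=0)
selector.fit(X, y)                            # BorutaPy expects numpy arrays
print("confirmed features:", np.where(selector.support_)[0])
```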
45

Cui, Ruixia, Wenbo Hua, Kai Qu, Heran Yang, Yingmu Tong, Qinglin Li, Hai Wang, et al. "An Interpretable Early Dynamic Sequential Predictor for Sepsis-Induced Coagulopathy Progression in the Real-World Using Machine Learning." Frontiers in Medicine 8 (December 3, 2021). http://dx.doi.org/10.3389/fmed.2021.775047.

Abstract:
Sepsis-associated coagulation dysfunction greatly increases the mortality of sepsis, and irregular clinical time-series data remain a major challenge for AI medical applications. To detect and manage sepsis-induced coagulopathy (SIC) and sepsis-associated disseminated intravascular coagulation (DIC) early, we developed an interpretable real-time sequential warning model for real-world irregular data. Eight machine learning models, including novel algorithms, were devised to detect SIC and sepsis-associated DIC 8n hours (1 ≤ n ≤ 6) prior to onset. Models were developed on data from Xi'an Jiaotong University Medical College (XJTUMC) and verified on data from Beth Israel Deaconess Medical Center (BIDMC). A total of 12,154 SIC and 7,878 International Society on Thrombosis and Haemostasis (ISTH) overt-DIC labels were annotated in the training set according to the SIC and ISTH overt-DIC scoring systems. The area under the receiver operating characteristic curve (AUROC) was used as the model evaluation metric. The eXtreme Gradient Boosting (XGBoost) model can predict SIC and sepsis-associated DIC events up to 48 h earlier with AUROCs of 0.929 and 0.910, respectively, rising to 0.973 and 0.955 at 8 h earlier, the highest performance to date. The novel ODE-RNN model achieved continuous prediction at arbitrary time points, with AUROCs of 0.962 and 0.936 for SIC and DIC predicted 8 h earlier, respectively. In conclusion, our model can predict sepsis-associated SIC and DIC onset up to 48 h in advance, which helps maximize the time window for early management by physicians.
46

Krachman, Joshua A., Jessica A. Patricoski, Christopher T. Le, Jina Park, Ruijing Zhang, Kirby D. Gong, Indranuj Gangan, et al. "Predicting Flow Rate Escalation for Pediatric Patients on High Flow Nasal Cannula Using Machine Learning." Frontiers in Pediatrics 9 (November 8, 2021). http://dx.doi.org/10.3389/fped.2021.734753.

Abstract:
Background: High flow nasal cannula (HFNC) is commonly used as non-invasive respiratory support in critically ill children. There are limited data to inform consensus on optimal device parameters, determinants of successful patient response, and indications for escalation of support. Clinical scores, such as the respiratory rate-oxygenation (ROX) index, have been described as a means to predict HFNC non-response, but are limited to evaluating for escalations to invasive mechanical ventilation (MV). In the presence of apparent HFNC non-response, a clinician may choose to increase the HFNC flow rate to hypothetically prevent further respiratory deterioration, transition to an alternative non-invasive interface, or intubation for MV. To date, no models have been assessed to predict subsequent escalations of HFNC flow rates after HFNC initiation.Objective: To evaluate the abilities of tree-based machine learning algorithms to predict HFNC flow rate escalations.Methods: We performed a retrospective, cohort study assessing children admitted for acute respiratory failure under 24 months of age placed on HFNC in the Johns Hopkins Children's Center pediatric intensive care unit from January 2019 through January 2020. We excluded encounters with gaps in recorded clinical data, encounters in which MV treatment occurred prior to HFNC, and cases electively intubated in the operating room. The primary study outcome was discriminatory capacity of generated machine learning algorithms to predict HFNC flow rate escalations as compared to each other and ROX indices using area under the receiver operating characteristic (AUROC) analyses. In an exploratory fashion, model feature importance rankings were assessed by comparing Shapley values.Results: Our gradient boosting model with a time window of 8 h and lead time of 1 h before HFNC flow rate escalation achieved an AUROC with a 95% confidence interval of 0.810 ± 0.003. In comparison, the ROX index achieved an AUROC of 0.525 ± 0.000.Conclusion: In this single-center, retrospective cohort study assessing children under 24 months of age receiving HFNC for acute respiratory failure, tree-based machine learning models outperformed the ROX index in predicting subsequent flow rate escalations. Further validation studies are needed to ensure generalizability for bedside application.
47

Ibeas, Jose, Oscar Galles, Nuria Monill, Edwar Macias, Antoni Morell, Javier Serrano, Dolores Rexachs, Jose Vicario, Jordi Cokas, and Elisenda Martinez. "MO463: Machine Learning-Based Prediction of Mortality and Risk Factors in Patients With Chronic Kidney Disease Developed With Data From 10000 Patients Over 11 Years." Nephrology Dialysis Transplantation 37, Supplement_3 (May 2022). http://dx.doi.org/10.1093/ndt/gfac070.077.

Abstract:
BACKGROUND AND AIMS: Around the globe, over 850 million patients suffer from chronic kidney disease (CKD). CKD is associated with high mortality rates, in particular for patients undergoing renal replacement therapies (RRT) such as dialysis, reaching up to 10% a year; these patients are therefore considered fragile. CKD is also associated with cardiovascular complications that can cause mutual aggravation. Available clinical guidelines identify certain risk factors and predictive models, but these have not been tested and validated successfully for renal patients; consequently, there is a need for the identification of predictive factors and the prediction of mortality. This is caused by the limitations of current methodologies and statistics: current models simplify complex relationships by assuming a linear relation between risk factors and certain events, so a new approach is needed. Over the last years, the rise of artificial intelligence and machine learning has for the first time presented an alternative. This project aimed to study the performance of different ML algorithms for the prediction of mortality and the identification of risk factors for CKD patients.
METHOD: Design: retrospective analysis of a historical cohort from the Register of Renal Patients of Catalonia (RMRC) and the Catalan Agency for Health Quality and Evaluation, comprising 10,473 patients with CKD stages from the first stage to RRT, with a follow-up of 11 years, from January 2010 to December 2020. Inclusion criterion: age >18 years. An Extreme Gradient Boosting model was trained and compared with other algorithms for the prediction of mortality at different times, using different follow-up periods for each patient. Variables: (i) age, gender, body mass index, time to death (9); (ii) diagnoses (ICD-9/10) (26); (iii) laboratory variables (37); (iv) all pharmacological treatments (46). For all executions, data were balanced using the SMOTETomek technique.
RESULTS: The patient sample had a mean age of 68.2 ± 12.9 years; 65.8% were female and 34.2% male. Different follow-up and time windows were tested, and the best results were obtained when using a 2-year follow-up period and a 4-year mortality prediction. The Area Under the Curve values obtained for each model were: XGBClassifier (0.89), LGBMClassifier (0.90), CatBoostClassifier (0.91). The 10 most relevant variables according to the XGBClassifier (54.65% of the total weight of the 71 variables), in order, are cardiopathy, advanced chronic kidney disease, vasculopathy, age, neoplasia, transplant, digestive pathology, estimated glomerular filtration rate, and high blood pressure. The results presented in the Figure and Table correspond to the mean obtained over the 5 folds of the cross-validation.
CONCLUSION: Machine learning techniques offer an alternative to classical statistical methods, with a high predictive capacity for mortality. The possibility of generating algorithms with real-world data allows the individualization of the mortality risk as well as of the predictive factors.
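The SMOTETomek balancing step named in the Methods is available in imbalanced-learn; a minimal sketch on synthetic imbalanced data follows, with class weights and sample sizes chosen arbitrarily.

```python
from collections import Counter
from imblearn.combine import SMOTETomek
from sklearn.datasets import make_classification

# Imbalanced stand-in for the mortality labels.
X, y = make_classification(n_samples=5000, weights=[0.85], random_state=0)
print("before:", Counter(y))

X_res, y_res = SMOTETomek(random_state=0).fit_resample(X, y)
print("after:", Counter(y_res))
```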
48

Dam, Tariq A., Luca F. Roggeveen, Fuda van Diggelen, Lucas M. Fleuren, Ameet R. Jagesar, Martijn Otten, Heder J. de Vries, et al. "Predicting responders to prone positioning in mechanically ventilated patients with COVID-19 using machine learning." Annals of Intensive Care 12, no. 1 (October 20, 2022). http://dx.doi.org/10.1186/s13613-022-01070-0.

Abstract:
Background: For mechanically ventilated critically ill COVID-19 patients, prone positioning has quickly become an important treatment strategy; however, prone positioning is labor intensive and comes with potential adverse effects. Therefore, identifying which critically ill intubated COVID-19 patients will benefit may help allocate labor resources.
Methods: From the multi-center Dutch Data Warehouse of COVID-19 ICU patients from 25 hospitals, we selected all 3619 episodes of prone positioning in 1142 invasively mechanically ventilated patients. We excluded episodes longer than 24 h. Berlin ARDS criteria were not formally documented. We used the supervised machine learning algorithms Logistic Regression, Random Forest, Naive Bayes, K-Nearest Neighbors, Support Vector Machine and Extreme Gradient Boosting on readily available and clinically relevant features to predict success of prone positioning after 4 h (window of 1 to 7 h) based on various possible outcomes. These outcomes were defined as improvements of at least 10% in PaO2/FiO2 ratio, ventilatory ratio, respiratory system compliance, or mechanical power. Separate models were created for each of these outcomes. Re-supination within 4 h after pronation was labeled as failure. We also developed models using a 20 mmHg improvement cut-off for PaO2/FiO2 ratio and using a combined outcome parameter. For all models, we evaluated feature importance expressed as contribution to predictive performance based on their relative ranking.
Results: The median duration of prone episodes was 17 h (11–20, median and IQR, N = 2632). Despite extensive modeling using a plethora of machine learning techniques and a large number of potentially clinically relevant features, discrimination between responders and non-responders remained poor, with an area under the receiver operating characteristic curve of 0.62 for PaO2/FiO2 ratio using Logistic Regression, Random Forest and XGBoost. Feature importance was inconsistent between models for different outcomes. Notably, not even being a previous responder to prone positioning, or PEEP levels before prone positioning, provided any meaningful contribution to predicting a successful next proning episode.
Conclusions: In mechanically ventilated COVID-19 patients, predicting the success of prone positioning using clinically relevant and readily available parameters from electronic health records is currently not feasible. Given the current evidence base, a liberal approach to proning in all patients with severe COVID-19 ARDS is therefore justified, in particular regardless of previous results of proning.
49

Rogers, Ian Keith. "Without a True North: Tactical Approaches to Self-Published Fiction." M/C Journal 20, no. 6 (December 31, 2017). http://dx.doi.org/10.5204/mcj.1320.

Abstract:
Introduction

Over three days in November 2017, 400 people gathered for a conference at the Sam’s Town Hotel and Gambling Hall in Las Vegas, Nevada. The majority of attendees were fiction authors but the conference program looked like no ordinary writer’s festival; there were no in-conversation interviews with celebrity authors, no panels on the politics of the book industry and no books launched or promoted. Instead, this was a gathering called 20Books2017, a self-publishing conference about the business of fiction ebooks and there was expertise in the room.

Among those attending, 50 reportedly earned over $100,000 US per annum, with four said to be earning in excess of $1,000,000 US a year. Yet none of these authors are household names. Their work is not adapted to film or television. Their books cannot be found on the shelves of brick-and-mortar bookstores. For the most part, these authors go unrepresented by the publishing industry and literary agencies, and further to which, only a fraction have ever actively pursued traditional publishing. Instead, they write for and sell into a commercial fiction market dominated by a single retailer and publisher: online retailer Amazon.

While the online ebook market can be dynamic and lucrative, it can also be chaotic. Unlike the traditional publishing industry—an industry almost stoically adherent to various gatekeeping processes: an influential agent-class, formalized education pathways, geographic demarcations of curatorial power (see Thompson)—the nascent ebook market is unmapped and still somewhat ungoverned. As will be discussed below, even the markets directly engineered by Amazon are subject to rapid change and upheaval. It can be a space with shifting boundaries and thus, for many in the traditional industry, both Amazon and self-publishing come to represent a type of encroaching northern dread.

In the eyes of the traditional industry, digital self-publishing certainly conforms to the barbarous north of European literary metaphor: Orwell’s ‘real ugliness of industrialism’ (94) governed by the abject lawlessness of David Peace’s Yorkshire noir (Fowler). But for adherents within the day-to-day of self-publishing, this unruly space also provides the frontiers and gold-rushes of American West mythology.

What remains uncertain is the future of both the traditional and the self-publishing sectors and the degree to which they will eventually merge, overlap and/or co-exist. So-called ‘hybrid-authors’ (those self-publishing and involved in traditional publication) are becoming increasingly common—especially in genre fiction—but the disruption brought about by self-publishing and ebooks appears far from complete.

To the contrary, the Amazon-led ebook iteration of this market is relatively new. While self-publishing and independent publishing have long histories as modes of production, Amazon launched both its Kindle e-reader device and its marketplace Kindle Direct Publishing (KDP) a little over a decade ago. In the years subsequent, the integration of KDP within the Amazon retail environment dramatically altered the digital self-publishing landscape, effectively paving the way for competing platforms (Kobo, Nook, iBooks, GooglePlay) and today’s vibrant—and, at times, crassly commercial—self-published fiction communities.

As a result, the self-publishing market has experienced rapid growth: self-publishers now collectively hold the largest share of fiction sales within Amazon’s ebook categories, as much as 35% of the total market (Howey).
Contrary to popular belief, they do not reside entirely at the bottom of Amazon’s expansive catalogue either: at the time of writing, 11 of Amazon’s Top 50 Bestsellers were self-published and the median estimated monthly revenue generated by these ‘indie’ books was $43,000 USD / month (per author) on the American site alone (KindleSpy).

This international publishing market now proffers authors running the gamut of commercial uptake, from millionaire successes like romance writer H.M. Ward and thriller author Mark Dawson, through to the 19% of self-published authors who listed their annual royalty income as $0 per annum (Weinberg). Their overall market share remains small—as little as 1.8% of trade publishing in the US as a whole (McIlroy 4)—but the high end of this lucrative slice is particularly dynamic: science fiction author Michael Anderle (and 20Books2017 keynote) is on track to become a seven-figure author in his second year of publishing (based on Amazon sales ranking data), thriller author Mark Dawson has sold over 300,000 copies of his self-published Milton series in 3 years (McGregor), and a slew of similar authors have recently attained New York Times and USA Today bestseller status.

To date, there is not a broad range of scholarship investigating the operational logics of self-published fiction. Timothy Laquintano’s recent Mass Authorship and the Rise of Self-Publishing (2016) is a notable exception, drawing self-publishing into historical debates surrounding intellectual property, the future of the book and digital abundance. The more empirical portions of Mass Authorship—taken from activity between 2011 and 2015—directly inform this research, and his chapter on Amazon (Chapter 4) could be read as a more macro companion to my findings below; taken together and compared, they illustrate just how fast-moving the market is. Nick Levey’s work on ‘post-press’ literature and its inherent risks (and discourses of cultural capital) also informs my thesis here.

In addition to which, there is scholarship centred on publishing more generally that also touches on self-published writers as a category of practitioner (see Baverstock and Steinitz, Haughland, Thomlinson and Bélanger). Most of this latter work focuses almost entirely on the finished product, usually situating self-publishing as directly oppositional to traditional publishing, and thus subordinating it.

In this paper, I hope to outline how the self-publishers I’ve observed have enacted various tactical approaches that specifically strive to tame their chaotic marketplace, and to indicate—through one case study (Amazon exclusivity)—a site of production and resistance where they have occasionally succeeded. Their approach is one that values information sharing and an open-source approach to book-selling and writing craft, ideologies drawn more from the tech / start-up world than the commercial book industry described by Thompson (10). It is a space deeply informed by the virtual nature of its major platforms and, as such, I argue its relation to the world of traditional publishing—and its representation within the traditional book industry—are tenuous, despite the central role of authorship and books.

Making the Virtual Self-Publishing Scene

Within the study of popular music, the use of Barry Shank and Will Straw’s ‘scene’ concept has been an essential tool for uncovering and mapping independent/DIY creative practice.
The term scene, defined by Straw as cultural space, is primarily interested in how cultural phenomena articulate or announce themselves. A step beyond community, scene theorists are less concerned with examining an evolving history of practice (deemed essentialist) than they are concerned with focusing on the “making and remaking of alliances” as the crucial process whereby communal culture is formed, expressed and distributed (370).

A scene’s spatial dimension—often categorized as local, translocal or virtual (see Bennett and Peterson)—demands attention be paid to hybridization, as a diversity of actors approach the same terrain from differing vantage points, with distinct motivations. As a research tool, scene can map action as the material existence of ideology. Thus, its particular usefulness is its ability to draw findings from diverse communities of practice.

Drawing methodologies and approaches from Bourdieu’s field theory—a particularly resonant lens for examining cultural work—and de Certeau’s philosophies of space and circumstantial moves (“failed and successful attempts at redirection within a given terrain,” 375), scene focuses on articulation, the process whereby individual and communal activity becomes an observable or relatable or recordable phenomenon.

Within my previous work (see Bennett and Rogers, Rogers), I’ve used scene to map a variety of independent music-making practices and can see clear resemblances between independent music-making and the growing assemblage of writers within ebook self-publishing. The democratizing impulses espoused by self-publishers (the removal of gatekeepers as married to visions of a fiction/labour meritocracy) marry up quite neatly with the heady mix of separatism and entrepreneurialism inherent in Australian underground music.

Self-publishers are typically older and typically more upfront about profit, but the communal interaction—the trade and gifting of support, resources and information—looks decidedly similar. Instead, the self-publishers appear different in one key regard: their scene-making is virtual in ways that far outstrip empirical examples drawn from popular music. 20Books2017 is only one of two conferences for this community thus far and represents one of the few occasions in which the community has met in any sort of organized way offline. For the most part, and in the day-to-day, self-publishing is a virtual scene.

At present, the virtual space of self-published fiction is centralized around two digital platforms. Firstly, there is the online message board, of which two specific online destinations are key: the first is Kboards, a PHP-coded forum “devoted to all things Kindle” (Kboards) that includes a huge author sub-board of self-published writers. The archive of this board amounts to almost two million posts spanning back to 2009. The second message board site is a collection of Facebook groups, of which the 10,000-strong membership of 20BooksTo50K is the most dominant; it is the originating home of 20Books2017.

The other platform constituting the virtual scene of self-publishing is that of podcasting. While there are a number of high-profile static websites and blogs related to self-publishing (and an emerging community of vloggers), these pale in breadth and interaction when compared to podcasts such as The Creative Penn, The Self-Publishing Podcast, The Sell More Books Show, Rocking Self-Publishing (now defunct but archived) and The Self-Publishing Formula podcast.
Statistical information on the distribution of these podcasts is unavailable, but the circulation and online discussion of their content, and the interrelation between the different shows and their hosts and guests, all point to their currency within the scene.

In short, if one is to learn about the business and craft production modes of self-publishing, one tends to discover and interact with one of these two platforms. The consensus best practice espoused on these boards and podcasts is the data set from which the remainder of this paper draws its findings. I have spent the last two years embedded in these communities, but for the purposes of this paper I will be drawing data exclusively from the public-facing Kboards, namely because it is the oldest, most established site, but also because all of the issues and discussion presented within this data have been cross-referenced across the different podcasts and boards. In fact, for a long period Kboards was so central to the scene that it was itself often the topic of conversation elsewhere.

Sticking in the Algorithm: The Best Practice of Fiction Self-Publishing

Self-publishing is a virtual scene because its “constellation of divergent interests and forces” (Shank, Preface, x) occurs almost entirely online. This is not just a case of discussion, collaboration and discovery occurring online—as with the virtual layer of local and translocal music scenes—rather, the self-publishing community produces into the online space, almost exclusively. Its venues and distribution pathways are online, and while its production mechanisms (writing) are still physical, there is an almost instantaneous and continuous interface with the online. These writers type and, increasingly, dictate their work into the virtual cloud, have it edited there (via in-text annotation), and from there the work is often designed, formatted, published, sold, marketed, reviewed and discussed online.

In addition, a significant portion of these writers produce collaborative works, co-writing novels and co-editing them via cooperative apps. Teams of beta-readers (often fans) work on manuscripts pre-launch. Covers, blurbs, log lines, ad copy and novel openings are tested and reconfigured via crowd-sourced opinion. Seen here, the writing of the self-publishing scene is often explicitly commercial. But more to the point, it never denies its direct correlation with the mandates of online publishing. It is not traditional writing (it moves beyond authorship), and viewing these writers as emerging or unpublished—or indeed through the existing vernacular of literary writing practices—often fails to capture what it is they do.

As the self-publishers write for the online space, Amazon forms a huge part of their thinking and working. The site sits at the heart of the practices under consideration here. Many of the authors drawn into this research are ‘wide’ in their online retail distribution, meaning they have books placed with Amazon’s online retail competitors. Yet the decision to go ‘wide’ or stay exclusive to Amazon—and the volume of discussion around this choice—is illustrative of how dominant the company remains in the scene. In fact, the example of Amazon exclusivity provides a valuable case study.

For self-publishers, Amazon exclusivity brings two stated and tangible benefits. The first relates to revenue diversification within Amazon, with exclusivity delivering an additional revenue stream in the form of Kindle Unlimited royalties. Kindle Unlimited (KU) is a subscription service for ebooks.
Consumers pay a flat monthly fee ($13.99 AUD) for unlimited access to over a million Kindle titles. For a 300-page book, a full read-through of a novel under KU pays roughly the same royalty to authors as the sale of a $2.99 ebook, but only to Amazon-exclusive authors. If an exclusive book is particularly well suited to the KU audience, this can present authors with a very serious return. (A rough sketch of this royalty arithmetic follows the forum exchange quoted below.)

The second benefit of Amazon exclusivity is access to internal site merchandising: namely, ‘Free Days’, where the book is given away (and can chart on the various ‘Top 100 Free’ leaderboards), and ‘Countdown Deals’, where a decreasing discount is staggered across a period (thus creating a type of scarcity).

These two perks can prove particularly lucrative to individual authors. On Kboards, user Annie Jocoby (also writing as Rachel Sinclair) details her experiences with exclusivity:

I have a legal thriller series that is all-in with KU [Kindle Unlimited], and I can honestly say that KU has been fantastic for visibility for that particular series. I put the books into KU in the first part of August, and I watched my rankings rise like crazy after I did that. They've stuck, too. If I weren't in KU, I doubt that they would still be sticking as well as they have. (anniejocoby)

This is fairly typical of the positive responses to exclusivity, yet it incorporates a number of the more opaque benefits entangled with going exclusive to Amazon.

First, there is ‘visibility’. In self-publishing terms, ‘visibility’ refers almost exclusively to chart positions within Amazon. The myriad of charts—and how they function—is beyond the scope of this paper, but they absolutely indicate, and often dictate, the discoverability of a book online. These charts are the ‘front windows’ of Amazon, to use an analogy to brick-and-mortar bookstores. Books that chart well are actively being bought by customers, and they are very often those benefiting from Amazon’s powerful recommendation algorithm, something that extends beyond the site into the company’s expansive customer email list. This brings us to the second point Jocoby mentions: the ‘sticking’ within the charts.

There is a widely held belief that once a good book (read: free of errors, broadly entertaining, on genre) finds its way into the Amazon recommendation algorithm, it can remain there for long periods of time, leading to a building success as sales beget sales, further boosting the book’s chart performance and reviews. There is also the belief among some authors that Kindle Unlimited books are actively favoured by this algorithm. The high-selling Amanda M. Lee noted a direct correlation:

Rank is affected when people borrow your book [under KU]. Page reads don't play into it at all. (Amanda M. Lee)

Within the same thread, USA Today bestseller Annie Bellet elaborated:

We tested this a bunch when KU 2.0 hit. A page read does zip for rank. A borrow, even with no pages read, is what prompts the rank change. Borrows are weighted exactly like sales from what we could tell, it doesn't matter if nobody opens the book ever. All borrows now are ghost borrows, of course, since we can't see them anymore, so it might look like pages are coming in and your rank is changing, but what is probably happening is someone borrowed your book around the same time, causing the rank jump. (Annie B)
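To make the KU royalty equivalence described above concrete, the following is a minimal worked sketch in Python. Every constant in it is an illustrative assumption rather than Amazon's published accounting: the per-page payout is recalculated monthly from the KDP Select Global Fund, and the 'normalized' page count Amazon assigns a book rarely matches its print length, so the figures below are hypothetical stand-ins chosen only to show how the two revenue streams compare.

# Hedged sketch: a hypothetical Kindle Unlimited full read-through
# versus the royalty on a $2.99 ebook sale. All rate constants are
# illustrative assumptions, not Amazon's actual published figures.

PRINT_PAGES = 300           # the 300-page novel used as the example above
KENPC_MULTIPLIER = 1.5      # assumed: KU-normalized pages usually exceed print pages
KU_RATE_PER_PAGE = 0.0045   # assumed per-page payout in USD; fluctuates monthly

SALE_PRICE = 2.99
ROYALTY_RATE = 0.70         # Amazon's 70% royalty tier for ebooks in this price band

ku_payout = PRINT_PAGES * KENPC_MULTIPLIER * KU_RATE_PER_PAGE
sale_royalty = SALE_PRICE * ROYALTY_RATE

print(f"KU full read-through: ${ku_payout:.2f}")    # roughly $2.03 under these assumptions
print(f"$2.99 ebook sale:     ${sale_royalty:.2f}")  # roughly $2.09

Under these assumed constants the two payouts land within a few cents of one another, consistent with the rough equivalence reported in the scene; the key structural point is that the KU stream exists only for Amazon-exclusive titles.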
Whether this advantage is built into the algorithm in a (likely) attempt to favour exclusive authors, or arises by nature of KU books presenting at a lower price point, is unknown, but there is anecdotal evidence that once a KU book gains traction, it can ‘stick’ within the charts for longer periods of time compared to non-exclusive titles.

At the entrepreneurial end of the fiction self-publishing scene, Amazon is positioned at the very centre. To go wide—to follow vectors through the scene adjacent to Amazon—is to go around the commercial centre and its profits. Yet no one in this community remains unaffected by the strategic position of this site and the market it has either created or captured. Amazon’s institutional practices can be adopted by competitors (Kobo Plus is a version of KU), and the multitude of tactics authors use to promote their work all, in one shape or another, lead back to ‘circumstantial moves’ learned from Amazon or to services aimed at promoting work sold there. Furthermore, the sense of instability and risk engendered by such a dominant market player is felt everywhere.

Some Closing Ideas on the Ideology of Self-Publishing

Self-publishing fiction remains tactical in the de Certeau sense of the term. It is responsive and ever-shifting, with a touch of communal complicity and what he calls la perruque (‘the wig’), a shorthand for resistance that presents itself as submission (25). The entrepreneurialism of self-published fiction trades off this sense of the tactical.

Within the scene, Amazon bestseller charts aren’t so much markers of prestige as systems to be hacked. The choice between ‘wide’ and exclusive is only ever short-term; it is carefully scrutinised, and the trade-offs and opportunities are monitored week-to-week and debated constantly online. Over time, the self-publishing scene has become expert at decoding Amazon’s monolithic Terms of Service, ever eager to find both advantage and risk as it attempts to lever the affordances of digital publishing against its own desire for profit and expression.

This sense of mischief and slippage forms a big part of what self-publishing is. In contrast to traditional publishing—with its long lead times and physical real estate—self-publishing can’t help but appear fragile, wild and coarse. There is no other comparison possible.

To survive in self-publishing is to survive outside the established book industry and to thrive within a new and far more uncertain market/space, one almost entirely without a mapped topology. Unlike the traditional publishing industry—very much a legacy, a “relatively stable” population group (Straw 373)—self-publishing cannot escape its otherness, not in the short term. Both its spatial coordinates and its pathways remain too fast-evolving in comparison to the referent of traditional publishing. In the short-to-medium term, I imagine it will remain at some cultural remove from traditional publishing, be it perceived as a threatening northern force or a speculative west.

To see self-publishing in the present, I encourage scholars to step away from traditional publishing industry protocols and frameworks, and to strive to see this new arena as the self-published authors themselves understand it (what Muggleton has referred to as “indigenous meaning” 13).

Straw and Shank’s scene concept provides one possible conceptual framework for this shift in understanding, as scene’s reliance on spatial considerations harbours an often underemphasized asset: it is a theory of orientation.
At heart, it draws as much from de Certeau as from Bourdieu, and as such, the scene presented in this work is never complete or fixed. It is de Certeau’s city, “shaped out of fragments of trajectories and alterations of spaces” (93). These scenes—be they of musicians or authors—are only ever glimpsed, and from a vantage point of close proximity. In short, it is one way out of the essentialisms that currently shroud self-published fiction as a craft, business and community of authors. The cultural space of self-publishing, to return to Straw’s scene definition, is one that mirrors its own porous, online infrastructure, its own predominance in virtuality. Its pathways are coded together inside fast-moving media companies, and these pathways are increasingly entwined within algorithmic processes of curation that promise meritocratization and disintermediation yet deliver systems that can be learned and manipulated.

The agility to publish within these systems is the true skill-set required to self-publish fiction online. It traverses specific platforms and short-term eras. It is the core attribute of success in the scene. Everything else is secondary, including the content of the books produced. It is not the case that these books are of lesser literary quality or that their ever-growing abundance is threatening—this is the counter-argument so often presented by the traditional book industry—but more so that without entrepreneurial agility, the quality of the ebook goes undetermined as it sinks lower and lower into a distribution system that is so open it appears endless.

References

Amanda M. Lee. “Re: KU Page Reads and Rank.” Kboards: Writer’s Cafe. 1 Oct. 2017 <https://www.kboards.com/index.php/topic,232945.msg3245005.html#msg3245005>.
Annie B [Annie Bellet]. “Re: KU Page Reads and Rank.” Kboards: Writer’s Cafe. 1 Oct. 2017 <https://www.kboards.com/index.php/topic,232945.msg3245068.html#msg3245068>.
Anniejocoby [Annie Jocoby]. “Re: Tell Me Why You're WIDE or KU ONLY.” Kboards: Writer’s Cafe. 1 Oct. 2017 <https://www.kboards.com/index.php/topic,242514.msg3558176.html#msg3558176>.
Baverstock, Alison, and Jackie Steinitz. “Who Are the Self-Publishers?” Learned Publishing 26 (2013): 211-223.
Bennett, Andy, and Richard A. Peterson, eds. Music Scenes: Local, Translocal and Virtual. Vanderbilt University Press, 2004.
———, and Ian Rogers. Popular Music Scenes and Cultural Memory. Palgrave Macmillan, 2016.
Bourdieu, Pierre. Distinction: A Social Critique of the Judgement of Taste. Routledge, 1984.
De Certeau, Michel. The Practice of Everyday Life. University of California Press, 1984.
Fowler, Dawn. “‘This Is the North – We Do What We Want’: The Red Riding Trilogy as ‘Yorkshire Noir’.” Cops on the Box. University of Glamorgan, 2013.
Haugland, Ann. “Opening the Gates: Print On-Demand Publishing as Cultural Production.” Publishing Research Quarterly 22.3 (2006): 3-16.
Howey, Hugh. “October 2016 Author Earnings Report: A Turning of the Tide.” Author Earnings. 12 Oct. 2016 <http://authorearnings.com/report/october-2016/>.
Kboards. About Kboards.com. 2017. 4 Oct. 2017 <https://www.kboards.com/index.php/topic,242026.0.html>.
KindleSpy. 2017. Chrome plug-in.
Laquintano, Timothy. Mass Authorship and the Rise of Self-Publishing. University of Iowa Press, 2016.
Levey, Nick. “Post-Press Literature: Self-Published Authors in the Literary Field.” Post 45. 1 Oct. 2017 <http://post45.research.yale.edu/2016/02/post-press-literature-self-published-authors-in-the-literary-field-3/>.
McGregor, Jay. “Amazon Pays $450,000 a Year to This Self-Published Writer.” Forbes. 17 Apr. 2017 <http://www.forbes.com/sites/jaymcgregor/2015/04/17/mark-dawson-made-750000-from-self-published-amazon-books/#bcce23a35e38>.
McIlroy, Thad. “Startups within the U.S. Book Publishing Industry.” Publishing Research Quarterly 33 (2017): 1-9.
Muggleton, David. Inside Subculture: The Post-Modern Meaning of Style. Berg, 2000.
Orwell, George. Selected Essays. Penguin Books, 1960.
Rogers, Ian. “The Hobbyist Majority and the Mainstream Fringe: The Pathways of Independent Music Making in Brisbane, Australia.” Redefining Mainstream Popular Music, eds. Andy Bennett, Sarah Baker, and Jodie Taylor. Routledge, 2013. 162-173.
Shank, Barry. Dissonant Identities: The Rock’n’Roll Scene in Austin, Texas. Wesleyan University Press, 1994.
Straw, Will. “Systems of Articulation, Logics of Change: Communities and Scenes in Popular Music.” Cultural Studies 5.3 (1991): 368-88.
Thomlinson, Adam, and Pierre C. Bélanger. “Authors’ Views of e-Book Self-Publishing: The Role of Symbolic Capital Risk.” Publishing Research Quarterly 31 (2015): 306-316.
Thompson, John B. Merchants of Culture: The Publishing Business in the Twenty-First Century. Penguin, 2012.
Weinberg, Dana Beth. “The Self-Publishing Debate: A Social Scientist Separates Fact from Fiction.” Digital Book World. 3 Oct. 2017 <http://www.digitalbookworld.com/2013/self-publishing-debate-part3/>.