Academic literature on the topic 'Gini impurity index'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Gini impurity index.'

Journal articles on the topic "Gini impurity index"

1

Yuan, Ye, Liji Wu, and Xiangmin Zhang. "Gini-Impurity Index Analysis." IEEE Transactions on Information Forensics and Security 16 (2021): 3154–69. http://dx.doi.org/10.1109/tifs.2021.3076932.

2

Singh, Sudhir Kuamr, and Dr Vipin Saxena. "Reducing the Impurity of Object-Oriented Database Through Gini Index." INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY 13, no. 11 (November 30, 2014): 5172–78. http://dx.doi.org/10.24297/ijct.v13i11.2787.

Abstract:
Databases are growing in size because of audio and video files. Irregularities arise when data is duplicated in many places, so the database needs to be restructured to reduce its size. The present work deals with reducing impurity through the well-known Gini index technique. Since much software uses object-oriented databases, a real object-oriented database for an Electricity Bill Deposit System is considered. A sample of 15 records is used; however, the technique can be applied to large or even complex databases. A decision tree is constructed, sample queries are performed to verify the result, and the Gini index is computed to minimize the impurity in the presented object-oriented database.
3

Jananto, Arief, Sulastri Sulastri, Eko Nur Wahyudi, and Sunardi Sunardi. "Data Induk Mahasiswa sebagai Prediktor Ketepatan Waktu Lulus Menggunakan Algoritma CART Klasifikasi Data Mining." Jurnal Sisfokom (Sistem Informasi dan Komputer) 10, no. 1 (February 22, 2021): 71–78. http://dx.doi.org/10.32736/sisfokom.v10i1.991.

Abstract:
The Faculty of Information Technology (Fakultas Teknologi Informasi) at Universitas Stikubank (UNISBANK), as one of the faculties in higher education, has accumulated a large amount of stored data through its learning activities and has graduated many students. The rate of on-time graduation is important to study programs as a measure of success. This research mines student master data and graduation data to obtain the pass rate and to predict the graduation of active students. By applying the classification data mining technique with the CART algorithm, the aim is a decision tree that can predict whether active students will graduate on time. Using graduation data and student master data totaling 1018 records, a decision tree model was obtained with an accuracy of 63% on the test data. Split nodes are determined using the Gini index, which partitions the dataset based on its impurity value. Tests conducted in this study show that the order of the variables in the decision tree is gender, origin school status, parental education, age at entry, city of birth, and parent's occupation. The resulting model predicts that 71% of active S1 Information Systems students and 51% of active S1 Informatics Engineering students can graduate on time.
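As a rough illustration of how a CART-style split can be chosen with the Gini index, the sketch below computes the Gini impurity of a label set and searches one numeric feature for the threshold with the lowest size-weighted impurity. The function names and the toy data (age at entry versus on-time graduation) are hypothetical and are not taken from the paper.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(left, right):
    """Size-weighted Gini impurity of a two-way split."""
    n = len(left) + len(right)
    return (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / n


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) minimising impurity for one numeric feature."""
    best = (None, float("inf"))
    # The largest value is skipped because it would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = split_gini(left, right)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data, not from the study.
ages = [18, 18, 19, 20, 22, 25]
on_time = ["yes", "yes", "yes", "no", "no", "no"]
threshold, score = best_threshold(ages, on_time)
print(threshold, score)
```

A full CART implementation would repeat this search over every feature and recurse on the two resulting subsets; this sketch covers only the single-feature step.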
4

Yu, Yun, Xi Wu, Jiu Chen, Gong Cheng, Xin Zhang, Cheng Wan, Jie Hu, et al. "Characterizing Brain Tumor Regions Using Texture Analysis in Magnetic Resonance Imaging." Frontiers in Neuroscience 15 (June 3, 2021). http://dx.doi.org/10.3389/fnins.2021.634926.

Abstract:
Purpose: To extract texture features from magnetic resonance imaging (MRI) scans of patients with brain tumors and use them to train a classification model for supporting an early diagnosis. Methods: Two groups of regions (control and tumor) were selected from MRI scans of 40 patients with meningioma or glioma. These regions were analyzed to obtain texture features. Statistical analysis was conducted using SPSS (version 20.0), including the Shapiro–Wilk test and Wilcoxon signed-rank test, which were used to test significant differences in each feature between the tumor and healthy regions. T-distributed stochastic neighbor embedding (t-SNE) was used to visualize the data distribution so as to avoid tumor selection bias. The Gini impurity index in random forests (RFs) was used to select the top five out of all features. Based on the five features, three classification models were built respectively with three machine learning classifiers: RF, support vector machine (SVM), and back propagation (BP) neural network. Results: Sixteen of the 25 features were significantly different between the tumor and healthy areas. Through the Gini impurity index in RFs, standard deviation, first-order moment, variance, third-order absolute moment, and third-order central moment were selected to build the classification model. The classification model trained using the SVM classifier achieved the best performance, with sensitivity, specificity, and area under the curve of 94.04%, 92.3%, and 0.932, respectively. Conclusion: Texture analysis with an SVM classifier can help differentiate between brain tumor and healthy areas with high speed and accuracy, which would facilitate its clinical application.
5

PATIL, PRAMOD, ALKA LONDHE, and PARAG KULKARNI. "LEARNING HYPERPLANES THAT CAPTURES THE GEOMETRIC STRUCTURE OF CLASS REGIONS." Graduate Research in Engineering and Technology, July 2013, 7–12. http://dx.doi.org/10.47893/gret.2013.1003.

Abstract:
Most decision tree algorithms rely on impurity measures to evaluate the goodness of hyperplanes at each node while learning a decision tree in a top-down fashion. These impurity measures are not differentiable with respect to the hyperplane parameters, so algorithms that use them need some search technique to find the best hyperplane at every node. These impurity measures also do not properly capture the geometric structure of the data. Motivated by this, the paper proposes a two-class algorithm for learning oblique decision trees that evaluates hyperplanes in such a way that the (linear) geometric structure in the data is taken into consideration. At each node of the decision tree, the algorithm finds the clustering hyperplanes for both classes by solving a generalized eigenvalue problem. The data is then split based on an angle bisector of these hyperplanes, and the left and right sub-trees of the node are learned recursively. Since, in general, there are two angle bisectors, the one that is better according to an impurity measure (the Gini index) is selected. Thus the algorithm combines the ideas of linear tendencies in data and purity of nodes to find better decision trees. This idea leads to small decision trees and better performance.
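The bisector-selection step described in this abstract might be sketched as follows. This is an illustrative reading, not the authors' implementation: the two clustering hyperplanes are assumed to be given already (the generalized eigenvalue step is not shown), each hyperplane is represented as a hypothetical `(w, b)` pair meaning w·x + b = 0 with non-zero `w`, and all function names and toy points are made up for the example.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def normalise(w, b):
    """Scale (w, b) so that w has unit length; assumes w is non-zero."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [wi / norm for wi in w], b / norm


def angle_bisectors(h1, h2):
    """Both angle bisectors of two hyperplanes: sum and difference of the unit forms."""
    (w1, b1), (w2, b2) = normalise(*h1), normalise(*h2)
    plus = ([a + c for a, c in zip(w1, w2)], b1 + b2)
    minus = ([a - c for a, c in zip(w1, w2)], b1 - b2)
    return plus, minus


def split_gini(hyperplane, points, labels):
    """Size-weighted Gini impurity after splitting points by the sign of w.x + b."""
    w, b = hyperplane
    left, right = [], []
    for x, y in zip(points, labels):
        side = sum(wi * xi for wi, xi in zip(w, x)) + b
        (left if side <= 0 else right).append(y)
    return (len(left) * gini(left) + len(right) * gini(right)) / len(labels)


def choose_bisector(h1, h2, points, labels):
    """Pick whichever bisector gives the lower weighted Gini impurity."""
    return min(angle_bisectors(h1, h2), key=lambda h: split_gini(h, points, labels))


# Hypothetical toy data: class A lies along y = x, class B along y = -x.
points = [(1, 1), (2, 2), (3, 3), (1, -1), (2, -2), (3, -3)]
labels = ["A", "A", "A", "B", "B", "B"]
h_a = ([1.0, -1.0], 0.0)  # assumed clustering hyperplane for class A
h_b = ([1.0, 1.0], 0.0)   # assumed clustering hyperplane for class B
chosen = choose_bisector(h_a, h_b, points, labels)
print(chosen, split_gini(chosen, points, labels))
```

In this toy case the bisector along the x-axis separates the two classes perfectly, while the bisector along the y-axis leaves all points on one side, so the Gini criterion selects the former.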
6

Ballante, Elena, Marta Galvani, Pierpaolo Uberti, and Silvia Figini. "Polarized Classification Tree Models: Theory and Computational Aspects." Journal of Classification, February 24, 2021. http://dx.doi.org/10.1007/s00357-021-09383-8.

Abstract:
In this paper, a new approach in classification models, called the Polarized Classification Tree model, is introduced. From a methodological perspective, a new index of polarization to measure the goodness of splits in the growth of a classification tree is proposed. The newly introduced measure tackles weaknesses of the classical ones used in classification trees (Gini and Information Gain), because it not only measures the impurity but also reflects the distribution of each covariate in the node, i.e., it employs more discriminating covariates to split the data at each node. From a computational perspective, a new algorithm is proposed and implemented that employs the proposed measure in the growth of a tree. In order to show how the proposal works, a simulation exercise has been carried out. The results obtained in the simulation framework suggest that the proposal significantly outperforms impurity measures commonly adopted in classification tree modeling. Moreover, the empirical evidence on real data shows that Polarized Classification Tree models are competitive with, and sometimes better than, classical classification tree models.

Dissertations / Theses on the topic "Gini impurity index"

1

Hansén, Jacob, and Axel Gustafsson. "A Study on Comparison Websites in the Airline Industry and Using CART Methods to Determine Key Parameters in Flight Search Conversion." Thesis, KTH, Matematisk statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254309.

Abstract:
This bachelor thesis in applied mathematics and industrial engineering and management aimed to identify relationships between search parameters in flight comparison search engines and the exit conversion rate to an airline's website, while also investigating how the emergence of such comparison search engines has impacted the airline industry. To identify such relationships, several classification models were employed in conjunction with several sampling methods to produce a predictive model using the program R. To investigate the impact of the emergence of comparison websites, Porter's 5 forces and a SWOT analysis were employed as theoretical frameworks to analyze the findings of a literature study and a qualitative interview. The classification models developed performed poorly on several assessment metrics, which suggested that there was little to no relationship between the search parameters investigated and the exit conversion rate. Porter's 5 forces and the SWOT analysis suggested that the airline industry has become more competitive and that airlines which do not manage to adapt to this changing market environment will experience decreasing profitability.