To see other types of publications on this topic, follow the link: Stackoverflow.

Journal articles on the topic 'Stackoverflow'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Select a source type:

Consult the top 50 journal articles for your research on the topic 'Stackoverflow.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Joorabchi, Arash, Michael English, and Abdulhussain E. Mahdi. "Text mining stackoverflow." Journal of Enterprise Information Management 29, no. 2 (2016): 255–75. http://dx.doi.org/10.1108/jeim-11-2014-0109.

Full text
Abstract:
Purpose – The use of social media, and in particular community Question Answering (Q&A) websites, by learners has increased significantly in recent years. The vast amounts of data posted on these sites provide an opportunity to investigate the topics under discussion and those receiving most attention. The purpose of this paper is to automatically analyse the content of a popular computer programming Q&A website, StackOverflow (SO), determine the exact topics of posted Q&As, and narrow down their categories to help determine subject difficulties of learners. By doing so, the authors have been able to rank identified topics and categories according to their frequencies and, therefore, mark the most asked-about subjects and, hence, identify the most difficult and challenging topics commonly faced by learners of computer programming and software development. Design/methodology/approach – In this work the authors have adopted a heuristic research approach combined with a text mining approach to investigate the topics and categories of Q&A posts on the SO website. Almost 186,000 Q&A posts were analysed and their categories refined using Wikipedia as a crowd-sourced classification system. After identifying and counting the occurrence frequency of all the topics and categories, their semantic relationships were established. These data were then presented as a rich graph which could be visualized using graph visualization software such as Gephi. Findings – The reported results and corresponding discussion give an indication that the insight gained from the process can be further refined and potentially used by instructors, teachers, and educators to pay more attention to and focus on the commonly occurring topics/subjects when designing their course material, delivery, and teaching methods. Research limitations/implications – The proposed approach limits the scope of the analysis to the subset of Q&As which contain one or more links to Wikipedia. Therefore, developing more sophisticated text mining methods capable of analysing a larger portion of the available data would improve the accuracy and generalizability of the results. Originality/value – The application of text mining and data analytics technologies in education has created a new interdisciplinary field of research between the education and information sciences, called Educational Data Mining (EDM). The work presented in this paper falls under this field of research, and it is an early attempt at investigating the practical applications of text mining technologies in the area of computer science (CS) education.
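The frequency-ranking step this abstract describes can be sketched in a few lines: count how often each Wikipedia-derived topic occurs across posts, and accumulate co-occurrence weights that could be exported as a weighted edge list for a tool like Gephi. The example topics are invented; this is an illustrative stand-in, not the authors' pipeline.

```python
from collections import Counter
from itertools import combinations

def topic_stats(posts):
    """Count per-topic frequency and pairwise co-occurrence weights
    across a list of posts (each post = its list of detected topics)."""
    freq = Counter()
    edges = Counter()  # (topic_a, topic_b) -> number of posts containing both
    for topics in posts:
        uniq = sorted(set(topics))
        freq.update(uniq)
        edges.update(combinations(uniq, 2))
    return freq, edges

# Invented example posts tagged with Wikipedia-derived topics.
posts = [
    ["recursion", "python"],
    ["recursion", "stack"],
    ["python", "recursion"],
]
freq, edges = topic_stats(posts)
# freq ranks topics by how often learners ask about them;
# edges can be written out as a weighted edge list for Gephi.
```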
APA, Harvard, Vancouver, ISO, and other styles
2

Xiong, Yunxiang, Zhangyuan Meng, Beijun Shen, and Wei Yin. "Developer Identity Linkage and Behavior Mining Across GitHub and StackOverflow." International Journal of Software Engineering and Knowledge Engineering 27, no. 09n10 (2017): 1409–25. http://dx.doi.org/10.1142/s0218194017400034.

Full text
Abstract:
Nowadays, software developers are increasingly involved in GitHub and StackOverflow, creating a lot of valuable data in the two communities. Researchers mine the information in these software communities to understand developer behaviors, while previous works mainly focus on mining data within a single community. In this paper, we propose a novel approach to developer identity linkage and behavior mining across GitHub and StackOverflow. This approach links the accounts from the two communities using a CART decision tree, leveraging features from usernames, user behaviors and writing styles. Then, it explores cross-site developer behaviors through [Formula: see text]-graph analysis, LDA-based topic clustering and cross-site tagging. We conducted several experiments to evaluate this approach. The results show that the precision and F-score of our identity linkage method are higher than those of previous methods in software communities. In particular, we discovered that (1) active issue committers are also active question askers; (2) for most developers, the topics of their contents in GitHub are similar to those of their questions and answers in StackOverflow; (3) developers’ concerns in StackOverflow shift over the time of their current participating projects in GitHub; (4) developers’ concerns in GitHub are more relevant to their answers than to their questions and comments in StackOverflow.
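The username-based part of the linkage features can be illustrated with a small sketch. The feature names here are invented for illustration; the behavior and writing-style features, and the CART classifier itself, are omitted.

```python
import os

def edit_distance(a, b):
    # Classic dynamic-programming Levenshtein distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def username_features(u1, u2):
    """A few username-similarity features for a candidate GitHub /
    StackOverflow account pair (feature names invented for this sketch)."""
    u1, u2 = u1.lower(), u2.lower()
    longest = max(len(u1), len(u2), 1)
    return {
        "exact": u1 == u2,
        "norm_edit": edit_distance(u1, u2) / longest,
        "prefix_ratio": len(os.path.commonprefix([u1, u2])) / longest,
    }
```

In a full system, dictionaries like these would become the feature rows a decision-tree learner is trained on.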
APA, Harvard, Vancouver, ISO, and other styles
3

Putra, Bagus Geriansyah, and Naim Rochmawati. "Klasifikasi Berdasarkan Question dalam Stack Overflow Menggunakan Algoritma Naïve Bayes." Journal of Informatics and Computer Science (JINACS) 2, no. 04 (2021): 259–67. http://dx.doi.org/10.26740/jinacs.v2n04.p259-267.

Full text
Abstract:
Abstract—Stackoverflow is a website that provides a wealth of information about programming. Users can interact with one another in discussion threads: a user posts a question, which other users then respond to. When asking a question, the user must assign it the right category so that it receives an appropriate response or answer. In practice, many users are confused about which category to choose for their question; as a result, their questions receive unsuitable responses or none at all. This study therefore aims to support the question-categorization process on the Stackoverflow website. It uses the Naïve Bayes algorithm to predict the category of a submitted question. The process starts with loading and reading the dataset file; the dataset then goes through preprocessing, followed by weighting and feature extraction with the TF-IDF algorithm. Next, the data are processed with the Naïve Bayes algorithm, which outputs the question category. A model-evaluation step then selects the best model to be used in the application's user interface. Across four evaluation runs using 10,000–40,000 records, the highest accuracy, precision, recall, and F1-score obtained were 75%, 75%, 75%, and 74%, respectively. These results show that the Naïve Bayes algorithm can be used for text classification and yields reasonably good scores.
 Keywords—text mining, Naïve Bayes algorithm, stackoverflow, TF-IDF algorithm
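The classification pipeline the abstract outlines (tokenize, weight, classify with Naïve Bayes) can be sketched as follows. This toy version uses raw word counts with Laplace smoothing rather than the study's TF-IDF weighting, and the example questions and categories are invented.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesClassifier:
    """Multinomial Naive Bayes over word counts with Laplace smoothing.
    (The study weights terms with TF-IDF first; raw counts keep this short.)"""

    def fit(self, docs, labels):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter(labels)
        for doc, y in zip(docs, labels):
            self.word_counts[y].update(doc.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        def log_prob(y):
            counts, total = self.word_counts[y], sum(self.word_counts[y].values())
            lp = math.log(self.class_counts[y] / sum(self.class_counts.values()))
            for w in doc.lower().split():
                lp += math.log((counts[w] + 1) / (total + len(self.vocab)))
            return lp
        return max(self.class_counts, key=log_prob)

# Invented training questions with their categories.
clf = NaiveBayesClassifier().fit(
    ["null pointer exception in java", "java list sort",
     "css flexbox layout", "center div with css"],
    ["java", "java", "css", "css"],
)
```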
APA, Harvard, Vancouver, ISO, and other styles
4

Singh, Prabhnoor, Rajkanwar Chopra, Ojasvi Sharma, and Rekha Singla. "Stackoverflow tag prediction using tag associations and code analysis." Journal of Discrete Mathematical Sciences and Cryptography 23, no. 1 (2020): 35–43. http://dx.doi.org/10.1080/09720529.2020.1721857.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Pagano, Dennis, and Walid Maalej. "How Do Developers Blog?" ACM SIGSOFT Software Engineering Notes 46, no. 3 (2021): 26–29. http://dx.doi.org/10.1145/3468744.3468753.

Full text
Abstract:
A decade ago, the rise of GitHub and StackOverflow as social version control and knowledge sharing environments was about to start. Social media like Twitter were mocked by some software engineering researchers and practitioners as "tools for kids not professionals". At that time, we published one of the first papers [12] on social media in software engineering at MSR 2011, the Mining Software Repositories Conference.
APA, Harvard, Vancouver, ISO, and other styles
6

Sankha Subhra Paul, R. R., and Ashish Tripathi. "Social Influence and learning pattern analysis: Case studies in Stackoverflow." International Journal of IT-based Social Welfare Promotion and Management 2, no. 1 (2015): 1–10. http://dx.doi.org/10.21742/ijswpm.2015.2.1.01.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Dargahi Nobari, Arash, Mahmood Neshati, and Sajad Sotudeh Gharebagh. "Quality-aware skill translation models for expert finding on StackOverflow." Information Systems 87 (January 2020): 101413. http://dx.doi.org/10.1016/j.is.2019.07.003.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Huang, Weizhi, Wenkai Mo, Beijun Shen, Yu Yang, and Ning Li. "Automatically Modeling Developer Programming Ability and Interest Across Software Communities." International Journal of Software Engineering and Knowledge Engineering 26, no. 09n10 (2016): 1493–510. http://dx.doi.org/10.1142/s0218194016400143.

Full text
Abstract:
Developer profile plays an important role in software project planning, developer recommendation, personnel training, and other tasks. Modeling the ability and interest of developers is its key issue. However, most existing approaches require manual assessment, like 360° performance evaluation. With the emergence of social networking sites such as StackOverflow and GitHub, a vast amount of developer information is created on a daily basis. Such personal and social context data has huge potential to support automatic and effective developer ability evaluation and interest mining. In this paper, we propose CPDScorer, a novel approach for modeling and scoring the programming ability and interest of developers through mining heterogeneous information from both community question answering (CQA) sites and open-source software (OSS) communities. CPDScorer analyzes the questions and answers posted in CQA sites, and evaluates the projects submitted in OSS communities to assign expertise scores as well as interest scores to developers, considering both quantitative and qualitative factors. When profiling developers' ability and interest, a programming term extraction algorithm is also designed based on set covering. We have conducted experiments on StackOverflow and GitHub to measure the effectiveness of CPDScorer. The results show that our approach is feasible and practical in user programming ability and interest modeling. In particular, the precision of our approach reaches 80%.
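The set-covering idea behind the term extraction step can be illustrated with the classic greedy approximation: repeatedly pick the term that covers the most still-uncovered posts. This is a generic sketch of greedy set cover, not CPDScorer's actual algorithm.

```python
from collections import defaultdict

def greedy_term_cover(posts_terms):
    """Greedy set cover: choose a small set of terms such that every post
    contains at least one chosen term."""
    uncovered = set(range(len(posts_terms)))
    term_to_posts = defaultdict(set)
    for i, terms in enumerate(posts_terms):
        for t in sorted(terms):  # sorted for deterministic tie-breaking
            term_to_posts[t].add(i)
    chosen = []
    while uncovered:
        best = max(term_to_posts, key=lambda t: len(term_to_posts[t] & uncovered))
        if not term_to_posts[best] & uncovered:
            break  # remaining posts contain no known term
        chosen.append(best)
        uncovered -= term_to_posts[best]
    return chosen
```

Greedy selection gives the standard logarithmic approximation to the optimal cover, which is usually good enough for term-extraction purposes.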
APA, Harvard, Vancouver, ISO, and other styles
9

Abdalkareem, Rabe, Emad Shihab, and Juergen Rilling. "On code reuse from StackOverflow: An exploratory study on Android apps." Information and Software Technology 88 (August 2017): 148–58. http://dx.doi.org/10.1016/j.infsof.2017.04.005.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Shejwalkar, Virat, Arun Ganesh, Rajiv Mathews, et al. "Recycling Scraps: Improving Private Learning by Leveraging Checkpoints." Proceedings on Privacy Enhancing Technologies 2025, no. 2 (2025): 607–28. https://doi.org/10.56553/popets-2025-0079.

Full text
Abstract:
DP training pipelines for modern neural networks are iterative and generate multiple checkpoints. However, all except the final checkpoint are discarded after training. In this work, we propose novel methods to utilize intermediate checkpoints to improve prediction accuracy and estimate uncertainty in DP predictions. First, we design a general framework that uses aggregates of intermediate checkpoints during training to increase the accuracy of DP ML techniques. Specifically, we demonstrate that training over aggregates can provide significant gains in prediction accuracy over the existing state-of-the-art for StackOverflow, CIFAR10 and CIFAR100 datasets. For instance, we improve the state-of-the-art DP StackOverflow accuracies to 22.74% (+2.06% relative) for epsilon=8.2, and 23.90% (+2.09%) for epsilon=18.9. Furthermore, these gains magnify in settings with periodically varying training data distributions. We also demonstrate that our methods achieve relative improvements of 0.54% and 62.6% in terms of utility and variance, on a proprietary, production-grade pCVR task. Lastly, we initiate an exploration into estimating the uncertainty (variance) that DP noise adds in the predictions of DP ML models. We prove that, under standard assumptions on the loss function, the sample variance from last few checkpoints provides a good approximation of the variance of the final model of a DP run. Empirically, we show that the last few checkpoints can provide a reasonable lower bound for the variance of a converged DP model. Crucially, all the methods proposed in this paper operate on a single training run of the DP ML technique, thus incurring no additional privacy cost.
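The core idea of aggregating the tail of the checkpoint sequence can be sketched in a few lines. Treating each checkpoint as a flat list of weights is a simplification, and the uniform tail average shown here is only one of the aggregation schemes the paper considers.

```python
def average_checkpoints(checkpoints, last_k=3):
    """Uniform average of the last k checkpoints' parameters
    (each checkpoint is a flat list of weights)."""
    tail = checkpoints[-last_k:]
    return [sum(ws) / len(tail) for ws in zip(*tail)]

def checkpoint_variance(checkpoints, last_k=3):
    """Per-parameter sample variance over the last k checkpoints: the
    quantity used to approximate the variance that DP noise adds."""
    tail = checkpoints[-last_k:]
    means = average_checkpoints(checkpoints, last_k)
    return [sum((w - m) ** 2 for w in ws) / (len(tail) - 1)
            for ws, m in zip(zip(*tail), means)]
```

Because every intermediate checkpoint is already covered by the run's DP guarantee, post-processing them this way incurs no additional privacy cost.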
APA, Harvard, Vancouver, ISO, and other styles
11

Halachev, Petar, Aleksandra Todeva, Gergana Georgieva, and Marina Jekova. "CONTEMPORARY ASPECTS IN THE SELECTION OF TOOLS AND TECHNOLOGIES FOR DEVELOPING WEB SITES." International Conference on Technics, Technologies and Education, no. 1 (2018): 125–31. http://dx.doi.org/10.15547/ictte.2018.04.002.

Full text
Abstract:
The report explores and analyzes the application of the most popular programming languages, drawing on data from different organizations: GitHub; Stackoverflow; the TIOBE Community index. The main client technologies are presented and analysed: HTML, CSS, JavaScript, and TypeScript. The features, advantages, and disadvantages of the server technologies are described: Java, PHP, Python, and Ruby. The application areas for web site development technologies have been defined. The creation of a quality web site is a complex and complicated process, but observing some guidelines and recommendations during the work can help in selecting the tools and technologies for its design and development.
APA, Harvard, Vancouver, ISO, and other styles
12

Greenberg, Jane, Angela Murillo, and John A. Kunze. "Ontological Empowerment: Sustainability via Ownership." Advances in Classification Research Online 23, no. 1 (2013): 47. http://dx.doi.org/10.7152/acro.v23i1.14258.

Full text
Abstract:
<p>Positive impacts associated with urban housing/home ownership programs motivate us to study this topic in relation to ontologies. This paper reviews ontological dependence and presents early work underway in the DataONE Preservation and Metadata Working Group (PAMWG) to collectively leverage existing metadata schemes and ontologies. The paper introduces a high-level set of functional requirements and the stackoverflow model that may be used to detect highly rated metadata or ontological properties to form a loose canon for describing scientific data. The long-term goal is to establish community identity and rhythm supporting a sustainable ontology/metadata-driven workflow.</p>
APA, Harvard, Vancouver, ISO, and other styles
13

Anwar, Zeeshan, Hammad Afzal, Seifedine Kadry, and Xiaochun Cheng. "Semantic Web Approaches in Stack Overflow." International Journal on Semantic Web and Information Systems 20, no. 1 (2024): 1–61. http://dx.doi.org/10.4018/ijswis.358617.

Full text
Abstract:
StackOverflow (SO), a prominent question-answering site for programming, has amassed a vast repository of user-generated content since its inception in 2008. This paper conducts a thorough analysis of research trends on SO, examining 170 publications from 2008 to 2019. Utilizing qualitative and quantitative methods, the study categorizes papers using literature review and Latent Dirichlet Allocation (LDA), identifying 62 topics grouped into 8 main categories. Additionally, it highlights tools developed by researchers using SO data sets, showcasing their practical applications. The analysis also identifies research gaps and proposes future directions for each research area. This study serves as a valuable resource for practitioners and researchers interested in utilizing community data sets, offering insights into existing work, essential tools and techniques, and potential avenues for future research.
APA, Harvard, Vancouver, ISO, and other styles
14

Li, Peng, Yeye He, Cong Yan, Yue Wang, and Surajit Chaudhuri. "Auto-Tables: Relationalize Tables without Using Examples." ACM SIGMOD Record 53, no. 1 (2024): 76–85. http://dx.doi.org/10.1145/3665252.3665269.

Full text
Abstract:
Relational tables, where each row corresponds to an entity and each column corresponds to an attribute, have been the standard for tables in relational databases. However, such a standard cannot be taken for granted when dealing with tables "in the wild". Our survey of real spreadsheet-tables and web-tables shows that over 30% of such tables do not conform to the relational standard, for which complex table-restructuring transformations are needed before these tables can be queried easily using SQL-based tools. Unfortunately, the required transformations are non-trivial to program, which has become a substantial pain point for technical and non-technical users alike, as evidenced by large numbers of forum questions in places like StackOverflow and Excel/Tableau forums.
APA, Harvard, Vancouver, ISO, and other styles
15

Diyanati, Ahmad, Behrooz Shahi Sheykhahmadloo, Seyed Mostafa Fakhrahmad, Mohammad Hadi Sadredini, and Mohammad Hassan Diyanati. "A proposed approach to determining expertise level of StackOverflow programmers based on mining of user comments." Journal of Computer Languages 61 (December 2020): 101000. http://dx.doi.org/10.1016/j.cola.2020.101000.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Joorabchi, Arash, Michael English, and Abdulhussain E. Mahdi. "Automatic mapping of user tags to Wikipedia concepts: The case of a Q&A website – StackOverflow." Journal of Information Science 41, no. 5 (2015): 570–83. http://dx.doi.org/10.1177/0165551515586669.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Osadcha, Kateryna P., and Viacheslav V. Osadchyi. "The use of cloud computing technology in professional training of future programmers." CTE Workshop Proceedings 8 (March 19, 2021): 155–64. http://dx.doi.org/10.55056/cte.229.

Full text
Abstract:
The article provides a brief analysis of the current state of the study of cloud technologies by future software engineers at foreign and Ukrainian universities. The authors' experience in applying cloud technologies in the training of future software engineers in Ukraine is presented. The article describes the use of cloud business automation systems, online services for monitoring the implementation of software projects, and Google services for collaboration, planning, and productivity while studying professional disciplines and carrying out diploma projects. Based on a survey conducted by Stackoverflow, the state of application of cloud technologies by software engineers around the world has been analyzed. The article outlines the cloud technologies that are not studied at the analyzed Ukrainian universities, as well as those that are studied there by future software engineers but are not popular with software developers worldwide. Conclusions are made on the modernization of training programs for future software engineers. Topics for the study of cloud technologies by future software engineers in the content of professional disciplines are proposed.
APA, Harvard, Vancouver, ISO, and other styles
18

Anneser, Christoph, Nesime Tatbul, David Cohen, et al. "AutoSteer: Learned Query Optimization for Any SQL Database." Proceedings of the VLDB Endowment 16, no. 12 (2023): 3515–27. http://dx.doi.org/10.14778/3611540.3611544.

Full text
Abstract:
This paper presents AutoSteer, a learning-based solution that automatically drives query optimization in any SQL database that exposes tunable optimizer knobs. AutoSteer builds on the Bandit optimizer (Bao) and extends it with new capabilities (e.g., automated hint-set discovery) to minimize integration effort and facilitate usability in both monolithic and disaggregated SQL systems. We successfully applied AutoSteer on PostgreSQL, PrestoDB, Spark-SQL, MySQL, and DuckDB - five popular open-source database engines with diverse query optimizers. We then conducted a detailed experimental evaluation with public benchmarks (JOB, Stackoverflow, TPC-DS) and a production workload from Meta's PrestoDB deployments. Our evaluation shows that AutoSteer can not only outperform these engines' native query optimizers (e.g., up to 40% improvements for PrestoDB) but can also match the performance of Bao-for-PostgreSQL with reduced human supervision and increased adaptivity, as it replaces Bao's static, expert-picked hint-sets with those that are automatically discovered. We also provide an open-source implementation of AutoSteer together with a visual tool for interactive use by query optimization experts.
APA, Harvard, Vancouver, ISO, and other styles
19

Dong, Yichen, Zhen Wang, and Tianjun Wu. "An open intent detection model optimized for datasets based on the Bert large model." Applied and Computational Engineering 47, no. 1 (2024): 225–31. http://dx.doi.org/10.54254/2755-2721/47/20241381.

Full text
Abstract:
Within current task-oriented dialogue systems, the focus of intent detection predominantly centers on closed domains. Nevertheless, in real-world usage scenarios, a substantial proportion of interactions fall into the open-domain category. User intentions frequently transcend predefined boundaries, giving rise to a multitude of out-of-domain intents, which pose a formidable challenge to existing models and ultimately lead to diminished recognition rates and accuracy. Open intent detection models are therefore increasingly in demand to address this issue effectively. This paper proposes a method to optimize datasets, thereby enhancing the training accuracy of open intent detection models. Specifically, this paper employs the Adaptive Decision Boundary Learning algorithm, which is currently popular in open intent detection. Leveraging this algorithm, this paper suggests using the K-means clustering algorithm to refine the intent labels within the dataset. This process helps identify and remove outliers in the dataset, making the distinction between known-domain and open-domain intent labels more precise. Experimental results on two datasets, banking77 and stackoverflow, demonstrate the effectiveness of our approach in significantly improving model accuracy.
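A minimal sketch of the dataset-refinement idea: cluster the labeled examples with k-means, then drop examples far from their cluster centroid. The initialization (first k points) and the outlier threshold are illustrative assumptions, not the paper's settings.

```python
import math

def kmeans(points, k, iters=20):
    """Plain k-means; centroids start at the first k points (an
    illustrative choice, not k-means++)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[i].append(p)
        centroids = [[sum(col) / len(c) for col in zip(*c)] if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

def drop_outliers(points, k=2, factor=2.0):
    """Keep only examples within factor * (mean distance) of their
    cluster centroid; the threshold is an assumption for this sketch."""
    centroids, clusters = kmeans(points, k)
    kept = []
    for c, cluster in zip(centroids, clusters):
        dists = [math.dist(p, c) for p in cluster]
        if not dists:
            continue
        cut = factor * sum(dists) / len(dists)
        kept += [p for p, d in zip(cluster, dists) if d <= cut]
    return kept

# Two compact intent clusters plus one stray embedding at (30, 0).
embeddings = [(0, 0), (10, 0), (0, 1), (1, 0), (1, 1),
              (10, 1), (11, 0), (11, 1), (30, 0)]
kept = drop_outliers(embeddings)
```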
APA, Harvard, Vancouver, ISO, and other styles
20

Bozic, Bojan, Andre Rios, and Sarah Jane Delany. "Comparing tagging suggestion models on discrete corpora." International Journal of Web Information Systems 16, no. 2 (2020): 201–21. http://dx.doi.org/10.1108/ijwis-08-2019-0035.

Full text
Abstract:
Purpose This paper aims to investigate the methods for the prediction of tags on a textual corpus that describes diverse data sets based on short messages; as an example, the authors demonstrate the usage of methods based on hotel staff inputs in a ticketing system as well as the publicly available StackOverflow corpus. The aim is to improve the tagging process and find the most suitable method for suggesting tags for a new text entry. Design/methodology/approach The paper consists of two parts: exploration of existing sample data, which includes statistical analysis and visualisation of the data to provide an overview, and evaluation of tag prediction approaches. The authors have included different approaches from different research fields to cover a broad spectrum of possible solutions. As a result, the authors have tested a machine learning model for multi-label classification (using gradient boosting), a statistical approach (using frequency heuristics) and three similarity-based classification approaches (nearest centroid, k-nearest neighbours (k-NN) and naive Bayes). The experiment that compares the approaches uses recall to measure the quality of results. Finally, the authors provide a recommendation of the modelling approach that produces the best accuracy in terms of tag prediction on the sample data. Findings The authors have calculated the performance of each method against the test data set by measuring recall. The authors show recall for each method with different features (except for frequency heuristics, which does not provide the option to add additional features) for the dmbook pro and StackOverflow data sets. k-NN clearly provides the best recall. As k-NN turned out to provide the best results, the authors have performed further experiments with values of k from 1–10. This helped us to observe the impact of the number of neighbours used on the performance and to identify the best value for k. 
Originality/value The value and originality of the paper are given by extensive experiments with several methods from different domains. The authors have used probabilistic methods, such as naive Bayes, statistical methods, such as frequency heuristics, and similarity approaches, such as k-NN. Furthermore, the authors have produced results on an industrial-scale data set that has been provided by a company and used directly in their project, as well as a community-based data set with a large amount of data and dimensionality. The study results can be used to select a model based on diverse corpora for a specific use case, taking into account advantages and disadvantages when applying the model to your data.
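The winning k-NN approach can be sketched as follows, using Jaccard similarity over word sets in place of the feature vectors used in the study; the example tickets and tags are invented.

```python
def suggest_tags(query, corpus, k=3):
    """Pool the tags of the k nearest neighbours of `query` under
    Jaccard similarity over word sets."""
    q = set(query.lower().split())

    def sim(text):
        d = set(text.lower().split())
        return len(q & d) / len(q | d) if q | d else 0.0

    nearest = sorted(corpus, key=lambda item: sim(item[0]), reverse=True)[:k]
    tags = set()
    for _, item_tags in nearest:
        tags |= set(item_tags)
    return tags

def recall(predicted, actual):
    # Fraction of the true tags that were suggested.
    return len(predicted & set(actual)) / len(actual)

# Invented ticket texts and tags in the spirit of the hotel data set.
corpus = [
    ("room heating broken", ["maintenance"]),
    ("wifi not working in lobby", ["it", "network"]),
    ("invoice wrong amount", ["billing"]),
]
```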
APA, Harvard, Vancouver, ISO, and other styles
21

Ilhami, Mirza. "Tren dan Peluang Cross-Platform Mobile App untuk Developer Pemula." KONSTELASI: Konvergensi Teknologi dan Sistem Informasi 1, no. 2 (2021): 402–11. http://dx.doi.org/10.24002/konstelasi.v1i2.4320.

Full text
Abstract:
The development of cross-platform mobile applications has grown rapidly in the last five years. From a developer's point of view, the many technologies that must be studied leave them confused about which programming language to master, because the choice correlates with the time required and with technology adoption by industry. From an industry point of view, the challenge is determining the right framework or programming language and tools for their application; this likewise correlates with development time and costs, as well as with finding the best talent for the technology. The purpose of this paper is to provide insight, trends, and perspectives to new programmers who want to start a career in industries related to cross-platform mobile application technologies, by looking at the results of surveys conducted by Stackoverflow, StateOfJS, and the Ionic Framework over the past 5 years. This will help new programmers and industry choose the right technology and framework. The author found that JavaScript currently dominates frontend, backend, and test tooling. Regarding cross-platform frameworks, Ionic Framework and React Native were found to be the most widely used.
APA, Harvard, Vancouver, ISO, and other styles
22

Mustafa, Sohaib, Wen Zhang, Muhammad Mateen Naveed, and Dur e. Adan. "Using social Cognitive theory to reengage dormant users in question and answer Communities: A case study of active StackOverflow participants." Electronic Commerce Research and Applications 68 (November 2024): 101450. http://dx.doi.org/10.1016/j.elerap.2024.101450.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Ceh, Simon M., and Mathias Benedek. "Where to Share? A Systematic Investigation of Creative Behavior on Online Platforms." Creativity. Theories – Research - Applications 8, no. 1 (2021): 108–23. http://dx.doi.org/10.2478/ctra-2021-0008.

Full text
Abstract:
Abstract Digitalization, underpinned by the ongoing pandemic, has transferred many of our everyday activities to online places. In this study, we wanted to find out what online outlets people use to share their creative work and why they do it. We found that most people posted creative work online at least a few times per year. They especially shared creative content related to creative cooking, visual art, and literature but hardly related to performing art. YouTube, Facebook, and Instagram were the three platforms with the highest familiarity and usage rates; among these, YouTube was most strongly used passively (i.e., to view creative content), while Instagram was most strongly used actively (i.e., to post one’s own creative content). We could further differentiate platforms that were domain-specific (e.g., Stackoverflow for scientific/technological creativity) from platforms that offer a broader variety of creative content (e.g., Reddit, Blogger). The reasoning behind posting one’s creative work online resembled a mixture of technological facilitation, alongside heightened accessibility that allows for feedback and bringing pleasure to one’s followers and friends. All in all, this study provides a first overview of where and why people share their creative products online, shedding light on timely forms of creative expression.
APA, Harvard, Vancouver, ISO, and other styles
24

Tang, Mingjing, Tong Li, Wei Wang, Rui Zhu, Zifei Ma, and Yahui Tang. "Software Knowledge Entity Relation Extraction with Entity-Aware and Syntactic Dependency Structure Information." Scientific Programming 2021 (December 22, 2021): 1–13. http://dx.doi.org/10.1155/2021/7466114.

Full text
Abstract:
The software knowledge community contains a large number of software knowledge entities with complex structure and rich semantic relations. Semantic relation extraction of software knowledge entities is a critical task for software knowledge graph construction, which has an important impact on knowledge-graph-based tasks such as software document generation and software expert recommendation. Due to the problems of entity sparsity, relation ambiguity, and the lack of annotated datasets in the user-generated content of software knowledge communities, it is difficult to apply existing relation extraction methods in the software knowledge domain. To address these issues, we propose a novel software knowledge entity relation extraction model which incorporates entity-aware information with syntactic dependency information. A Bidirectional Gated Recurrent Unit (Bi-GRU) and Graph Convolutional Networks (GCN) are used to learn the features of contextual semantic representation and syntactic dependency representation, respectively. To obtain more syntactic dependency information, a weighted graph convolutional network based on Newton's law of cooling is constructed by calculating a weight adjacency matrix. Specifically, an entity-aware attention mechanism is proposed to integrate the entity information and syntactic dependency information to improve the prediction performance of the model. Experiments conducted on a dataset constructed from StackOverflow texts show that the proposed model performs better than the benchmark models.
APA, Harvard, Vancouver, ISO, and other styles
25

Li, Peng, Yeye He, Cong Yan, Yue Wang, and Surajit Chaudhuri. "Auto-Tables: Synthesizing Multi-Step Transformations to Relationalize Tables without Using Examples." Proceedings of the VLDB Endowment 16, no. 11 (2023): 3391–403. http://dx.doi.org/10.14778/3611479.3611534.

Full text
Abstract:
Relational tables, where each row corresponds to an entity and each column corresponds to an attribute, have been the standard for tables in relational databases. However, such a standard cannot be taken for granted when dealing with tables "in the wild". Our survey of real spreadsheet-tables and web-tables shows that over 30% of such tables do not conform to the relational standard, for which complex table-restructuring transformations are needed before these tables can be queried easily using SQL-based tools. Unfortunately, the required transformations are non-trivial to program, which has become a substantial pain point for technical and non-technical users alike, as evidenced by large numbers of forum questions in places like StackOverflow and Excel/Tableau forums. We develop an Auto-Tables system that can automatically synthesize pipelines with multi-step transformations (in Python or other languages), to transform non-relational tables into standard relational forms for downstream analytics, obviating the need for users to manually program transformations. We compile an extensive benchmark for this new task, by collecting 244 real test cases from user spreadsheets and online forums. Our evaluation suggests that Auto-Tables can successfully synthesize transformations for over 70% of test cases at interactive speeds, without requiring any input from users, making this an effective tool for both technical and non-technical users to prepare data for analytics.
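One of the table-restructuring transformations in question, unpivoting a wide table into relational long form, can be hand-written as below. Auto-Tables' contribution is synthesizing such multi-step transformations automatically, which this sketch does not attempt; the column names and data are invented.

```python
def melt(rows, id_cols, var_name="variable", value_name="value"):
    """Unpivot a wide table (one column per year/metric) into long,
    relational form: one (id, variable, value) row per cell."""
    out = []
    for row in rows:
        for col, val in row.items():
            if col not in id_cols:
                out.append({**{c: row[c] for c in id_cols},
                            var_name: col, value_name: val})
    return out

# A non-relational "wide" table with one column per year.
wide = [
    {"country": "AT", "2021": 9.0, "2022": 9.1},
    {"country": "BE", "2021": 11.6, "2022": 11.7},
]
long_rows = melt(wide, id_cols=["country"], var_name="year",
                 value_name="population_m")
# long_rows[0] == {"country": "AT", "year": "2021", "population_m": 9.0}
```

After the transformation, the data can be queried with ordinary SQL-style grouping and filtering.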
APA, Harvard, Vancouver, ISO, and other styles
26

Ahmed, Majid Hameed, Sabrina Tiun, Nazlia Omar, and Nor Samsiah Sani. "A multi-view representation technique based on principal component analysis for enhanced short text clustering." PLOS ONE 19, no. 8 (2024): e0309206. http://dx.doi.org/10.1371/journal.pone.0309206.

Full text
Abstract:
Clustering texts is an essential task in data mining and information retrieval, whose aim is to group unlabeled texts into meaningful clusters that facilitate extracting and understanding useful information from large volumes of textual data. However, short text clustering (STC) is complex because short texts are typically sparse, ambiguous, noisy, and lacking in information. One of the challenges for STC is finding a proper representation of short text documents that yields cohesive clusters. Typically, however, STC relies on only a single-view representation for clustering. A single-view representation is inefficient for representing text due to its inability to capture different aspects of the target text. In this paper, we propose the most suitable multi-view representation (MVR), found by identifying the best combination of different single-view representations, to enhance STC. Our work explores different types of MVR based on different sets of single-view representation combinations. The single-view representations are combined by fixed-length concatenation via the Principal Component Analysis (PCA) technique. Three standard datasets (Twitter, Google News, and StackOverflow) are used to evaluate the performance of various MVRs on STC. Based on the experimental results, the most effective combination of single-view representations for STC was the 5-view MVR (a combination of BERT, GPT, TF-IDF, FastText, and GloVe). We conclude that MVR improves the performance of STC, although designing an MVR requires a careful selection of single-view representations.
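The fusion step can be pictured as concatenating each document's view vectors and then projecting the result to a fixed length with PCA. The sketch below (a pure-Python power-iteration PCA; the toy two-view vectors stand in for the paper's BERT/GPT/TF-IDF/FastText/GloVe embeddings, which are assumptions here) shows the idea, not the paper's implementation:

```python
import random

def pca_project(vectors, dim, iters=200, seed=0):
    """Project concatenated multi-view vectors onto their top `dim`
    principal components, using power iteration with deflation."""
    rnd = random.Random(seed)
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    X = [[v[j] - mean[j] for j in range(d)] for v in vectors]

    def matvec(w):  # X^T X w, without materializing the covariance matrix
        xw = [sum(x[j] * w[j] for j in range(d)) for x in X]
        return [sum(X[i][j] * xw[i] for i in range(n)) for j in range(d)]

    comps = []
    for _ in range(dim):
        w = [rnd.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            w = matvec(w)
            for c in comps:  # deflate: stay orthogonal to found components
                proj = sum(wi * ci for wi, ci in zip(w, c))
                w = [wi - proj * ci for wi, ci in zip(w, c)]
            norm = sum(wi * wi for wi in w) ** 0.5 or 1.0
            w = [wi / norm for wi in w]
        comps.append(w)
    return [[sum(x[j] * c[j] for j in range(d)) for c in comps] for x in X]

# two tiny "views" per document, concatenated into one fixed-length vector
docs = [[1.0, 0.0] + [0.9, 0.1], [0.9, 0.1] + [1.0, 0.0],
        [0.0, 1.0] + [0.1, 0.9], [0.1, 0.9] + [0.0, 1.0]]
reduced = pca_project(docs, dim=1)
```

The reduced vectors would then feed an ordinary clustering algorithm (e.g., k-means) in place of any single view.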
APA, Harvard, Vancouver, ISO, and other styles
27

Liao, Zhifang, Ningwei Wang, Shengzong Liu, Yan Zhang, Hui Liu, and Qi Zhang. "Identification-Method Research for Open-Source Software Ecosystems." Symmetry 11, no. 2 (2019): 182. http://dx.doi.org/10.3390/sym11020182.

Full text
Abstract:
In recent years, open-source software (OSS) development has grown, with many developers around the world working on different OSS projects. A variety of open-source software ecosystems have emerged, for instance, GitHub, StackOverflow, and SourceForge. One of the most typical social-programming and code-hosting sites, GitHub, has amassed numerous open-source-software projects and developers in the same virtual collaboration platform. Since GitHub itself is a large open-source community, it hosts a collection of software projects that are developed together and coevolve. The great challenge here is how to identify the relationship between these projects, i.e., project relevance. Software-ecosystem identification is the basis of other studies in the ecosystem. Therefore, how to extract useful information in GitHub and identify software ecosystems is particularly important, and it is also a research area in symmetry. In this paper, a Topic-based Project Knowledge Metrics Framework (TPKMF) is proposed. By collecting the multisource dataset of an open-source ecosystem, project-relevance analysis of the open-source software is carried out on the basis of software-ecosystem identification. Then, we used our Spectral Clustering algorithm based on Core Project (CP-SC) to identify software-ecosystem projects and further identify software ecosystems. We verified that most software ecosystems usually contain a core software project, and most other projects are associated with it. Furthermore, we analyzed the characteristics of the ecosystem, and we also found that interactive information has greater impact on project relevance. Finally, we summarize the Topic-based Project Knowledge Metrics Framework.
APA, Harvard, Vancouver, ISO, and other styles
28

Mohamad Johari, Mohd Ainal Farhan, and Adila Firdaus Arbain. "Transforming Workplace Efficiency: Using Agile Development Methodologies for Developing Meeting Management System." International Journal of Innovative Computing 14, no. 2 (2024): 117–27. http://dx.doi.org/10.11113/ijic.v14n2.505.

Full text
Abstract:
In today's digital era, individuals are seamlessly connected across various digital platforms, from social media giants like Facebook, Twitter, and Instagram to knowledge-sharing hubs like StackOverflow and Quora. The relentless advancement of digital technologies has ushered in a wave of unprecedented accessibility and connectivity. While this transformation has brought tremendous benefits to society, organizations face new and intricate challenges in adapting to evolving task preferences, performance expectations, and operational requirements. In response to this dynamic landscape, this paper proposes an innovative paradigm shift aimed at enhancing quality, efficiency, and time-to-completion for critical tasks. Central to this transformation is the concept of digitizing manual processes, which offers a transformative approach capable of significantly reducing task durations while ensuring heightened efficiency and minimizing associated risks. One illustrative example involves the automation of employee attendance tracking within a company, seamlessly integrating data into a centralized database. This digitized process not only expedites task completion but also facilitates the generation of detailed reports presented graphically, allowing human resource managers to make informed decisions swiftly. This is just one instance where digitalization outperforms traditional manual methods, leading us to conclude that digitalization is a pivotal strategy for boosting organizational productivity. Moreover, traditional practices in managing meetings, characterized by manual participant notifications, have presented substantial inefficiencies. Secretaries often bear the brunt of coordinating meeting schedules, making phone calls, and sending emails to ensure all involved parties possess accurate information. This paper embarks on a comprehensive investigation to ameliorate this situation by injecting a digitalization approach into the existing meeting management system.
APA, Harvard, Vancouver, ISO, and other styles
29

Kumar, Akshi, and Saurabh Raj Sangwan. "Expert Finding in Community Question-Answering for Post Recommendation." International Journal of Engineering & Technology 7, no. 3.4 (2018): 151. http://dx.doi.org/10.14419/ijet.v7i3.4.16764.

Full text
Abstract:
A community question answering system is a prime example of a platform where people participate to seek expertise on their topics of interest. But information overload, gauging the expertise level of users, and finding trustworthy answers remain key challenges within these communities. Moreover, people look for expert views rather than personal advice on such platforms; therefore, expert finding is an integral part of these communities. In order to trust the opinion of someone not personally known to the users of the community, it is necessary to establish that person's credibility. By determining the expertise levels of users, the authenticity of their posts can easily be determined. Also, by identifying experts, each expert can be shown relevant posts to engage with, so that he can use his knowledge and skills to give valid and correct answers. For users too, it becomes easy to find reliable answers once they know the expertise level of the answerers. Motivated by these facts, we put forward a framework for finding experts in an online question answering community (stackoverflow), referred to as the Expert Recommender System, which uses a well-recognized global trust metric, PageRank, to find experts in the community, building a trust-based system, and then uses collaborative filtering to find experts similar to a particular user based on their level of expertise and their topics of interest. Once we have the top-k experts similar to a given expert, that expert is recommended posts to collaborate on, based on the activities of his top-k neighbor experts. The framework is evaluated for its performance, and the results clearly indicate the effectiveness of the system.
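The trust-metric step can be sketched in a few lines: treat each answered question as an endorsement edge from asker to answerer and run PageRank over that graph. This is a minimal illustration of the general idea, with made-up users, not the paper's system:

```python
def pagerank(edges, damping=0.85, iters=50):
    """Rank Q&A users by PageRank over the 'answered' graph:
    an edge (asker, answerer) endorses the answerer."""
    nodes = {u for e in edges for u in e}
    out = {u: [] for u in nodes}
    for src, dst in edges:
        out[src].append(dst)
    rank = {u: 1.0 / len(nodes) for u in nodes}
    for _ in range(iters):
        nxt = {u: (1.0 - damping) / len(nodes) for u in nodes}
        for u in nodes:
            if out[u]:
                share = rank[u] / len(out[u])
                for v in out[u]:
                    nxt[v] += damping * share
            else:  # dangling node: spread its rank uniformly
                for v in nodes:
                    nxt[v] += damping * rank[u] / len(nodes)
        rank = nxt
    return rank

edges = [("alice", "expert"), ("bob", "expert"),
         ("carol", "expert"), ("alice", "bob")]
ranks = pagerank(edges)
top = max(ranks, key=ranks.get)   # the most-endorsed answerer surfaces first
```

The resulting scores would then seed the collaborative-filtering stage that matches similar experts by topic.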
APA, Harvard, Vancouver, ISO, and other styles
30

R., Jayashree, and Christy A. "Enhanced User-driven Ranking System with Splay Tree." TELKOMNIKA Telecommunication, Computing, Electronics and Control 16, no. 1 (2018): 432–44. https://doi.org/10.12928/TELKOMNIKA.v16i1.5875.

Full text
Abstract:
E-learning is one of the information and communication technology products used for the teaching and learning process [35]. An efficient and effective way to construct trust relationships among peer users in an e-learning environment is ranking. User-driven ranking systems are based only on the feedback or ratings provided by the users. In [46-48] the authors provide a variety of trust and reputation methods. Certified Belief in Strength (CBS) [45] is a novel trust measurement method based on reputation and strength. In [38] the author presents a recommendation system based on relevant feedback reviews to predict users' interests, which are ranked based on the recommendation history they provided previously. Users with higher ratings obtain higher reputation compared to lower-scored users. In question answering websites like StackOverflow, new or low-scored users are ignored by the community. This discourages them, and their involvement with the community declines further; as the power law states, such low-ranked users are pushed to the bottom of the ranking list. This condition can be avoided by encouraging less reputed users and preventing them from moving further down in the ranking. Thus, low-reputed users are given a few more chances to participate actively in the e-learning environment. A splay tree is a binary search tree with a self-balancing capability: it brings the most recently accessed item to the top of the tree, so active users are always at the top. A splay tree is used to represent users' ranks and to semi-splay low-ranked users back up in the tree, preventing them from drowning further in the ranking list. The focus of this research work is to find and promote low-reputed users in the reputation system by providing a few more chances to take part actively in the e-learning environment using the splay tree. Normalized discounted cumulative gain (NDCG) acts as the decision component for identifying drowning users.
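The core data-structure idea can be demonstrated with a textbook recursive splay: accessing a low-ranked user's key restructures the tree so that the user surfaces at the root. This sketch shows plain splaying (the paper's semi-splay variant only rotates partway up; the rank values below are invented):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def _rotate_right(x):
    y = x.left; x.left = y.right; y.right = x; return y

def _rotate_left(x):
    y = x.right; x.right = y.left; y.left = x; return y

def splay(root, key):
    """Bring `key` (or the last node on its search path) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:                      # zig-zig
            root.left.left = splay(root.left.left, key)
            if root.left.left:
                root = _rotate_right(root)
        elif key > root.left.key:                    # zig-zag
            root.left.right = splay(root.left.right, key)
            if root.left.right:
                root.left = _rotate_left(root.left)
        return _rotate_right(root) if root.left else root
    else:
        if key > root.right.key:                     # mirror zig-zig
            root.right.right = splay(root.right.right, key)
            if root.right.right:
                root = _rotate_left(root)
        elif key < root.right.key:                   # mirror zig-zag
            root.right.left = splay(root.right.left, key)
            if root.right.left:
                root.right = _rotate_right(root.right)
        return _rotate_left(root) if root.right else root

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

root = None
for rank in [50, 30, 70, 20, 80]:     # users keyed by rank score
    root = insert(root, rank)
root = splay(root, 20)                # low-ranked user accessed: now at the root
```

Because splaying preserves the binary-search-tree order, the overall ranking stays consistent while recently active users move toward the top.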
APA, Harvard, Vancouver, ISO, and other styles
31

Christoforaki, Maria, and Panagiotis Ipeirotis. "STEP: A Scalable Testing and Evaluation Platform." Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 2 (September 5, 2014): 41–49. http://dx.doi.org/10.1609/hcomp.v2i1.13159.

Full text
Abstract:
The emergence of online crowdsourcing sites, online work platforms, and even Massive Open Online Courses (MOOCs) has created an increasing need for reliably evaluating the skills of the participating users in a scalable way. Many platforms already allow users to take online tests and verify their skills, but the existing approaches face many problems. First of all, cheating is very common in online testing without supervision, as the test questions often "leak" and become easily available online together with the answers. Second, technical skills, such as programming, require the tests to be frequently updated in order to reflect the current state of the art. Third, there is very limited evaluation of the tests themselves and of how effectively they measure the skill that the users are tested for. In this paper, we present a Scalable Testing and Evaluation Platform (STEP) that allows continuous generation and evaluation of test questions. STEP leverages already available content on question answering sites such as StackOverflow and re-purposes these questions to generate tests. The system utilizes a crowdsourcing component for the editing of the questions, while it uses automated techniques for identifying promising Q&A threads that can be successfully re-purposed for testing. This continuous question generation decreases the impact of cheating and also creates questions that are closer to the real problems that the skill holder is expected to solve in real life. STEP also leverages Item Response Theory to evaluate the quality of the questions. We also use external signals about the quality of the workers. These identify the questions that have the strongest predictive ability in distinguishing workers who have the potential to succeed in online job marketplaces. In contrast, existing approaches use only internal consistency metrics to evaluate the questions.
Finally, our system employs an automatic "leakage detector" that queries the Internet to identify leaked versions of our questions. We then mark these questions as "practice only," effectively removing them from the pool of questions used for evaluation. Our experimental evaluation shows that our system generates questions of comparable or higher quality compared to existing tests, at a cost of approximately 3-5 dollars per question, which is lower than the cost of licensing questions from existing test banks.
APA, Harvard, Vancouver, ISO, and other styles
32

Hu, Jiawei, and Bo Yang. "Posts Quality Prediction for StackOverflow Website." IEEE Access, 2024, 1. http://dx.doi.org/10.1109/access.2024.3440879.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Manes, Saraj Singh, and Olga Baysal. "How Often and What StackOverflow Posts Do Developers Reference in Their GitHub Projects?" MSR 2019, March 22, 2019. https://doi.org/10.5281/zenodo.2604130.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Yang, Yi, Ying Li, Kai Chen, and Jinghua Liu. "Jeu de mots paronomasia: a StackOverflow-driven bug discovery approach." Cybersecurity 6, no. 1 (2023). http://dx.doi.org/10.1186/s42400-023-00153-0.

Full text
Abstract:
Locating bug code snippets (BugCode for short) has been a complex problem throughout the history of software security, mainly because the constraints that define BugCode are obscure and hard to summarize. Previously, security analysts attempted to define such constraints manually (e.g., limiting buffer size to detect overflow), but were limited in the types of BugCode covered. Recent research addresses this problem by extracting constraints from program documentation, which shows potential for detecting API misuse. But for bugs beyond the scope of API misuse, such an approach becomes less effective, since the corresponding constraints are not defined in documents, not to mention programs without documentation. In this paper, inspired by the fact that expert programmers often correct BugCode on open forums such as StackOverflow, we design an approach to automatically extract knowledge from StackOverflow and leverage it to detect BugCode. The content on StackOverflow comes from ordinary developers; their writing tends to be loosely organized and in various styles, which makes it more challenging to analyze than program documentation. To address these challenges, we design a custom tokenization approach to segment sentences and employ sentiment analysis to find the Controversial Sentences (CSs) that typically contain the constraints we need for code analysis. Then we use constituency parsing to extract knowledge from CSs, which helps locate BugCode. We evaluated our system on 41,144 comments from questions tagged with Java and Android. The results show that our approach achieves 95.5% precision in discovering CSs. We have discovered 276 pieces of BugCode proved to be true through manual validation, including one with an assigned CVE. 89.3% of the discovered bugs remain in the current versions of the answers and are unknown to users.
APA, Harvard, Vancouver, ISO, and other styles
35

GENÇ, Adile, Ayça YURTSEVEN, Hacer ÖZYURT, and Özcan ÖZYURT. "STACKOVERFLOW'DA "BIG DATA" İLE İLGİLİ GÖNDERİLERİN KONU MODELLEME VE BİRLİKTELİK ANALİZİ İLE ÖZELLİKLERİNİN ÇIKARILMASI." Eskişehir Osmangazi Üniversitesi Mühendislik ve Mimarlık Fakültesi Dergisi, December 19, 2023. http://dx.doi.org/10.31796/ogummf.1375611.

Full text
Abstract:
With the growth of internet use in today's technology, the emergence of the concept of "Big Data" has become inevitable. StackOverflow, which contributes to big data by hosting more than 23 million questions and nearly 35 million answers, offers shared information whose analysis can provide important insights into current topics and trends. Since it is not possible to manually analyze the discussions in this large and scattered StackOverflow dataset, methods for automatic analysis are needed. Topic modeling approaches were adopted to meet this need. In topic modeling studies, the Latent Dirichlet Allocation (LDA) method has been widely preferred and its success has been proven. In this study, the LDA method was used to semantically analyze questions tagged "Big Data" on the StackOverflow platform, together with their answers, and it was concluded that the most discussed topics about big data were machine learning/data science and memory management, at a rate of 16%. A separate dataset was created from the tags used in StackOverflow posts and association analysis was performed. The main purpose of this stage is to reveal hidden relationships using the Apriori algorithm. The results showed that, at the highest rate, the bigdata and hadoop tags were used together in 25 out of 100 questions. In addition, someone using the hive tag also uses the hadoop and bigdata tags with a probability of about 60%, and using hive increases the likelihood of those tags by a lift of 2.39.
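The support/confidence/lift measures behind such tag rules are easy to compute directly. The sketch below mines pairwise rules Apriori-style from toy tag sets (the posts are invented; the paper's corpus is "Big Data"-tagged StackOverflow questions):

```python
from itertools import combinations

def association_rules(transactions, min_support=0.2):
    """Mine pairwise tag rules with their support, confidence, and lift."""
    n = len(transactions)
    counts = {}
    for t in transactions:
        for size in (1, 2):
            for combo in combinations(sorted(t), size):
                counts[combo] = counts.get(combo, 0) + 1
    rules = []
    for (a, b), c in [(k, v) for k, v in counts.items() if len(k) == 2]:
        support = c / n
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):            # rule x -> y
            confidence = c / counts[(x,)]
            lift = confidence / (counts[(y,)] / n)
            rules.append((x, y, support, confidence, lift))
    return rules

posts = [{"bigdata", "hadoop"}, {"bigdata", "hadoop", "hive"},
         {"bigdata", "spark"}, {"hadoop", "hive"}, {"bigdata", "hadoop"}]
rules = association_rules(posts)
```

A lift above 1 (like the paper's 2.39 for hive → hadoop/bigdata) means the tags co-occur more often than independence would predict.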
APA, Harvard, Vancouver, ISO, and other styles
36

Bodke, Sneha, Ashwini Meher, and Kavita Shirsat. "Evaluating Answer Qualities on Q&A Community Sites (StackOverFlow)." SSRN Electronic Journal, 2019. http://dx.doi.org/10.2139/ssrn.3367706.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

GÜRCAN, Fatih, and Özcan ÖZYURT. "Stackoverflow gönderilerinde tartışılan trend konuların kelime frekans analizi ile belirlenmesi." Gümüşhane Üniversitesi Fen Bilimleri Enstitüsü Dergisi, February 9, 2021. http://dx.doi.org/10.17714/gumusfenbil.811123.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Ghadesi, Amin, Heng Li, and Maxime Lamothe. "What Causes Exceptions in Machine Learning Applications? Mining Machine Learning-Related Stack Traces on Stack Overflow." April 17, 2023. https://doi.org/10.5281/zenodo.7839032.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Tseng, Chun-Hsiung, and Jia-Rou Lin. "A semi-hierarchical clustering method for constructing knowledge trees from stackoverflow." Journal of Information Science, September 21, 2020, 016555152096103. http://dx.doi.org/10.1177/0165551520961035.

Full text
Abstract:
To help students learn how to programme, we have to give them a clear knowledge map and sufficient materials. Question-based websites, such as stackoverflow, are excellent information sources for this goal. However, for beginners, the process can be a little tricky, since they may not know how to ask correct questions if they do not have sufficient background knowledge, and a knowledge tree is usually considered more helpful in such a scenario. In this research, a method is proposed to infer a knowledge tree automatically from this type of website and to group documents based on the resulting knowledge tree. The proposed method mainly addresses two issues: first, the quality of tags cannot be guaranteed, and second, clustering-based methods usually generate a flat schema. The occurrence count and the co-occurrence ratio were used together to identify important tags. Then, an algorithm was developed to infer the hierarchical relationships between tags. Using these tags as centres, the clustering performance is better than that of applying k-means alone.
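One simple way to turn occurrence counts and co-occurrence ratios into a hierarchy, sketched below as an illustration (the 0.8 threshold and the toy posts are assumptions, not the paper's tuned values), is to file tag B under tag A whenever most B-tagged posts also carry A and A is the more frequent, hence more general, tag:

```python
from collections import Counter
from itertools import combinations

def infer_tag_tree(posts, ratio=0.8):
    """Infer parent -> children links between tags from co-occurrence."""
    count = Counter(tag for p in posts for tag in p)
    pair = Counter()
    for p in posts:
        for a, b in combinations(sorted(p), 2):
            pair[(a, b)] += 1
    tree = {}
    for (a, b), c in pair.items():
        general, specific = (a, b) if count[a] >= count[b] else (b, a)
        # child if the specific tag nearly always appears with the general one
        if c / count[specific] >= ratio and count[general] > count[specific]:
            tree.setdefault(general, []).append(specific)
    return tree

posts = [{"python", "pandas"}, {"python", "pandas"}, {"python", "pandas"},
         {"python", "flask"}, {"python"}]
tree = infer_tag_tree(posts)
```

The inferred tree then supplies cluster centres, which is how the proposed method avoids the flat schema that plain k-means produces.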
APA, Harvard, Vancouver, ISO, and other styles
40

Anwar, Zeeshan, Hammad Afzal, Ali Ahsan, Naima Iltaf, and Ayesha Maqbool. "A novel hybrid CNN-LSTM approach for assessing StackOverflow post quality." Journal of Intelligent Systems 32, no. 1 (2023). http://dx.doi.org/10.1515/jisys-2023-0057.

Full text
Abstract:
Maintaining content quality on social media Q&A platforms is pivotal for user attraction and retention. Automating post quality assessment offers benefits such as reduced moderator workload, amplified community impact, enhanced recognition of expert users, and greater weight for expert feedback. While existing approaches to post quality mainly employ binary classification, they often lack optimal feature selection. Our research introduces an automated system that categorizes features into textual, readability, format, and community dimensions. This system integrates 20 features belonging to the aforementioned categories with a hybrid convolutional neural network-long short-term memory (CNN-LSTM) deep learning model for multi-class classification. Evaluation against baseline models and state-of-the-art methods demonstrates our system's superiority, achieving a remarkable 21-23% accuracy enhancement. Furthermore, our system produced better results in terms of other metrics such as precision, recall, and F1 score.
APA, Harvard, Vancouver, ISO, and other styles
41

Firouzi, Ehsan, and Mohammad Ghafari. "Time to separate from StackOverflow and match with ChatGPT for encryption." Journal of Systems and Software, June 2024, 112135. http://dx.doi.org/10.1016/j.jss.2024.112135.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Michael Ayas, Hamdy, Philipp Leitner, and Regina Hebig. "An empirical study of the systemic and technical migration towards microservices." Empirical Software Engineering 28, no. 4 (2023). http://dx.doi.org/10.1007/s10664-023-10308-9.

Full text
Abstract:
Context: As many organizations modernize their software architecture and transition to the cloud, migrations towards microservices become more popular. Even though such migrations help to achieve organizational agility and effectiveness in software development, they are also highly complex, long-running, and multi-faceted. Objective: In this study we aim to comprehensively map the journey towards microservices and describe in detail what such a migration entails. In particular, we aim to discuss not only the technical migration, but also the long-term journey of change on a systemic level. Method: Our research method is an inductive, qualitative study of two data sources: interviews and discussions from StackOverflow. The analysis of both the 19 interviews and the 215 StackOverflow discussions is based on techniques from grounded theory. Results: Our results depict the migration journey as it materializes within the migrating organization, from structural changes to the specific technical changes that take place in the work of engineers. We provide an overview of how microservices migrations take place, as well as a deconstruction of high-level modes of change into specific solution outcomes. Our theory contains 2 modes of change taking place in migration iterations, 14 activities, and 53 solution outcomes for engineers. One of our findings concerns architectural change, which is iterative and needs both a long- and a short-term perspective, including both business and technical understanding. In addition, we found that a large proportion of the technical migration has to do with setting up supporting artifacts and changing the paradigm in which software is developed.
APA, Harvard, Vancouver, ISO, and other styles
43

"IMPACT OF USERS' MOTIVATION ON GAMIFIED CROWDSOURCING SYSTEMS: A CASE OF STACKOVERFLOW." Issues In Information Systems, 2018. http://dx.doi.org/10.48009/2_iis_2018_33-40.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Mustafa, Sohaib, Wen Zhang, and Muhammad Mateen Naveed. "What motivates online community contributors to contribute consistently? A case study on Stackoverflow netizens." Current Psychology, June 25, 2022. http://dx.doi.org/10.1007/s12144-022-03307-4.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

van der Linden, Dirk, Emma Williams, Joseph Hallett, and Awais Rashid. "The impact of surface features on choice of (in)secure answers by Stackoverflow readers." IEEE Transactions on Software Engineering, 2020, 1. http://dx.doi.org/10.1109/tse.2020.2981317.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Ahmed, Tanveer, and Abhishek Srivastava. "Understanding and evaluating the behavior of technical users. A study of developer interaction at StackOverflow." Human-centric Computing and Information Sciences 7, no. 1 (2017). http://dx.doi.org/10.1186/s13673-017-0091-8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Soto de la Cruz, Ramon, Felix Agustín Castro-Espinoza, and Liz Soto. "Isodata-Based Method for Clustering Surveys Responses with Mixed Data: The 2021 StackOverflow Developer Survey." Computación y Sistemas 27, no. 1 (2023). http://dx.doi.org/10.13053/cys-27-1-4539.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Falci, Samuel Henrique, Fabiano Azevedo Dorça, Alessandro Vivas Andrade, and Daniel Henrique Mourão Falci. "A low complexity heuristic to solve a learning objects recommendation problem." Smart Learning Environments 7, no. 1 (2020). http://dx.doi.org/10.1186/s40561-020-00133-8.

Full text
Abstract:
The recommendation of learning objects in virtual learning environments has become a focus of research to improve the online learning experience. Several approaches have been presented in an attempt to model the individual characteristics of students and offer learning objects that best suit their particularities. Most of them, though, are impractical in real-world scenarios due to their high computational cost, as a huge number of repositories offering learning objects, such as YouTube, Wikipedia, Stackoverflow, GitHub, discussion forums, and social networks, are available, and each holds a large number of learning objects that can be retrieved. In this work, we propose a low-complexity heuristic to solve this problem, comparing it to a classical mixed-integer linear programming model and a classical genetic algorithm on datasets containing from 2,000 to 1,360,000 learning objects. Performance and optimality were analyzed. The results showed that the proposed technique was only slightly suboptimal, while its computational cost was considerably smaller than that of the linear optimization approach.
APA, Harvard, Vancouver, ISO, and other styles
49

Mustafa, Sohaib, Wen Zhang, and Muhammad Mateen Naveed. "How to mend the dormant user in Q&A communities? A social cognitive theory-based study of consistent geeks of StackOverflow." Behaviour & Information Technology, July 31, 2023, 1–20. http://dx.doi.org/10.1080/0144929x.2023.2237604.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Mak, Yen-Wei, Hui-Ngo Goh, and Amy Hui-Lan Lim. "Forum Text Processing and Summarization." JOIV : International Journal on Informatics Visualization 8, no. 1 (2024). http://dx.doi.org/10.62527/joiv.8.1.2279.

Full text
Abstract:
Frequently Asked Questions (FAQs) are extensively studied in general domains like the medical field, but such frameworks are lacking in domains such as software engineering and open-source communities. This research aims to bridge this gap by establishing the foundations of an automated FAQ generation and retrieval framework specifically tailored to the software engineering domain. The framework involves analysis, ranking, sentiment analysis, and summarization techniques applied to open forums like StackOverflow and GitHub issues. A corpus of Stack Overflow post data is collected to evaluate the proposed framework and the selected models. Integrating state-of-the-art string-matching, sentiment analysis, and summarization models with the proprietary ranking formula proposed in this paper forms a robust automatic FAQ generation and retrieval framework to facilitate developers' work. The string-matching, sentiment analysis, and summarization models are evaluated, achieving F1 scores of 71.31%, 74.90%, and 53.4%, respectively. Given the subjective nature of evaluations in this context, a human review is used to further validate the effectiveness of the overall framework, with assessments made on relevancy, preferred ranking, and preferred summarization. Future work includes improving the summarization models by incorporating text classification and summarizing classes individually (Kou et al., 2023), as well as proposing feedback-loop systems based on human reinforcement learning. Furthermore, efforts will be made to optimize the framework by utilizing knowledge graphs for dimension reduction, enabling it to handle larger corpora effectively.
APA, Harvard, Vancouver, ISO, and other styles