
Journal articles on the topic 'Document Object Model DOM'



Consult the top 50 journal articles for your research on the topic 'Document Object Model DOM.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Role, François, and Philippe Verdret. "Le Document Object Model (DOM)." Cahiers GUTenberg, no. 33-34 (1999): 155–71. http://dx.doi.org/10.5802/cg.265.

2

Wang, Yanlong, and Jinhua Liu. "Object-oriented Design based Comprehensive Experimental Development of Document Object Model." Advances in Engineering Technology Research 3, no. 1 (2022): 390. http://dx.doi.org/10.56028/aetr.3.1.390.

Abstract:
JavaScript code that uses the Document Object Model (DOM) can control Web pages dynamically, which is an important part of Web development courses. The application of the DOM is very flexible and covers many knowledge points, so it is difficult for students to master. To help students understand each knowledge point and improve their engineering ability to solve practical problems, a comprehensive DOM experiment project, designed in the style of a blind box, is designed and implemented. This experimental project integrates knowledge points such as DOM events, DOM operations, and communication between objects. Practice has shown that running and debugging the project helps students understand and master the relevant knowledge points.
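For readers unfamiliar with the building blocks this abstract refers to, the following TypeScript sketch combines DOM events, DOM operations and communication between objects in a browser page. It is purely illustrative and not taken from the cited project; the #add-item element id is a hypothetical assumption.

```typescript
// Minimal sketch (not from the cited paper): DOM operations, DOM events,
// and inter-object communication via a custom event.
const list = document.createElement("ul");
document.body.appendChild(list);

const addButton = document.querySelector<HTMLButtonElement>("#add-item"); // hypothetical id
addButton?.addEventListener("click", () => {
  const item = document.createElement("li");            // DOM operation: create node
  item.textContent = `Item ${list.children.length + 1}`;
  list.appendChild(item);                                // DOM operation: insert node
  // Communication between objects: notify other components via a custom event.
  document.dispatchEvent(new CustomEvent("item-added", { detail: item.textContent }));
});

document.addEventListener("item-added", (e) => {
  console.log("Another component observed:", (e as CustomEvent<string>).detail);
});
```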
3

Radilova, Martina, Patrik Kamencay, Robert Hudec, Miroslav Benco, and Roman Radil. "Tool for Parsing Important Data from Web Pages." Applied Sciences 12, no. 23 (2022): 12031. http://dx.doi.org/10.3390/app122312031.

Abstract:
This paper discusses a tool for main text and image extraction (extracting and parsing the important data) from a web document. It describes our proposed algorithm based on the Document Object Model (DOM), natural language processing (NLP) techniques, and other approaches for extracting information from web pages using various classification techniques such as support vector machines, decision trees, naive Bayes, and K-nearest neighbor. The main aim of the developed algorithm was to identify and extract the main block of a web document that contains the text of the article and the relevant images. The algorithm was applied to a sample of 45 web documents of different types. In addition, the structure of web pages and the use of the Document Object Model (DOM) for their processing were analyzed. The Document Object Model was used to load and navigate the document, and it also plays an important role in the correct identification of the main block of web documents. The paper also discusses the levels of natural language; these methods of automatic natural language processing help to identify the main block of the web document. In this way, all textual parts and images belonging to the main content of the web document were extracted. The experimental results show that our method achieved a final classification accuracy of 88.18%.
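As background to the kind of DOM analysis described above, here is a hedged TypeScript sketch that ranks candidate blocks by a simple text-density score, a common first step before handing block features to a classifier. It is not the authors' algorithm; the candidate selector and scoring formula are assumptions.

```typescript
// Illustrative sketch: score candidate DOM blocks so that blocks with much
// plain text and few tags/links float to the top as "main content" candidates.
function textDensity(el: Element): number {
  const text = (el.textContent ?? "").trim().length;
  const linkText = Array.from(el.querySelectorAll("a"))
    .reduce((n, a) => n + (a.textContent ?? "").length, 0);
  const tags = el.getElementsByTagName("*").length || 1;
  return (text - linkText) / tags;
}

function mainBlockCandidate(doc: Document): Element | null {
  const candidates = Array.from(doc.querySelectorAll("article, section, div")); // assumed selector
  return candidates.sort((a, b) => textDensity(b) - textDensity(a))[0] ?? null;
}
```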
4

Ahmad Sabri, Ily Amalina, and Mustafa Man. "Improving Performance of DOM in Semi-structured Data Extraction Using WEIDJ Model." Indonesian Journal of Electrical Engineering and Computer Science 9, no. 3 (2018): 752–63. https://doi.org/10.11591/ijeecs.v9.i3.pp752-763.

Abstract:
Web data extraction is the process of extracting user-required information from web pages. The information consists of semi-structured data rather than data in a structured format, and the extraction involves web documents in HTML format. Nowadays, most people use web data extractors because the large amount of information involved makes manual extraction time-consuming and complicated. In this paper we present the WEIDJ approach for extracting images from the web, whose goal is to harvest images as objects from template-based HTML pages. WEIDJ (Web Extraction of Image using DOM (Document Object Model) and JSON (JavaScript Object Notation)) applies DOM theory to build the structure and uses JSON as the programming environment. The extraction process takes as input both the web address and the extraction structure. WEIDJ then splits the DOM tree into small subtrees and applies a visual-block search algorithm to each web page to find images. Our approach focuses on three levels of extraction: a single web page, multiple web pages, and the whole web page. Extensive experiments on several biodiversity web pages compare the time performance of image extraction using DOM, JSON and WEIDJ for a single web page. The experimental results show that, with our model, WEIDJ image extraction can be done quickly and effectively.
5

Ahmad Sabri, Ily Amalina, and Mustafa Man. "Improving Performance of DOM in Semi-structured Data Extraction using WEIDJ Model." Indonesian Journal of Electrical Engineering and Computer Science 9, no. 3 (2018): 752. http://dx.doi.org/10.11591/ijeecs.v9.i3.pp752-763.

Abstract:
Web data extraction is the process of extracting user-required information from web pages. The information consists of semi-structured data rather than data in a structured format, and the extraction involves web documents in HTML format. Nowadays, most people use web data extractors because the large amount of information involved makes manual extraction time-consuming and complicated. In this paper we present the WEIDJ approach for extracting images from the web, whose goal is to harvest images as objects from template-based HTML pages. WEIDJ (Web Extraction of Image using DOM (Document Object Model) and JSON (JavaScript Object Notation)) applies DOM theory to build the structure and uses JSON as the programming environment. The extraction process takes as input both the web address and the extraction structure. WEIDJ then splits the DOM tree into small subtrees and applies a visual-block search algorithm to each web page to find images. Our approach focuses on three levels of extraction: a single web page, multiple web pages, and the whole web page. Extensive experiments on several biodiversity web pages compare the time performance of image extraction using DOM, JSON and WEIDJ for a single web page. The experimental results show that, with our model, WEIDJ image extraction can be done quickly and effectively.
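The two entries above describe the same WEIDJ idea of harvesting images from the DOM into JSON. The TypeScript sketch below illustrates that general DOM-plus-JSON pattern only; it is not the WEIDJ implementation, and the ImageRecord shape and blockPath helper are assumptions.

```typescript
// Hedged sketch: walk the DOM of a template-based page and collect image
// metadata as JSON-serializable objects.
interface ImageRecord {
  src: string;
  alt: string;
  blockPath: string; // tag path of the enclosing block, for traceability
}

function harvestImages(doc: Document): ImageRecord[] {
  const path = (el: Element): string => {
    const parts: string[] = [];
    for (let e: Element | null = el; e; e = e.parentElement) parts.unshift(e.tagName.toLowerCase());
    return parts.join(" > ");
  };
  return Array.from(doc.images).map((img) => ({
    src: img.src,
    alt: img.alt,
    blockPath: path(img),
  }));
}

// Usage: JSON.stringify(harvestImages(document), null, 2)
```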
6

Li, Zimeng, Bo Shao, Linjun Shou, Ming Gong, Gen Li, and Daxin Jiang. "WIERT: Web Information Extraction via Render Tree." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 11 (2023): 13166–73. http://dx.doi.org/10.1609/aaai.v37i11.26546.

Abstract:
Web information extraction (WIE) is a fundamental problem in web document understanding, with a significant impact on various applications. Visual information plays a crucial role in WIE tasks as the nodes containing relevant information are often visually distinct, such as being in a larger font size or having a brighter color, from the other nodes. However, rendering visual information of a web page can be computationally expensive. Previous works have mainly focused on the Document Object Model (DOM) tree, which lacks visual information. To efficiently exploit visual information, we propose leveraging the render tree, which combines the DOM tree and Cascading Style Sheets Object Model (CSSOM) tree, and contains not only content and layout information but also rich visual information at a little additional acquisition cost compared to the DOM tree. In this paper, we present WIERT, a method that effectively utilizes the render tree of a web page based on a pretrained language model. We evaluate WIERT on the Klarna product page dataset, a manually labeled dataset of renderable e-commerce web pages, demonstrating its effectiveness and robustness.
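To make the render-tree idea above concrete, the hedged TypeScript sketch below pairs each DOM node with a few computed-style properties, approximating the combination of content and visual information the abstract mentions. It is not WIERT's feature extractor; the chosen style properties are assumptions.

```typescript
// Illustrative sketch: DOM content plus CSSOM-derived visual features.
interface RenderNode {
  tag: string;
  text: string;
  fontSize: string;
  color: string;
  display: string;
}

function renderFeatures(root: Element): RenderNode[] {
  return Array.from(root.querySelectorAll<HTMLElement>("*")).map((el) => {
    const style = window.getComputedStyle(el); // CSSOM view of the node
    return {
      tag: el.tagName.toLowerCase(),
      text: (el.textContent ?? "").slice(0, 80),
      fontSize: style.fontSize,
      color: style.color,
      display: style.display,
    };
  });
}
```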
7

Sankari, S., and S. Bose. "Efficient Identification of Structural Relationships for XML Queries using Secure Labeling Schemes." International Journal of Intelligent Information Technologies 12, no. 4 (2016): 63–80. http://dx.doi.org/10.4018/ijiit.2016100104.

Abstract:
XML has emerged as a de facto standard for data representation and information exchange over the World Wide Web. By utilizing the document object model (DOM), an XML document can be viewed as an XML DOM tree. The nodes of an XML tree are labeled, according to a labeling scheme, so that every node is uniquely identified. This paper proposes a method to efficiently identify two structural relationships, namely document order (DO) and the sibling relationship, that exist between XML nodes using two secure labeling schemes, specifically enhanced Dewey coding (EDC) and secure Dewey coding (SDC). These structural relationships influence the performance of XML queries, so they need to be identified in efficient time. This paper implements the method to identify DO and sibling relationships using EDC and SDC labels for various real-time XML documents. Experimental results show that identifying DO and sibling relationships using SDC labels performs better than using EDC labels for processing XML queries.
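For orientation, the following TypeScript sketch assigns plain Dewey-style labels to DOM/XML nodes and uses them to test document order and the sibling relationship. It is only a baseline illustration, not the EDC or SDC schemes from the paper.

```typescript
// Plain Dewey labeling: each node's label is the list of 1-based child
// positions on the path from the root to the node.
function deweyLabel(node: Element): number[] {
  const label: number[] = [];
  for (let n: Element | null = node; n && n.parentElement; n = n.parentElement) {
    label.unshift(Array.from(n.parentElement.children).indexOf(n) + 1);
  }
  return label; // e.g. [1, 3, 2] = second child of third child of first child
}

function precedesInDocumentOrder(a: number[], b: number[]): boolean {
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] !== b[i]) return a[i] < b[i];
  }
  return a.length < b.length; // an ancestor precedes its descendants
}

function areSiblings(a: number[], b: number[]): boolean {
  return a.length === b.length &&
    a.slice(0, -1).every((v, i) => v === b[i]); // same parent prefix
}
```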
8

Ahmad Sabri, Ily Amalina, and Mustafa Man. "A performance of comparative study for semi-structured web data extraction model." International Journal of Electrical and Computer Engineering (IJECE) 9, no. 6 (2019): 5463–70. https://doi.org/10.11591/ijece.v9i6.pp5463-5470.

Abstract:
The extraction of information from multiple web sources is an essential yet complicated step for data analysis in many domains. In this paper, we present a data extraction model based on visual segmentation, the DOM tree and a JSON approach, known as Wrapper Extraction of Image using DOM and JSON (WEIDJ), for extracting semi-structured data from biodiversity websites. The large amount of image information from multiple web sources is extracted using three different approaches: the Document Object Model (DOM), Wrapper of images using Hybrid DOM and JSON (WHDJ) and Wrapper Extraction of Image using DOM and JSON (WEIDJ). Experiments were conducted on several biodiversity websites. The experimental results show that the WEIDJ approach yields promising results with respect to extraction time. The WEIDJ wrapper successfully extracted more than 100 images from multi-source biodiversity web data spanning over 15 different websites.
9

Feng, Jian, Ying Zhang, and Yuqiang Qiao. "A Detection Method for Phishing Web Page Using DOM-Based Doc2Vec Model." Journal of Computing and Information Technology 28, no. 1 (2020): 19–31. http://dx.doi.org/10.20532/cit.2020.1004899.

Abstract:
Detecting phishing web pages is a challenging task. Existing DOM (Document Object Model)-based detection methods for phishing web pages mainly aim at obtaining structural characteristics but ignore the overall representation of web pages and the semantic information that HTML tags may carry. This paper treats DOMs as a natural language with the Doc2Vec model and learns their structural semantics automatically to detect phishing web pages. First, the DOM structure of the obtained web page is parsed to construct the DOM tree; then the Doc2Vec model is used to vectorize the DOM tree and to measure the semantic similarity of web pages by the distance between different DOM vectors. Finally, hierarchical clustering is used to cluster the web pages. Experiments show that the proposed method achieves higher recall and precision for phishing classification than a DOM-based structural clustering method and a TF-IDF-based semantic clustering method. The results show that using Paragraph Vector on the DOM in a linguistic fashion is effective.
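As a rough illustration of treating a DOM as a "document", the TypeScript sketch below linearizes a DOM tree into a sequence of tag tokens that an externally trained Doc2Vec/Paragraph Vector model could consume. It is not the authors' pipeline; the token format is an assumption.

```typescript
// Linearize a DOM tree into tag tokens; the embedding model itself would be
// trained elsewhere (e.g. in a separate Python step).
function domToTokens(node: Element, depth = 0, out: string[] = []): string[] {
  out.push(`${node.tagName.toLowerCase()}@${depth}`); // token encodes tag and depth
  for (const child of Array.from(node.children)) {
    domToTokens(child, depth + 1, out);
  }
  return out;
}

// Example: domToTokens(document.documentElement).join(" ")
// might yield "html@0 head@1 title@2 body@1 div@2 a@3 ..."
```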
10

Sabri, Ily Amalina Ahmad, and Mustafa Man. "A performance of comparative study for semi-structured web data extraction model." International Journal of Electrical and Computer Engineering (IJECE) 9, no. 6 (2019): 5463. http://dx.doi.org/10.11591/ijece.v9i6.pp5463-5470.

Abstract:
The extraction of information from multiple web sources is an essential yet complicated step for data analysis in many domains. In this paper, we present a data extraction model based on visual segmentation, the DOM tree and a JSON approach, known as Wrapper Extraction of Image using DOM and JSON (WEIDJ), for extracting semi-structured data from biodiversity websites. The large amount of image information from multiple web sources is extracted using three different approaches: the Document Object Model (DOM), Wrapper of images using Hybrid DOM and JSON (WHDJ) and Wrapper Extraction of Image using DOM and JSON (WEIDJ). Experiments were conducted on several biodiversity websites. The experimental results show that the WEIDJ approach yields promising results with respect to extraction time. The WEIDJ wrapper successfully extracted more than 100 images from multi-source biodiversity web data spanning over 15 different websites.
11

Atanasov, Valentin. "AN APPROACH OF DOM RECORD MANAGEMENT COMPONENT UI CREATION." Journal Scientific and Applied Research 27, no. 1 (2024): 97–104. http://dx.doi.org/10.46687/jsar.v27i1.410.

Abstract:
This paper presents an approach to creating a user interface (UI) for a Document Object Model database record management component. Mainstream UI development approaches concentrate on a few directions and omit a holistic view of the fundamental database operations. The proposed solution relies on standard constructs adopted in ECMAScript edition six and above. The program logic for the UI component was built with the intention that the existing functionality of such a UI component can be extended and complemented.
12

Ahmad Sabri, Ily Amalina, and Mustafa Man. "A deep web data extraction model for web mining: a review." Indonesian Journal of Electrical Engineering and Computer Science 23, no. 1 (2021): 519. http://dx.doi.org/10.11591/ijeecs.v23.i1.pp519-528.

Abstract:
The World Wide Web has become a large pool of information. Extracting structured data from published web pages has drawn attention in the last decade. The process of web data extraction (WDE) faces many challenges, due to the variety of web data and the unstructured data in hypertext markup language (HTML) files. The aim of this paper is to provide a comprehensive overview of current web data extraction techniques in terms of the quality of the extracted data. The paper focuses on data extraction using wrapper approaches and compares them to identify the best approach for extracting data from online sites. To observe the efficiency of the proposed model, we compare the performance of single-web-page data extraction across different models: the document object model (DOM), a wrapper using hybrid DOM and JSON (WHDJ), wrapper extraction of images using DOM and JSON (WEIDJ) and WEIDJ (no rules). Finally, the experiments showed that WEIDJ extracts data the fastest and with the lowest time consumption compared to the other methods.
13

Sabri, Ily Amalina Ahmad, and Mustafa Man. "A deep web data extraction model for web mining: a review." Indonesian Journal of Electrical Engineering and Computer Science 23, no. 1 (2021): 519–28. https://doi.org/10.11591/ijeecs.v23.i1.pp519-528.

Abstract:
The World Wide Web has become a large pool of information. Extracting structured data from published web pages has drawn attention in the last decade. The process of web data extraction (WDE) faces many challenges, due to the variety of web data and the unstructured data in hypertext markup language (HTML) files. The aim of this paper is to provide a comprehensive overview of current web data extraction techniques in terms of the quality of the extracted data. The paper focuses on data extraction using wrapper approaches and compares them to identify the best approach for extracting data from online sites. To observe the efficiency of the proposed model, we compare the performance of single-web-page data extraction across different models: the document object model (DOM), a wrapper using hybrid DOM and JSON (WHDJ), wrapper extraction of images using DOM and JSON (WEIDJ) and WEIDJ (no rules). Finally, the experiments showed that WEIDJ extracts data the fastest and with the lowest time consumption compared to the other methods.
14

Ahmad Sabri, Ily Amalina, and Mustafa Man. "WEIDJ: Development of a new algorithm for semi-structured web data extraction." TELKOMNIKA Telecommunication, Computing, Electronics and Control 19, no. 1 (2021): 317–26. https://doi.org/10.12928/TELKOMNIKA.v19i1.16205.

Abstract:
In the era of industrial digitalization, people are increasingly investing in solutions that support their processes for data collection, data analysis and performance improvement. In this paper, advancing web-scale knowledge extraction and alignment by integrating a few sources and exploring different methods of aggregation and attention is considered, with a focus on image information. The main aim of data extraction from semi-structured data is to retrieve beneficial information from the web. Data from the web, also known as the deep web, is retrievable but requires requests through form submission because it cannot be reached by search engines. As HTML documents grow larger, the process of data extraction has been plagued by lengthy processing times. In this research work, we propose an improved model, namely wrapper extraction of images using the document object model (DOM) and JavaScript object notation (JSON), called WEIDJ, in response to the promising results of mining a higher volume of images across various formats. To observe the efficiency of WEIDJ, we compare the performance of data extraction at different levels of page extraction with VIBS, MDR, DEPTA and VIDE. It yielded the best results, with a Precision of 100, a Recall of 97.93103 and an F-measure of 98.9547.
15

Bondarenko, Olesya Sergeevna. "Analysis of DOM update methods in modern web frameworks: Virtual DOM and Incremental DOM." Программные системы и вычислительные методы, no. 2 (February 2025): 35–43. https://doi.org/10.7256/2454-0714.2025.2.74172.

Abstract:
The article presents an analysis of modern methods for updating the Document Object Model (DOM) structure in popular client-side web frameworks such as Angular, React and Vue. The main focus is on comparing the concepts of Virtual DOM and Incremental DOM, which underlie the architectural solutions of the respective frameworks. The Virtual DOM used in React and Vue operates on a virtual tree and compares its versions in order to identify differences and minimize changes to the real DOM. This approach provides a relatively simple implementation of a reactive interface but comes with additional computation and resource costs. In contrast, Angular uses an Incremental DOM, which does not create intermediate structures: changes are applied directly through the Change Detection mechanism. This approach achieves high performance through targeted updates of DOM elements without the need for a virtual representation. The study uses a comparative analysis of architectural approaches to updating the DOM, based on the official documentation, practical experiments with code and visualization of rendering processes in Angular and React. The methodology includes a theoretical justification, a step-by-step analysis of the update mechanisms and an assessment of their impact on performance. The scientific novelty of the article lies in the systematic comparison of architectural approaches to updating the DOM in the leading frameworks, with an emphasis on the implementation of the signal model in Angular version 17+. The impact of using signals on the abandonment of the Zone.js library, the formation of a more predictable, deterministic rendering model, and lower-level performance management capabilities is analyzed in detail. The article contains not only a theoretical description but also practical examples that reveal the behaviour of updates in real-world scenarios. The nuances of template compilation and the operation of the effect() and computed() functions are also considered. The comparison of Virtual DOM and Incremental DOM makes it possible to identify key differences, evaluate the applicability of each approach depending on the tasks and complexity of the project, and suggest ways to optimize frontend architecture.
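To ground the Virtual DOM concept compared above, here is a toy TypeScript diff-and-patch sketch. It is not React's or Vue's reconciliation algorithm; the VNode shape is a deliberate simplification.

```typescript
// Toy Virtual DOM: diff two lightweight virtual nodes and touch the real DOM
// only where they differ.
interface VNode { tag: string; text: string; }

function patch(parent: HTMLElement, el: HTMLElement | null, oldV: VNode | null, newV: VNode): HTMLElement {
  if (!el || !oldV || oldV.tag !== newV.tag) {
    // Node type changed (or first render): replace the whole element.
    const fresh = document.createElement(newV.tag);
    fresh.textContent = newV.text;
    if (el) parent.replaceChild(fresh, el);
    else parent.appendChild(fresh);
    return fresh;
  }
  if (oldV.text !== newV.text) el.textContent = newV.text; // targeted update only
  return el;
}

// Usage: keep the previous VNode, build a new one on each state change, and
// call patch() so the real DOM changes only where the virtual trees differ.
```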
16

Liu, Shuai, Ling Li Zhao, and Jun Sheng Li. "A Kind of Integrated Model for Panorama, Terrain and 3D Data Based on GML." Advanced Materials Research 955-959 (June 2014): 3850–53. http://dx.doi.org/10.4028/www.scientific.net/amr.955-959.3850.

Abstract:
A panorama image can provide a 360-degree view from one hotspot, which addresses the problems of traditional three-dimensional representation: inadequate authenticity, difficult data acquisition, and laborious, time-consuming modeling. However, other geographic information is also needed. We therefore propose an integrated model based on GML that contains a set of data structures for rapidly obtaining panorama, terrain and 3D data from a GML file, after analyzing the GML file structure and parsing it with the Document Object Model (DOM). The experiment shows that the integrated model is validated in a web application using PTViewer, Java 3D and related Web technologies.
17

Dudak, A. "Memory Leaks in Spa: Prevention, Detection, and Remediation Methods." Bulletin of Science and Practice 10, no. 12 (2024): 161–66. https://doi.org/10.33619/2414-2948/109/22.

Abstract:
This article addresses the issue of memory leaks in modern single-page applications (SPAs). By investigating the causes of leaks associated with dynamic content updates, active interaction with the document object model (DOM) interface, and asynchronous operations, developers gain insights into avoiding the excessive accumulation of unused objects in memory. The article discusses methods for preventing and addressing leaks, including the use of weak references, component state management, and optimizing asynchronous requests. It also emphasizes the importance of using monitoring tools, such as Chrome DevTools, and integrating automated testing into the continuous integration (CI) and continuous delivery (CD) process. The article offers a comprehensive approach for efficient memory management and preventing performance issues in SPA applications.
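The leak-avoidance patterns the abstract mentions, such as weak references and listener cleanup tied to component state, can be sketched in TypeScript as follows. This is an illustrative pattern, not code from the article; the Widget class and its lifecycle are assumptions.

```typescript
// Tie event listeners and per-node caches to a component lifecycle so that
// detached DOM nodes can be garbage-collected instead of leaking.
class Widget {
  private controller = new AbortController();
  private cache = new WeakMap<Element, string>(); // weak: does not pin removed nodes

  mount(root: HTMLElement): void {
    root.addEventListener("click", (e) => this.onClick(e), {
      signal: this.controller.signal, // all listeners removed together on destroy()
    });
  }

  private onClick(e: Event): void {
    const target = e.target as Element;
    this.cache.set(target, new Date().toISOString());
  }

  destroy(): void {
    this.controller.abort(); // detach listeners; WeakMap entries fade with their nodes
  }
}
```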
18

Ran, Peipei, Wenjie Yang, Zhongyue Da, and Yuke Huo. "Work orders management based on XML file in printing." ITM Web of Conferences 17 (2018): 03009. http://dx.doi.org/10.1051/itmconf/20181703009.

Abstract:
Extensible Markup Language (XML) technology is increasingly used in various fields; using it to express work-order information can improve the efficiency of management and production. Based on these features, this paper introduces a management technology for work orders and generates an XML file through Document Object Model (DOM) technology. When the information is needed for production, the XML file is parsed and the information is saved in a database, which makes it easier to preserve and modify the information.
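As a minimal illustration of handling work-order data through the DOM, the TypeScript sketch below builds, parses and re-serializes a small XML document. It is not the paper's implementation; the workOrder element names are hypothetical.

```typescript
// Parse a (hypothetical) work-order XML document with the DOM API and read
// its fields, then serialize it back to text, e.g. before storing it.
const source = `
  <workOrder id="WO-001">
    <product>Brochure</product>
    <quantity>500</quantity>
  </workOrder>`;

const xml = new DOMParser().parseFromString(source, "application/xml");
const order = {
  id: xml.documentElement.getAttribute("id"),
  product: xml.querySelector("product")?.textContent,
  quantity: Number(xml.querySelector("quantity")?.textContent),
};

const roundTripped = new XMLSerializer().serializeToString(xml);
console.log(order, roundTripped.length);
```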
19

Xia, Xiang, Zhi Shu Li, and Yi Xiang Fan. "The Advanced "Rich-Client" Method Based on DOM for the Dynamic and Configurable Web Application." Advanced Materials Research 756-759 (September 2013): 1691–95. http://dx.doi.org/10.4028/www.scientific.net/amr.756-759.1691.

Abstract:
To meet user requirements for dynamic customization and configuration of changeable and complicated page functionality on the client when constructing a web application platform, an advanced rich-client method and technology based on the DOM (Document Object Model) was designed and used to develop the client module. The rich-client module sits within a traditional J2EE (Java 2 Enterprise Edition) architecture in client-centric MVC (Model-View-Control) mode. On the client side, following the dynamic page generation algorithm, developers wrote DOM- and Ajax (Asynchronous JavaScript and XML)-based JavaScript for user customization and selected third-party open-source ExtJS (Extendable JavaScript) components as page elements to generate a client-side, dynamically configurable interface. From a user-experience perspective, good performance test results demonstrate the distinguishing features of the new method.
20

Uçar, Erdem, Erdinç Uzun, and Pınar Tüfekci. "A novel algorithm for extracting the user reviews from web pages." Journal of Information Science 43, no. 5 (2016): 696–712. http://dx.doi.org/10.1177/0165551516666446.

Abstract:
Extracting the user reviews in websites such as forums, blogs, newspapers, commerce, trips, etc. is crucial for text processing applications (e.g. sentiment analysis, trend detection/monitoring and recommendation systems) which are needed to deal with structured data. Traditional algorithms have three processes consisting of Document Object Model (DOM) tree creation, extraction of features obtained from this tree and machine learning. However, these algorithms increase time complexity of extraction process. This study proposes a novel algorithm that involves two complementary stages. The first stage determines which HTML tags correspond to review layout for a web domain by using the DOM tree as well as its features and decision tree learning. The second stage extracts review layout for web pages in a web domain using the found tags obtained from the first stage. This stage is more time-efficient, being approximately 21 times faster compared to the first stage. Moreover, it achieves a relatively high accuracy of 96.67% in our experiments of review block extraction.
21

Nwe, Nwe Hlaing, Thi Soe Nyunt Thi, and Thet Nyo Myat. "The Data Records Extraction from Web Pages." International Journal of Trend in Scientific Research and Development 3, no. 5 (2019): 2258–62. https://doi.org/10.5281/zenodo.3591282.

Abstract:
No other medium has taken a more meaningful place in our lives in such a short time than the world's largest data network, the World Wide Web. However, when searching for information in this network, the user is constantly exposed to an ever-growing flood of information, which is both a blessing and a curse. The explosive growth and popularity of the World Wide Web has resulted in a huge number of information sources on the Internet. As web sites become more complicated, the construction of web information extraction systems becomes more difficult and time consuming, so scalable automatic Web Information Extraction (WIE) is in high demand. There are four levels of information extraction from the World Wide Web: free-text level, record level, page level and site level. In this paper, the target extraction task is record-level extraction.
22

Yu, Lehe, and Zhengxiu Gui. "Analysis of Enterprise Social Media Intelligence Acquisition Based on Data Crawler Technology." Entrepreneurship Research Journal 11, no. 2 (2021): 3–23. http://dx.doi.org/10.1515/erj-2020-0267.

Abstract:
There are generally hundreds of millions of nodes in social media, and they are connected into a huge social network through follow and fan relationships; news spreads through this huge network. This paper studies techniques for acquiring social media topic data and enterprise data. A topic positioning technology based on Sina meta search and topic-related keywords is introduced, and the crawling efficiency of topic crawlers is analyzed. To address the diverse and variable webpage structures on the Internet, this paper proposes a new Web information extraction algorithm that studies the general patterns in webpage structure and combines the DOM (Document Object Model) tree with the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm. Several steps of the algorithm are described in detail, including Web page processing, DOM tree construction, segmented text content acquisition, and web content extraction based on the DBSCAN algorithm. The simulation results show that an ecological collaboration strategy of intelligence culture, intelligence systems, technology platform and intelligence organization, supported by DOM-tree and DBSCAN-based information extraction, can improve the level of intelligence participation of all employees. There is a significant positive correlation between the level of participation and the level of the intelligence environment of all employees. According to the research results, the DOM-tree and DBSCAN-based extraction proposed in this paper, together with the effective implementation of the relevant collaborative strategies, can support enterprise employee intelligence and provide guidance for its effective implementation.
23

Mironov, Valeriy V., Artem S. Gusarenko, and Nafisa I. Yusupova. "Software extract data from word-based documents situationally-oriented approach." Journal Of Applied Informatics 16, no. 96 (2021): 66–83. http://dx.doi.org/10.37791/2687-0649-2021-16-6-66-83.

Abstract:
The article discusses the use of a situation-oriented approach to the software processing of Word documents. The documents under consideration are prepared by the user in the Microsoft Word processor or its analogues and are later used as data sources. The openness of Office Open XML and the OpenDocument Format makes it possible to apply the concept of virtual documents mapped to ZIP archives for programmatic access to the XML components of word-processing documents in a situational environment. The importance of preliminary agreements regarding the placement of information in the document for subsequent search and retrieval, for example using pre-prepared templates, is substantiated. For the DOCX and ODT formats, the article discusses the use of key phrases, bookmarks, content controls and custom XML components to organize the extraction of the entered data. For each option, tree-like models of access to the extracted data, as well as the corresponding XPath expressions, are built. It is noted that the choice of option depends on the functionality and limitations of the word processor and involves varying degrees of complexity in developing a blank template, entering data and programming the extraction. The applied solution is based on entering metadata into the article using content controls placed in a stub template and bound to elements of a custom XML component. The developed hierarchical situational model (HSM) provides extraction of an XML component, loading it into a DOM object, and XSLT transformations to obtain the resulting data: an error report and JavaScript code for subsequent use of the extracted metadata.
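To illustrate the DOM-and-XPath step described above, here is a hedged TypeScript sketch that parses an XML part of a word-processing document and evaluates an XPath expression against it. The metadata element names and the XPath are hypothetical placeholders, and this is not the authors' HSM implementation.

```typescript
// Load an XML component into a DOM object and pull a value out with XPath.
const partXml = `
  <metadata>
    <title>Quarterly report</title>
    <author>J. Smith</author>
  </metadata>`;

const doc = new DOMParser().parseFromString(partXml, "application/xml");
const result = doc.evaluate(
  "/metadata/author/text()", // hypothetical XPath over the XML component
  doc,
  null,
  XPathResult.STRING_TYPE,
  null
);
console.log("Author:", result.stringValue);
```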
24

Firdian, Maulana Irfan, Eko Darwiyanto, and Monterico Adrian. "Web Scraping with HTML DOM Method for Website News API creation." JIPI (Jurnal Ilmiah Penelitian dan Pembelajaran Informatika) 7, no. 4 (2022): 1211–19. http://dx.doi.org/10.29100/jipi.v7i4.3235.

Abstract:
Information is one of the most important things in this era, and one kind of information that appears every day is news. The amount of news published every day becomes a problem when news websites do not provide API (Application Programming Interface) services for retrieving it, which is an obstacle for researchers who want to analyze news topics. Copying and pasting is an ineffective way of collecting news from news websites every day because it takes a long time. In this research, web scraping is performed with the HTML (Hypertext Markup Language) DOM (Document Object Model) method to retrieve data from news sites. The results of web scraping take the form of datasets, which are then entered into a database and exposed as an API. The resulting API is tested using black-box testing and by checking the conformity between the data obtained during scraping and the data on the news website at the time of testing. The black-box testing results show that the filters of the created API run according to their functions and achieve a high percentage of data conformity: the Tribunnews.com news website has a conformity rate of 99.2%, Detik.com 97.9% and Liputan6.com 98.6%.
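The HTML DOM scraping method described above can be sketched in TypeScript as follows: fetch a page, parse it into a DOM, and read headline nodes. This is not the authors' scraper; the URL and CSS selector are hypothetical and site-specific, and a real deployment would need to respect CORS and site terms.

```typescript
// Fetch a news page, parse it with DOMParser, and collect headline texts.
async function scrapeHeadlines(url: string): Promise<string[]> {
  const html = await (await fetch(url)).text();
  const doc = new DOMParser().parseFromString(html, "text/html");
  return Array.from(doc.querySelectorAll("h2.headline a")) // selector is an assumption
    .map((a) => (a.textContent ?? "").trim())
    .filter((t) => t.length > 0);
}

// scrapeHeadlines("https://example.com/news").then(console.log);
```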
25

He, Zecheng, Srinivas Sunkara, Xiaoxue Zang, et al. "ActionBert: Leveraging User Actions for Semantic Understanding of User Interfaces." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 7 (2021): 5931–38. http://dx.doi.org/10.1609/aaai.v35i7.16741.

Abstract:
As mobile devices are becoming ubiquitous, regularly interacting with a variety of user interfaces (UIs) is a common aspect of daily life for many people. To improve the accessibility of these devices and to enable their usage in a variety of settings, building models that can assist users and accomplish tasks through the UI is vitally important. However, there are several challenges to achieve this. First, UI components of similar appearance can have different functionalities, making understanding their function more important than just analyzing their appearance. Second, domain-specific features like Document Object Model (DOM) in web pages and View Hierarchy (VH) in mobile applications provide important signals about the semantics of UI elements, but these features are not in a natural language format. Third, owing to a large diversity in UIs and absence of standard DOM or VH representations, building a UI understanding model with high coverage requires large amounts of training data. Inspired by the success of pre-training based approaches in NLP for tackling a variety of problems in a data-efficient way, we introduce a new pre-trained UI representation model called ActionBert. Our methodology is designed to leverage visual, linguistic and domain-specific features in user interaction traces to pre-train generic feature representations of UIs and their components. Our key intuition is that user actions, e.g., a sequence of clicks on different UI components, reveals important information about their functionality. We evaluate the proposed model on a wide variety of downstream tasks, ranging from icon classification to UI component retrieval based on its natural language description. Experiments show that the proposed ActionBert model outperforms multi-modal baselines across all downstream tasks by up to 15.5%.
26

Liu, Juan, and Sha Mi. "American literature news narration based on computer web technology." PLOS ONE 18, no. 10 (2023): e0292446. http://dx.doi.org/10.1371/journal.pone.0292446.

Abstract:
Driven by internet technology, the web has become the main channel of news dissemination, but redundant information such as navigation bars and advertisements hinders people's access to news content. This research aims to enable users to obtain pure news content from redundant web information. First, based on the narrative characteristics of literary news, the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm is employed to extract pure news content from the analyzed web pages. The algorithm uses keyword matching, text analysis and semantic processing to determine the boundaries and key information of the news content. Second, a news text classification algorithm (support vector machine, K-nearest neighbor, AdaBoost) is selected through comparative experiments, and a news extraction system based on keyword features and an extended Document Object Model (DOM) tree is constructed; DOM technology analyzes web page structure and extracts key elements and information. Finally, the narrative characteristics are obtained by studying the narrative sequence and structure of 15 American literary news reports. The results reveal that the most used narrative sequences in American literary news are sequence and flashback. The narrative duration is dominated by the victory rate and outline, supplemented by scenes and pauses. In addition, 53.3% of the narrative structures used in literary news are time-connected. This narrative structure can help reporters maintain a clear conceptual structure when writing, help readers quickly grasp the context of the event and the life course of the protagonists in the report, and increase the report's readability. This research on the narrative characteristics of American literary news can provide media practitioners with a reference on news narrative techniques and strategies.
27

Miyashita, Hisashi, and Hironobu Takagi. "Multimedia Content Formats in Depth; How Do They Make Interactive Broadcast/Communication Services Possible? (4); Declarative Data Format (3) -Document Object Model (DOM)/Scripting Language-." Journal of The Institute of Image Information and Television Engineers 61, no. 4 (2006): 453–58. http://dx.doi.org/10.3169/itej.61.453.

28

Diduk, Vitalii, Valerii Hrytsenko, and Andrii Yeromenko. "BUILDING A MODEL OF NETWORK INTERACTION BETWEEN THE COMPONENTS OF A MULTIAGENT SYSTEM OF MOBILE ROBOTS." Eastern-European Journal of Enterprise Technologies 5, no. 9 (107) (2020): 57–63. https://doi.org/10.15587/1729-4061.2020.213989.

Abstract:
The results reported here represent the first stage in the development of a full-featured laboratory system aimed at studying machine learning algorithms. The relevance of the work is determined by the lack of networked small-size mobile robots and appropriate control software that would make it possible to conduct field experiments in real time. This paper reports the selection of a network data transmission technology for managing mobile robots in real time. Based on the chosen data transmission protocol, a complete technology stack for the network model of a multi-agent system of mobile robots is proposed, which makes it possible to build a network model of the system that visualizes and investigates machine learning algorithms. In accordance with the requirements set by the OSI network model for constructing such systems, the model includes the following levels: 1) the lower level of data collection and controlling elements, the mobile robots; 2) the top level, which includes a user interface server and a business logic support server. Based on the constructed protocol stack diagram and network model, a software and hardware implementation was carried out. The work employed the JavaScript library React with SPA (Single Page Application) technology and a Virtual DOM (Document Object Model) stored in the device's RAM and synchronized with the actual DOM, which simplified control over the clients and reduced network traffic. The model makes it possible to: 1) manage the prototypes of robot clients in real time; 2) reduce network traffic compared with other data transmission technologies; 3) reduce the load on the CPUs of robots and servers; 4) virtually simulate an experiment; and 5) investigate the implementation of machine learning algorithms.
29

Romanova-Hynes, Maria. "POV: A Home of Alterity." Arts 12, no. 3 (2023): 84. http://dx.doi.org/10.3390/arts12030084.

Abstract:
Challenging the idea of “home” as a safe refuge, or an enclosure of stability, this article explores ways in which home can be envisioned as an ontological space of becoming, where life is always risked. “POV: A Home of Alterity” is conceived within a deconstructivist theoretical framework and asks the question of how home can be perceived as an open text—a locus of oscillation between inside and outside—for the purpose of revealing home as an inherently traumatic “event,” which presupposes an openness to absolute alterity. To show the traces of otherness in one’s experience of being present (at home), it examines a photograph from Julia Borissova’s project DOM: Document Object Model and sets out to interrogate the concept of “home” through three relationships wherein it emerges: (1) between inside and outside, (2) between the I and the other, and (3) between the I and oneself. Consequently, this article seeks to define home as a representational space of one’s own alterity, where one surrenders to one’s non-coincidence with oneself and hence to experience itself, ultimately revealing that, in an aporetical way, home encrypts the very dislocation it “promises” to shield from.
30

Griazev, Kiril, and Simona Ramanauskaitė. "Web Page Content Block Identification with Extended Block Properties." Applied Sciences 13, no. 9 (2023): 5680. http://dx.doi.org/10.3390/app13095680.

Abstract:
Web page segmentation is one of the most influential factors in the automated integration of web page content with other systems. Existing solutions focus on segmentation but do not provide a more detailed description of each segment, including its range (the minimum and maximum HTML code bounds covering the segment content) and its variants (the same segment with different content). The paper therefore proposes a novel solution designed to find all web page content blocks and describe them in detail for further use. It applies text similarity and document object model (DOM) tree analysis methods to indicate the maximum and minimum ranges of each identified HTML block, and it indicates each block's relation to other blocks, including hierarchical and sibling blocks. The evaluation of the method reveals its ability to identify more content blocks than human labeling (in manual labeling, only 24% of blocks were labeled); by using the proposed method, manual labeling effort could be reduced by at least 70%. Better performance was observed in comparison to other analyzed web page segmentation methods, and better recall was achieved thanks to the focus on processing every block present on a page and on providing a more detailed division of the page into content blocks, including block boundary ranges and block variation data.
31

Bhosale, Swapnali H. "AJAX: COMING TO AN APPLICATION NEAR YOU." COMPUSOFT: An International Journal of Advanced Computer Technology 02, no. 06 (2013): 164–70. https://doi.org/10.5281/zenodo.14605690.

Abstract:
Today's rich Web applications use a mix of JavaScript and asynchronous communication with the application server. This mechanism is also known as Ajax: Asynchronous JavaScript and XML. The intent of Ajax is to exchange small pieces of data between the browser and the application server and, in doing so, to use partial page refreshes instead of reloading the entire Web page. In recent years, information systems based on the browser/server (B/S) architecture have been increasingly favoured by enterprises. Ajax technology consists of five parts: HTML (Hyper Text Markup Language), JavaScript, DHTML (Dynamic Hyper Text Markup Language), the DOM (Document Object Model) and XML (Extensible Markup Language). Through the cooperation of these technologies, a conventional enterprise information system can be optimized to work asynchronously while providing a quicker-responding, smoother user interface. An enterprise information system with Ajax can operate more efficiently: even on current hardware it can provide more load capacity, be more stable and serve more clients in parallel. In this paper, we present two kinds of information system models, one using the conventional B/S architecture and the other using an Ajax-enhanced B/S architecture.
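The Ajax mechanism the abstract describes, exchanging a small piece of data and refreshing only part of the page, looks roughly like this in TypeScript (using fetch rather than the older XMLHttpRequest). The endpoint URL and element id are hypothetical.

```typescript
// Request a small JSON payload asynchronously and update one DOM element
// instead of reloading the whole page.
async function refreshOrderCount(): Promise<void> {
  const response = await fetch("/api/orders/count", { // hypothetical endpoint
    headers: { Accept: "application/json" },
  });
  const data: { count: number } = await response.json();

  // Partial page refresh: only this element changes, not the whole document.
  const badge = document.querySelector("#order-count"); // hypothetical id
  if (badge) badge.textContent = String(data.count);
}

setInterval(refreshOrderCount, 10_000); // poll every 10 seconds
```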
32

Sulikowski, Piotr, Tomasz Zdziebko, Kristof Coussement, Krzysztof Dyczkowski, Krzysztof Kluza, and Karina Sachpazidu-Wójcicka. "Gaze and Event Tracking for Evaluation of Recommendation-Driven Purchase." Sensors 21, no. 4 (2021): 1381. http://dx.doi.org/10.3390/s21041381.

Abstract:
Recommendation systems play an important role in e-commerce turnover by presenting personalized recommendations. Due to the vast amount of marketing content online, users are less susceptible to these suggestions. In addition to the accuracy of a recommendation, its presentation, layout, and other visual aspects can improve its effectiveness. This study evaluates the visual aspects of recommender interfaces. Vertical and horizontal recommendation layouts are tested, along with different visual intensity levels of item presentation, and conclusions obtained with a number of popular machine learning methods are discussed. Results from the implicit feedback study of the effectiveness of recommending interfaces for four major e-commerce websites are presented. Two different methods of observing user behavior were used, i.e., eye-tracking and document object model (DOM) implicit event tracking in the browser, which allowed collecting a large amount of data related to user activity and physical parameters of recommending interfaces. Results have been analyzed in order to compare the reliability and applicability of both methods. Observations made with eye tracking and event tracking led to similar results regarding recommendation interface evaluation. In general, vertical interfaces showed higher effectiveness compared to horizontal ones, with the first and second positions working best, and the worse performance of horizontal interfaces probably being connected with banner blindness. Neural networks provided the best modeling results of the recommendation-driven purchase (RDP) phenomenon.
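Browser-side DOM event tracking of the kind combined with eye tracking in this study can be sketched as below. This is not the authors' instrumentation; the .recommendation-item selector and data-item-id attribute are assumptions.

```typescript
// Record hover and click events on recommendation panels with timestamps,
// producing a log that can later be analyzed alongside eye-tracking data.
interface InteractionEvent {
  type: string;
  item: string;
  timestamp: number;
}

const log: InteractionEvent[] = [];

function track(selector: string): void {
  document.querySelectorAll<HTMLElement>(selector).forEach((panel) => {
    for (const type of ["mouseenter", "mouseleave", "click"]) {
      panel.addEventListener(type, () =>
        log.push({ type, item: panel.dataset.itemId ?? "unknown", timestamp: Date.now() })
      );
    }
  });
}

track(".recommendation-item"); // hypothetical selector
// Later, send `log` to the server for analysis.
```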
33

Qi, Huiyu, Nobuo Funabiki, Khaing Hsu Wai, Xiqin Lu, Htoo Htoo Sandi Kyaw, and Wen-Chung Kao. "An Implementation of Element Fill-in-Blank Problems for Code Understanding Study of JavaScript-Based Web-Client Programming." International Journal of Information and Education Technology 12, no. 11 (2022): 1179–84. http://dx.doi.org/10.18178/ijiet.2022.12.11.1736.

Abstract:
At present, web-client programming using HTML, CSS, and JavaScript is essential in web application systems to offer dynamic behaviors in web pages. With rich libraries and short coding features, it becomes common in developing user interfaces. However, the teaching course is not common in universities due to limited time. Therefore, self-study tools are strongly desired to promote it in societies. Previously, we have studied the programming learning assistant system (PLAS) as a programming self-study platform. In PLAS, among several types of programming problems, the element fill-in-blank problem (EFP) has been implemented for code understanding study of C and Java programming. In an EFP instance, the blank elements in a source code should be filled in with the proper words, where the correctness is checked by string matching. In this paper, we implement EFP for web-client programming in PLAS. In a web page, HTML and CSS define the components with tags in the document object model (DOM), and JavaScript offers their dynamic changes with libraries, which are blanked in EFP. Besides, a set of web page screenshots are given to help the solution. For evaluations, the generated 21 EFP instances were assigned to 20 master students in Okayama University. By analyzing their solution results, the effectiveness was confirmed for JavaScript programming learning.
34

Gupta, Shashank, and B. B. Gupta. "Smart XSS Attack Surveillance System for OSN in Virtualized Intelligence Network of Nodes of Fog Computing." International Journal of Web Services Research 14, no. 4 (2017): 1–32. http://dx.doi.org/10.4018/ijwsr.2017100101.

Abstract:
This article introduces a distributed intelligence network of Fog computing nodes and Cloud data centres that protects smart devices against XSS vulnerabilities in Online Social Networks (OSN). The cloud data centres compute the features of JavaScript, inject them in the form of comments and save them in the script nodes of the Document Object Model (DOM) tree. The network of Fog devices re-executes the feature computation and comment injection process on the HTTP response message and compares the resulting comments with those calculated in the cloud data centres. Any divergence observed signals the injection of XSS worms at the fog nodes located at the edge of the network. The mitigation of such worms is done by executing nested context-sensitive sanitization on the malicious JavaScript variables embedded in them. The prototype of the authors' work was developed in a Java development framework and installed on the virtual machines of Cloud data centres (typically located at the core of the network) and the Fog device nodes (positioned at the edge of the network). Vulnerable OSN-based web applications were used to evaluate the XSS worm detection capability of the authors' framework, and the evaluation results revealed that it detects the injection of XSS worms with a high precision rate and low rates of false positives and false negatives.
35

Masue, Wilbard G., Daniel Ngondya, and Tabu S. Kondo. "Assessment of Vulnerabilities in Student Records Web-Based Systems for Public and Private Higher Learning Institutions in Tanzania." Journal of ICT Systems 2, no. 2 (2024): 1–28. http://dx.doi.org/10.56279/jicts.v2i2.52.

Abstract:
Although HLIs in Tanzania use web-based systems for managing, storing and processing institutional information and data such as website content, academic results and financial records, these web-based systems have been compromised by attackers because of vulnerabilities. The main objective of this study is to assess the vulnerabilities of Student Records Web-Based Systems (SRWBS) of private and public Higher Learning Institutions (HLIs) in Tanzania using a black-box testing methodology with two automatic vulnerability scanners, namely OWASP ZAP (Open Web Application Security Project Zed Attack Proxy; an open-source tool) and Acunetix (a proprietary tool). The study assesses the SRWBS of 3 private HLIs and 5 public HLIs in Tanzania. The results reveal a total of 29 vulnerabilities, including but not limited to Broken Authentication and Session Management, Broken Access Control, Security Misconfiguration, Sensitive Data Exposure, Vulnerable JS (JavaScript) Libraries, CSRF (Cross-Site Request Forgery), Using Components with Known Vulnerabilities, XSS (Cross-Site Scripting), DOM (Document Object Model)-based XSS and Reflected XSS. The SRWBS of public HLIs were found to be more vulnerable, at an average of 44.2%, than those of private HLIs, which were vulnerable at an average of 37%. Based on these results, the study provides recommendations for mitigating vulnerabilities and improving the security of SRWBS for private and public HLIs in Tanzania.
36

Kumar, Sumit, Nitin, and Mitul Yadav. "An Effective GDP-LSTM and SDQL-Based Finite State Testing of GUI." Applied Sciences 14, no. 2 (2024): 549. http://dx.doi.org/10.3390/app14020549.

Abstract:
The Graphical User Interface (GUI) is the most promising factor in the Software Development Lifecycle (SDL), as it allows users to interact with the system. To ensure user-friendliness, GUI Testing (GT) is required. Traditional testing techniques produce flawed results owing to inappropriate functions. Hence, Global Decaying Probabilistic Long Short-Term Memory (GDP-LSTM) and Standard Deviation Q-Learning (SDQL)-based automatic GUI testing is proposed as a solution. Initially, the Test Case (TC) and GUI are extracted from historical data and subjected to Region of Interest (ROI) analysis. Here, an appropriate ROI is analyzed by Module Coupling Slice (MCS) and fed into Hadoop Parallelization (HP). Spectral Kernelized Gaussian Clustering (SKGC) and Non-Linear Elite Guided Optimized Ant Colony (NE-GO-AC) are then used to perform mapping and reducing, respectively. The parallelized output is used to construct the Document Object Model (DOM) tree. The attributes are then extracted and given to the GDP-LSTM classifier, which predicts whether GUIs are desirable or undesirable. The undesirable results are input into an SDQL-based deviation analysis: if the deviation is low, it is treated as an update; otherwise, it is considered an error. The experimental analysis showed that the proposed system outperformed the prevailing models with 98.89% accuracy.
37

Al-Dailami, Abdulrahman, Chang Ruan, Zhihong Bao, and Tao Zhang. "QoS3: Secure Caching in HTTPS Based on Fine-Grained Trust Delegation." Security and Communication Networks 2019 (December 28, 2019): 1–16. http://dx.doi.org/10.1155/2019/3107543.

Abstract:
With the ever-increasing concern in network security and privacy, a major portion of Internet traffic is encrypted now. Recent research shows that more than 70% of Internet content is transmitted using HyperText Transfer Protocol Secure (HTTPS). However, HTTPS encryption eliminates the advantages of many intermediate services like the caching proxy, which can significantly degrade the performance of web content delivery. We argue that these restrictions lead to the need for other mechanisms to access sites quickly and safely. In this paper, we introduce QoS3, which is a protocol that can overcome such limitations by allowing clients to explicitly and securely re-introduce in-network caching proxies using fine-grained trust delegation without compromising the integrity of the HTTPS content and modifying the format of Transport Layer Security (TLS). In QoS3, we classify web page contents into two types: (1) public contents that are common for all users, which can be stored in the caching proxies, and (2) private contents that are specific for each user. Correspondingly, QoS3 establishes two separate TLS connections between the client and the web server for them. Specifically, for private contents, QoS3 just leverages the original HTTPS protocol to deliver them, without involving any middlebox. For public contents, QoS3 allows clients to delegate trust to specific caching proxy along the path, thereby allowing the clients to use the cached contents in the caching proxy via a delegated HTTPS connection. Meanwhile, to prevent Man-in-the-Middle (MitM) attacks on public contents, QoS3 validates the public contents by employing Document object Model (DoM) object-level checksums, which are delivered through the original HTTPS connection. We implement a prototype of QoS3 and evaluate its performance in our testbed. Experimental results show that QoS3 provides acceleration on page load time ranging between 30% and 64% over traditional HTTPS with negligible overhead. Moreover, QoS3 is deployable since it requires just minor software modifications to the server, client, and the middlebox.
APA, Harvard, Vancouver, ISO, and other styles
38

Qi, Nian, and Ji Hong Ye. "Nonlinear Dynamic Analysis of Space Frame Structures by Discrete Element Method." Applied Mechanics and Materials 638-640 (September 2014): 1716–19. http://dx.doi.org/10.4028/www.scientific.net/amm.638-640.1716.

Full text
Abstract:
This paper explores the possibility of applying the discrete element method (DEM) to the nonlinear dynamic analysis of space frame structures. The method models the analyzed object as an assembly of finite particles, and Newton's second law is applied to describe each particle's motion. The parallel-bond model is adopted when calculating the internal forces and moments arising from the deformation. The analysis procedure is simple, accurate and versatile. Numerical examples are given to demonstrate the accuracy and applicability of this method in handling the large deflection and dynamic behaviour of space frame structures. Moreover, the method requires neither the formation of a stiffness matrix nor iterations, which makes it more advantageous than the traditional nonlinear finite element method.
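A schematic restatement of the particle equations implied by this abstract is given below: Newton's second law governs each particle's translational and rotational motion, while the parallel-bond model supplies incremental forces and moments. The symbols (bond cross-section area A, moments of inertia I and J, normal and shear bond stiffnesses) follow the common parallel-bond formulation and are assumptions, not notation taken from the paper.

```latex
% Assumed sketch, not taken verbatim from the paper: Newton's second law for
% each particle i and the incremental force/moment law of the parallel bond.
\begin{aligned}
  m_i \,\ddot{\mathbf{u}}_i &= \sum_{j} \mathbf{F}_{ij} + m_i \mathbf{g}, &
  I_i \,\ddot{\boldsymbol{\theta}}_i &= \sum_{j} \mathbf{M}_{ij}, \\
  \Delta \bar{F}^{\,n} &= \bar{k}^{\,n} A \,\Delta U^{\,n}, &
  \Delta \bar{F}^{\,s} &= -\bar{k}^{\,s} A \,\Delta U^{\,s}, \\
  \Delta \bar{M}^{\,t} &= -\bar{k}^{\,s} J \,\Delta \theta^{\,t}, &
  \Delta \bar{M}^{\,b} &= -\bar{k}^{\,n} I \,\Delta \theta^{\,b}.
\end{aligned}
```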
APA, Harvard, Vancouver, ISO, and other styles
39

Chrismanto, Antonius Rachmat, Willy Sudiarto Raharjo, and Yuan Lukito. "Firefox Extension untuk Klasifikasi Komentar Spam pada Instagram Berbasis REST Services." Jurnal Edukasi dan Penelitian Informatika (JEPIN) 5, no. 2 (2019): 146. http://dx.doi.org/10.26418/jp.v5i2.33010.

Full text
Abstract:
Spam comment classification on Instagram (IG) can only be used through a system running on the client side, because IG data cannot be manipulated from outside IG. A system is therefore needed that can manipulate the data from the client side in the form of a browser extension. This research focuses on developing a browser extension for Firefox that uses REST web services hosted in the cloud on the Amazon Web Services (AWS) platform. The developed extension uses two classification algorithms, namely KNN and Distance-Weighted KNN (DW-KNN). The extension marks spam comments by modifying the IG Document Object Model (DOM) so that they appear in red with strikethrough. The extension was developed using the Rapid Application Development (RAD) method. Testing in this research covered the implemented browser extension and the accuracy of the web service (KNN and DW-KNN algorithms). The browser extension implementation was tested using functional testing, in which each implemented feature was checked against the previously defined specification. The accuracy of the web service was tested with the help of the SOAPUI tool. The extension test results were: (1) testing on arbitrary web pages succeeded 100%; (2) testing on the IG default (home) page succeeded 100%; (3) testing on an IG account's profile page succeeded 100%; (4) testing on an IG post and its comments did not always succeed, as it depends on the capability of the algorithms in the web services; (5) testing on non-Indonesian languages did not always succeed, as it depends on the language library; (6) testing of "load more comments" on IG did not always succeed, as it depends on the algorithms in the web services; and (7) testing of the algorithm selection in the extension options succeeded 100%. The highest average accuracy of the KNN algorithm was 80% for k=1, while DW-KNN achieved 90% for k=2.
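A minimal sketch of the client-side behaviour described above: each comment node is sent to the REST classification service and, if labelled spam, its DOM element is coloured red and struck through. The endpoint URL, the comment selector and the response shape are illustrative assumptions, not the paper's actual implementation.

```typescript
// Sketch of the extension's content-script behaviour under assumed names.
const CLASSIFY_URL = "https://example.com/api/classify"; // assumed endpoint

async function markSpamComments(): Promise<void> {
  // Assumed selector for Instagram comment text nodes.
  const comments = document.querySelectorAll<HTMLElement>("ul li span");
  for (const node of Array.from(comments)) {
    const response = await fetch(CLASSIFY_URL, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text: node.textContent ?? "", algorithm: "dw-knn" }),
    });
    const { isSpam } = (await response.json()) as { isSpam: boolean };
    if (isSpam) {
      node.style.color = "red";                   // mark as spam
      node.style.textDecoration = "line-through"; // strikethrough, as in the paper
    }
  }
}

markSpamComments();
```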
APA, Harvard, Vancouver, ISO, and other styles
40

Fang, Xiu Susie, Quan Z. Sheng, Xianzhi Wang, Anne H. H. Ngu, and Yihong Zhang. "GrandBase: generating actionable knowledge from Big Data." PSU Research Review 1, no. 2 (2017): 105–26. http://dx.doi.org/10.1108/prr-01-2017-0005.

Full text
Abstract:
Purpose This paper aims to propose a system for generating actionable knowledge from Big Data and use this system to construct a comprehensive knowledge base (KB), called GrandBase. Design/methodology/approach In particular, this study extracts new predicates from four types of data sources, namely, Web texts, Document Object Model (DOM) trees, existing KBs and query streams, to augment the ontology of the existing KB (i.e. Freebase). In addition, a graph-based approach to conduct better truth discovery for multi-valued predicates is also proposed. Findings Empirical studies demonstrate the effectiveness of the approaches presented in this study and the potential of GrandBase. Future research directions regarding GrandBase construction and extension are also discussed. Originality/value To put the wisdom of Big Data to use in our modern society, considerable KBs have been constructed to feed massive knowledge-driven applications with Resource Description Framework triples. The important challenges for KB construction include extracting information from large-scale, possibly conflicting and differently structured data sources (i.e. the knowledge extraction problem) and reconciling the conflicts that reside in the sources (i.e. the truth discovery problem). Tremendous research effort has been devoted to both problems. However, the existing KBs are far from comprehensive and accurate: first, existing knowledge extraction systems retrieve data from limited types of Web sources; second, existing truth discovery approaches commonly assume that each predicate has only one true value. In this paper, the focus is on the problem of generating actionable knowledge from Big Data. A system is proposed, which consists of two phases, namely, knowledge extraction and truth discovery, to construct a broader KB, called GrandBase.
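The DOM-tree extraction step mentioned in this abstract can be pictured with a toy sketch that reads attribute/value pairs from table rows and emits candidate (entity, predicate, value) triples for later truth discovery. The selectors and the triple shape are assumptions for illustration only, not GrandBase's actual extraction rules.

```typescript
// Toy sketch: harvest candidate triples from DOM table rows (assumed layout).
interface Triple {
  entity: string;
  predicate: string;
  value: string;
}

function extractTriplesFromDom(root: Document, entity: string): Triple[] {
  const triples: Triple[] = [];
  root.querySelectorAll("table tr").forEach((row) => {
    const cells = row.querySelectorAll("th, td");
    if (cells.length >= 2) {
      triples.push({
        entity,
        predicate: (cells[0].textContent ?? "").trim(),
        value: (cells[1].textContent ?? "").trim(),
      });
    }
  });
  return triples;
}

// e.g. extractTriplesFromDom(document, "some_entity_id")
```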
APA, Harvard, Vancouver, ISO, and other styles
41

Jang, Hyeong-Seok, Mohsen Ali Alawami, and Ki-Woong Park. "AV-Teller: Browser Fingerprinting for Client-Side Security Software Identification." Applied Sciences 15, no. 9 (2025): 5059. https://doi.org/10.3390/app15095059.

Full text
Abstract:
The rapid proliferation of digitalization and the growing reliance on internet-based technologies by individuals and organizations have led to a significant escalation in the frequency and sophistication of cyberattacks. As attackers continuously refine their methods to evade conventional defense mechanisms, antivirus solutions, despite their widespread utilization as primary security tools, face increasing challenges in addressing these evolving threats. This study introduces AV-Teller, a novel framework designed for analyzing antivirus behavior through interactions with web browsers. AV-Teller reveals weaknesses in antivirus detection mechanisms by highlighting ways in which web browser interactions may inadvertently expose critical aspects of antivirus operations. The framework provides key insights into the vulnerabilities inherent to these detection processes and their implications for the interplay between antivirus systems and modern web technologies. To assess the efficacy of the AV-Teller in detecting antivirus via web browsers, the framework evaluates three detection scenarios: Document Object Model (DOM) Monitoring-Based Detection, Signature-Based Detection, and Phishing Page-Based Detection. The results revealed performance inconsistencies: 16 products (57%) failed to respond to any tested scenarios, exhibiting deficiencies in threat mitigation capabilities. Of the 12 products (43%) that successfully handled three scenarios, 9 (75%) inadvertently disclosed identifiable antivirus metadata during assessments, thereby enabling attackers to pinpoint specific antivirus solutions and exploit their vulnerabilities. These findings highlight critical gaps in the interaction between antivirus systems and web technologies, exposing systemic flaws in existing security mechanisms. The inadvertent exposure of sensitive antivirus data underscores the necessity for robust data handling protocols, necessitating collaboration between antivirus developers and web technology stakeholders to design secure frameworks. By exposing these risks, the AV-Teller framework elucidates the limitations of current defenses and establishes a foundation for the enhancement of antivirus technologies to address emerging cyber threats effectively.
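A simplified sketch of the DOM Monitoring-Based Detection scenario: a MutationObserver watches for nodes injected into the page, for example by an antivirus browser component, and records identifying attributes. The logging here is generic; matching against per-vendor signatures, as AV-Teller would require, is omitted and would be an assumption.

```typescript
// Sketch: observe DOM mutations and record attributes of injected nodes.
const observer = new MutationObserver((mutations) => {
  for (const mutation of mutations) {
    mutation.addedNodes.forEach((node) => {
      if (node instanceof HTMLElement) {
        const marker = node.id || node.className || node.tagName;
        console.log("Injected DOM node observed:", marker);
      }
    });
  }
});

observer.observe(document.documentElement, { childList: true, subtree: true });
```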
APA, Harvard, Vancouver, ISO, and other styles
42

Ратов, Д. В. "Object adaptation of Drag and Drop technology for web-system interface components." ВІСНИК СХІДНОУКРАЇНСЬКОГО НАЦІОНАЛЬНОГО УНІВЕРСИТЕТУ імені Володимира Даля, no. 4(268) (June 10, 2021): 7–12. http://dx.doi.org/10.33216/1998-7927-2021-268-4-7-12.

Full text
Abstract:
Today, cloud technologies are often used in information system development for remote computing and data processing. On the basis of web technologies, libraries and frameworks have been developed for building web applications and user interfaces that allow information systems to run in browsers. Ready-made JavaScript libraries exist for adding drag-and-drop functionality to a web application. In some situations, however, such a library may be unavailable, or it may bring overhead and dependencies that the project does not need. In these cases, an alternative is to rely on the APIs available in modern browsers. The article reviews the current state of Drag and Drop methods and proposes a programmatic way to improve the interface by creating a class for dragging and dropping elements in multi-user information web systems. Drag and Drop is a convenient way to improve an interface: grabbing an element with the mouse and moving it visually simplifies many operations, from copying and moving documents, as in file managers, to placing orders in online stores. The HTML drag and drop API uses the DOM event model to obtain information about a dragged element and to update that element after the drag. Using JavaScript event handlers, any element of the web system can be turned into a draggable element or a drop target. To this end, a JavaScript object was developed whose methods create a copy of any object and handle all of that object's events involved in the Drag and Drop mechanism. The basic algorithm of the Drag and Drop technique is based on processing mouse events. The software implementation is described, together with the results of its practical use for the interface components of the MedSystem medical information system, whose application modules implement a dispatcher and an interactive window interface. In the "Outpatient clinic" module, the Drag and Drop mechanism is used when working with the "Appointment sheet"; in the "Hospital" module, it is used in the "List of doctor's appointments". The results show that this mechanism fits organically into existing technologies for building web applications and has sufficient potential to facilitate and automate work in multi-user information systems and web services.
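The event wiring described above can be sketched with the standard HTML Drag and Drop API: dragstart stores the dragged element's id, dragover enables dropping, and drop moves the element in the DOM. The element ids are illustrative assumptions, not identifiers from MedSystem.

```typescript
// Sketch of basic drag-and-drop wiring using the standard DOM event model.
const draggable = document.getElementById("appointment-card") as HTMLElement;  // assumed id
const dropTarget = document.getElementById("appointment-sheet") as HTMLElement; // assumed id

draggable.draggable = true;

draggable.addEventListener("dragstart", (event: DragEvent) => {
  // Store the dragged element's id so the drop handler can find it.
  event.dataTransfer?.setData("text/plain", draggable.id);
});

dropTarget.addEventListener("dragover", (event: DragEvent) => {
  event.preventDefault(); // required to allow dropping
});

dropTarget.addEventListener("drop", (event: DragEvent) => {
  event.preventDefault();
  const id = event.dataTransfer?.getData("text/plain");
  const dragged = id ? document.getElementById(id) : null;
  if (dragged) dropTarget.appendChild(dragged); // move the element in the DOM
});
```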
APA, Harvard, Vancouver, ISO, and other styles
43

Fernandez-Tudela, Elisa, Luis C. Zambrano, Lázaro G. Lagóstena, and Manuel Bethencourt. "Documentación y análisis de un cepo de ancla romano y sus elementos iconográficos y epigráficos sellados." Virtual Archaeology Review 13, no. 26 (2022): 147–62. http://dx.doi.org/10.4995/var.2022.15349.

Full text
Abstract:
This paper aims to present the documentation and analysis methodology carried out on a lead anchor stock from the ancient period, which belongs to the collection of anchor stocks in the Museum of Cádiz (Andalusia, Spain). The anchor stock had some interesting characteristics for this research: on the one hand, from the point of view of conservation and restoration, due to the alterations it presented; on the other hand, from a historical and archaeological point of view, it showed signs of reliefs on its surface hidden under the alteration products. The removal of the different layers of alteration that covered the surface during conservation and restoration treatments revealed an unpublished iconographic and epigraphic programme, as well as possible marks of use and manufacture. The poor state of conservation of the original surface made it impossible to visualise the details as a whole, so we applied photogrammetric methods and subsequently processed the models using various GIS analysis and point cloud processing software packages. Two photogrammetric models (in Agisoft PhotoScan) were made to document the stock in general: one prior to the conservation and restoration process, and a second three-dimensional (3D) model once the surface had been cleaned. The purpose of the second model was to visualise the relief programme as a whole, as well as the different surface details. The first complete 3D model of the object was used to perform a virtual reconstruction of the anchor, including the elements that were not preserved, using a 3D modelling program (Blender). Nine areas of the stock surface were selected for the analyses of the various iconographic and epigraphic features, which were documented and processed in Agisoft PhotoScan. The Digital Elevation Model (DEM) and point cloud models were then processed with different analysis tools in a Geographic Information System (GIS) (such as QGIS) and point cloud processing software (CloudCompare). Our results document highly interesting information from the surface: reliefs of four dolphins, at least four rectangular stamps (two of them with possible inscriptions), and an anthropomorphic figure. Thanks to the comparative data, we conclude that the four dolphins were made with the same stamp during the stock manufacturing process. Further, we were able to reconstruct the dolphin stamp, partially preserved in each of the reliefs, by unifying the 3D models, thus revealing the original design. This system of stamping by means of reusable dies is well known in other elements such as amphorae but has not been studied in the specific case of lead anchor stocks. In the case of the epigraphic elements, the 3D documentation methodology revealed numerous micro-surface details, not visible under conventional documentation techniques, which could help specialists to interpret these inscriptions. Although they have not been analysed in this research, their documentation has promoted the appreciation of surface details that could refer to the manufacturing processes (moulds and tools) or to traces of use, providing historical information on this object. At the same time, the virtual reconstruction of the anchor has aided the formation of hypotheses on the dimensions and original appearance of the anchor. The different tools used, such as raster analysis using shadow mapping and point cloud alignment, proved to be very effective. They have fulfilled the established objectives and have helped to establish a possible analysis methodology for future lead anchor stocks with decorative elements. These types of artefacts recovered from underwater sites are very common in museum collections. In many cases, their state of conservation and the difficulty of handling them due to their size and weight make it difficult to document surface details. In this case, the multidisciplinary work of conservation and 3D documentation allows for high-quality documentation that is easy to access and exchange between researchers. The combined use of photogrammetric techniques with virtual RTI provides a method that is non-invasive for the object, low in cost and easy to process compared with other conventional methods.
APA, Harvard, Vancouver, ISO, and other styles
44

Triebel, Dagmar, Dragan Ivanovic, Bar-Gal Gila Kahila, Sven Bingert, and Tanja Weibulat. "Towards a COST MOBILISE Guideline for Long Term Preservation and Archiving of Data Constructs from Scientific Collections Facilities." Biodiversity Information Science and Standards 5 (September 3, 2021): e73901. https://doi.org/10.3897/biss.5.73901.

Full text
Abstract:
COST (European Cooperation in Science and Technology) is a funding organisation for research and innovation networks. One of the objectives of the COST Action called "Mobilising Data, Policies and Experts in Scientific Collections" (MOBILISE) is to work on documents for expert training with broad involvement of professionals from the participating European countries. The guideline presented here in its general concept will address principles, strategies and standards for the long-term preservation and archiving of data constructs (data packages, data products) as addressed by, and under the control of, the scientific collections community. The document is being developed as part of the MOBILISE Action and is targeted primarily at scientific staff at natural scientific collection facilities, as well as at management bodies of collections like museums and herbaria, and at information technology personnel less familiar with data archiving principles and routines. The challenges of big data storage and (distributed, cloud-based) storage solutions, as well as those of data mirroring, backing up, synchronisation and publication in productive data environments, are well addressed by documents, guidelines and online platforms, e.g., in the DISSCo knowledge base (see Hardisty et al. (2020)) and as part of concepts of the European Open Science Cloud (EOSC). Archival processes and the resulting data constructs, however, are often left outside of these considerations. This is a large gap, because archival issues are not only simple technical ones, as addressed by the term "bit preservation", but also involve a number of logical, functional, normative, administrative and semantic issues, as addressed by the term "functional long-term archiving". The main target digital object types addressed by this COST MOBILISE Guideline are data constructs called Digital or Digital Extended Specimens and data products whose persistent identifier assignment lies under the authority of scientific collections facilities. Such digital objects are specified according to the Digital Object Architecture (DOA; see Wittenburg et al. 2018) and similar abstract models introduced by Harjes et al. (2020) and Lannom et al. (2020). The scientific collection-specific types are defined following evolving concepts in the context of the Consortium of European Taxonomic Facilities (CETAF), the research infrastructure DiSSCo (Distributed System of Scientific Collections), and the Biodiversity Information Standards (TDWG). Archival processes are described following the OAIS (Open Archival Information System) reference model. The archived objects should be reusable in the sense of the FAIR (Findable, Accessible, Interoperable, and Reusable) guiding principles. Organisations like national (digital) archives, computing or professional (domain-specific) data centers as well as libraries might offer specific archiving services and act as partner organisations of scientific collections facilities. The guideline consists of key messages that have been defined to address the collection community, especially the staff and leadership of taxonomic facilities. Aspects relevant to several groups of stakeholders are discussed, as well as cost models. The guideline does not recommend specific solutions for archiving software and workflows. Supplementary information is delivered via a wiki-based platform for the COST MOBILISE Archiving Working Group WG4.
APA, Harvard, Vancouver, ISO, and other styles
45

Klochkov, Denys, and Jan Mulawka. "Improving Ruby on Rails-Based Web Application Performance." Information 12, no. 8 (2021): 319. http://dx.doi.org/10.3390/info12080319.

Full text
Abstract:
The evolution of web development and web applications has resulted in the creation of numerous tools and frameworks that facilitate the development process. Even though those frameworks make web development faster and more efficient, there are certain downsides to using them. A decrease in application performance when using an "off the shelf" framework can be a crucial disadvantage, especially given the vital role web application response time plays in user experience. This contribution focuses on a particular framework, Ruby on Rails. Once the most popular framework, it has now lost its leading position, partially due to slow performance metrics and response times, especially in larger applications. Improving and expanding upon previous work in this field, we attempt to improve the response time of a specially developed benchmark application. This is achieved by performing optimizations that can be roughly divided into two groups. The first group concerns frontend improvements, which include adopting client-side rendering, JavaScript Document Object Model (DOM) manipulation and asynchronous requests. The second group can be described as backend improvements, which include implementing intelligent, granular caching, disabling redundant modules, as well as profiling and optimizing database requests and reducing database access inefficiencies. These improvements resulted in page loading times decreased by up to 74%, with perceived application performance improved beyond this mark due to the adoption of a client-side rendering strategy. Using different metrics of application performance measurement, each of the improvement steps is evaluated with regard to its effect on different aspects of overall performance. In conclusion, this work presents a way to significantly decrease the response time of a particular Ruby on Rails application and simultaneously provide a better user experience. Even though the majority of this process is specific to Rails, similar steps can be taken to improve applications implemented using other, similar frameworks. As a result of this work, groundwork is laid for the development of a tool that could assist developers in improving their applications.
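The frontend pattern referred to above (asynchronous requests combined with client-side rendering and DOM manipulation) can be sketched as follows; the JSON endpoint and markup are illustrative assumptions rather than the benchmark application's actual code.

```typescript
// Sketch: fetch data asynchronously and render it on the client instead of
// waiting for a fully server-rendered page.
interface Article {
  title: string;
  body: string;
}

async function renderArticles(containerId: string): Promise<void> {
  const response = await fetch("/articles.json"); // assumed JSON endpoint
  const articles = (await response.json()) as Article[];

  const container = document.getElementById(containerId);
  if (!container) return;

  // Build the fragment off-DOM to avoid repeated reflows.
  const fragment = document.createDocumentFragment();
  for (const article of articles) {
    const item = document.createElement("article");
    const heading = document.createElement("h2");
    heading.textContent = article.title;
    const body = document.createElement("p");
    body.textContent = article.body;
    item.append(heading, body);
    fragment.appendChild(item);
  }
  container.replaceChildren(fragment);
}

renderArticles("articles"); // assumed container id
```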
APA, Harvard, Vancouver, ISO, and other styles
46

Tse, William T., Kevin K. Duh, and Morris Kletzel. "A Low-Cost, Open-Source Informatics Framework for Clinical Trials and Outcomes Research." Blood 118, no. 21 (2011): 4763. http://dx.doi.org/10.1182/blood.v118.21.4763.4763.

Full text
Abstract:
Data collection and analysis in clinical studies in hematology often require the use of specialized databases, which demand extensive information technology (IT) support and are expensive to maintain. With the goal of reducing the cost of clinical trials and promoting outcomes research, we have devised a new informatics framework that is low-cost, low-maintenance, and adaptable to both small- and large-scale clinical studies. This framework is based on the idea that most clinical data are hierarchical in nature: a clinical protocol typically entails the creation of sequential patient files, each of which documents multiple encounters, during which clinical events and data are captured and tagged for later retrieval and analysis. These hierarchical trees of clinical data can be easily stored in a hypertext mark-up language (HTML) document format, which is designed to represent similar hierarchical data on web pages. In this framework, the stored clinical data will be structured according to a web standard called the Document Object Model (DOM), for which powerful informatics techniques have been developed to allow efficient retrieval and collation of data from the HTML documents. The proposed framework has many potential advantages. The data will be stored in plain text files in the HTML format, which is both human and machine readable, hence facilitating data exchange between collaborative groups. The framework requires only a regular web browser to function, thereby easing its adoption in multiple institutions. There will be no need to set up or maintain a relational database for data storage, thus minimizing data fragmentation and reducing the demand for IT support. Data entry and analysis will be performed mostly on the client computer, requiring the use of a backend server only for central data storage. Utility programs for data management and manipulation will be written in JavaScript and jQuery, computer languages that are free, open-source and easy to maintain. Data can be captured, retrieved, and analyzed on different devices, including desktop computers, tablets or smart phones. Encryption and password protection can be applied in document storage and data transmission to ensure data security and HIPAA compliance. In a pilot project to implement and test this informatics framework, we designed prototype programming modules to perform individual tasks commonly encountered in clinical data management. The functionalities of these modules included user-interface creation, patient data entry and retrieval, visualization and analysis of aggregate results, and exporting and reporting of extracted data. These modules were used to access simulated clinical data stored in a remote server, employing standard web browsers available on all desktop computers and mobile devices. To test the capability of these modules, benchmark tests were performed. Simulated datasets of complete patient records, each with 1000 data items, were created and stored in the remote server. Data were retrieved via the web using a gzip compressed format. Retrieval of 100, 300, and 1000 such records took only 1.01, 2.45, and 6.67 seconds using a desktop computer via a broadband connection, or 3.67, 11.39, and 30.23 seconds using a tablet computer via a 3G connection. Filtering of specific data from the retrieved records was equally speedy. Automated extraction of relevant data from 300 complete records for a two-sample t-test analysis took 1.97 seconds.
A similar extraction of data for a Kaplan-Meier survival analysis took 4.19 seconds. The program allowed the data to be presented separately for individual patients or in aggregation for different clinical subgroups. A user-friendly interface enabled viewing of the data in either tabular or graphical forms. Incorporation of a new web browser technique permitted caching of the entire dataset locally for off-line access and analysis. Adaptable programming allowed efficient export of data in different formats for regulatory reporting purposes. Once the system was set up, no further intervention from IT department was necessary. In summary, we have designed and implemented a prototype of a new informatics framework for clinical data management, which should be low-cost and highly adaptable to various types of clinical studies. Field-testing of this framework in real-life clinical studies will be the next step to demonstrate its effectiveness and potential benefits. Disclosures: No relevant conflicts of interest to declare.
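The HTML/DOM storage idea described in this abstract can be pictured with a toy sketch in which hierarchical records are kept as nested elements carrying data-* attributes and queried directly from the DOM for aggregation. The tag names, attributes and example values are assumptions for illustration, not the authors' schema.

```typescript
// Toy sketch of DOM-based clinical data retrieval under an assumed layout:
// <section data-patient="P001">
//   <article data-encounter="2011-03-01">
//     <span data-item="wbc">5.4</span>
//   </article>
// </section>

function collectItem(doc: Document, itemName: string): number[] {
  const values: number[] = [];
  doc
    .querySelectorAll<HTMLElement>(`[data-item="${itemName}"]`)
    .forEach((el) => {
      const v = parseFloat(el.textContent ?? "");
      if (!Number.isNaN(v)) values.push(v);
    });
  return values;
}

// e.g. aggregate a lab value across all stored patient records
const wbcValues = collectItem(document, "wbc");
console.log(
  "n =", wbcValues.length,
  "mean =", wbcValues.reduce((s, v) => s + v, 0) / Math.max(wbcValues.length, 1)
);
```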
APA, Harvard, Vancouver, ISO, and other styles
47

Pakhmutova, N. "Differential object marking in Ibero-Romance languages: Explanatory models in domestic and Russian educational literature." Rhema, no. 2, 2019 (2019): 61–76. http://dx.doi.org/10.31862/2500-2953-2019-2-61-76.

Full text
Abstract:
Differential object marking / dom is the term for the phenomenon of distinguishing two classes of direct objects, one bearing a special marker, while the other lacking it. In modern linguistics, the marker licensing is partially or fully attributed to the features of a direct object: Animacy/Inanimacy and referential status. Russian didactic literature generally contains a reduced explanatory model of Spanish dom, based on the grammar of the Royal Spanish Academy. For Catalan, the explanatory model is complicated by the usus/norm split, the latter reducing the phenomenon’s scope. The paper focuses on the improvement of dom explanatory models for Spanish and Catalan.
APA, Harvard, Vancouver, ISO, and other styles
48

Avram, Larisa, Alexandru Mardale, and Elena Soare. "Animacy in the acquisition of differential object marking by Romanian monolingual children." Bucharest Working Papers in Linguistics 25, no. 2 (2023): 81–104. http://dx.doi.org/10.31178/bwpl.25.2.5.

Full text
Abstract:
Differential object marking (DOM) has been shown, in an impressive number of production studies, to be acquired by monolingual children at around age 3. The picture which emerges from comprehension data, however, reveals that DOM is an area of vulnerability in L1 acquisition. This study investigates the acquisition of DOM by monolingual Romanian children using a preference judgment task. 80 monolingual Romanian children (aged 4;04-11;04) and a control group of 10 Romanian adults took part in the study. Results show that DOM is vulnerable and trace this vulnerability to the animacy feature. Romanian children incorrectly overgeneralize DOM to inanimate proper names and inanimate descriptive DPs until age 9. The vulnerability of animacy is predicted by its variable behaviour with respect to object marking as well as by the current increase in the use of clitic doubling, a DOM marker less sensitive to animacy. On the learnability side, we account for the findings in terms of Biberauer & Roberts’ (2015, 2017) Maximize Minimal Means model. We suggest that, in accordance with the Feature Economy bias, Romanian children first identify only the role of referential stability (which has more robust cues in the input) and consider the possibility of animacy as a relevant feature later. In line with the Input Generalization bias, children maximize the role of referential stability which results in overgeneralization of DOM to inanimate objects, especially to inanimate proper names.
APA, Harvard, Vancouver, ISO, and other styles
49

Ma, Jie, Dongyan Pei, Xuhan Zhang, et al. "The Distribution of DOM in the Wanggang River Flowing into the East China Sea." International Journal of Environmental Research and Public Health 19, no. 15 (2022): 9219. http://dx.doi.org/10.3390/ijerph19159219.

Full text
Abstract:
Dissolved organic matter (DOM) is a central component in the biogeochemical cycles of marine and terrestrial carbon pools, and its structural features greatly impact the function and behavior of ecosystems. In this study, the Wanggang River, which is a seagoing river that passes through Yancheng City, was selected as the research object. Three-dimensional (3D) fluorescence spectral data and UV–visible spectral data were used for component identification and source analysis of DOM based on the PARAFAC model. The results showed that the DOM content of the Wanggang River during the dry season was significantly higher than during the wet season; the DOM content increased gradually from the upper to lower reaches; the proportion of terrigenous components was higher during the wet season than during the dry. UV–Vis spectral data a280 and a355 indicated that the relative concentrations of protein-like components in the DOM of the Wanggang River were higher than those of humic-like components, and the ratio of aromatic substances in the DOM of the Wanggang River water was higher during the wet season. The DOM in the Wanggang River was dominated by protein-like components (>60%), and the protein-like components were dominated by tryptophan proteins (>40%). This study showed that the temporal and spatial distributions of DOM in rivers can be accurately determined using 3D fluorescence spectroscopy combined with the PARAFAC model. This provides useful insight into the biogeochemical process of DOM in rivers of coastal areas.
APA, Harvard, Vancouver, ISO, and other styles
50

Coussé, Evie, Yvonne Adesam, Faton Rekathati, and Aleksandrs Berdicevskis. "Hur används de, dem och dom i nutida skriftspråk? En storskalig korpusundersökning av nyheter och sociala medier." Språk och stil 33 (March 15, 2024): 39–70. https://doi.org/10.61965/sos.v33i.18928.

Full text
Abstract:
This study ties in with a longstanding debate on the Swedish spelling variants de, dem and dom for personal pronouns (third person plural) and definite articles (plural). It charts the usage of de, dem and dom in five large corpora with news and social media texts over the past 25 years. The corpora contain more than 1.5 billion tokens, which rules out manual handling of the data. Instead, this study makes use of computational methods (including an AI language model) to automatically identify and classify relevant observations. Analysis of the news corpora shows a relatively stable usage of de, dem and dom over the past 25 years. The forms de and dem are predominantly used according to the norm: de for pronouns in subject position and as a definite article; dem for pronouns in object position. The colloquial form dom is hardly found in news texts. Analysis of the social media corpora shows more variation and change. The colloquial form dom is used in 5–25% of all instances instead of de or dem and has decreased after an initial rise. The forms de and dem are sometimes used in a non-standard way: de occurs in object position in 4–10% of the observations; dem is found in subject position or as a definite article in 1–7% of the cases. Non-standard dem is potentially on the rise with younger writers. The corpus analysis also provides details on the usage of de and dem in relative clauses, and on the users’ ratings of posts containing de, dem and dom on the social media platform Reddit.
APA, Harvard, Vancouver, ISO, and other styles