Journal articles on the topic 'Document network'

Consult the top 50 journal articles for your research on the topic 'Document network.'

1

Zhang, Ce, and Hady W. Lauw. "Topic Modeling on Document Networks with Adjacent-Encoder." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 6737–45. http://dx.doi.org/10.1609/aaai.v34i04.6152.

Abstract:
Oftentimes documents are linked to one another in a network structure, e.g., academic papers cite other papers, and Web pages link to other pages. In this paper we propose a holistic topic model to learn meaningful and unified low-dimensional representations for networked documents that seek to preserve both textual content and network structure. On the basis of reconstructing not only the input document but also its adjacent neighbors, we develop two neural encoder architectures. Adjacent-Encoder, or AdjEnc, induces competition among documents for topic propagation, and reconstruction among neighbors for semantic capture. Adjacent-Encoder-X, or AdjEnc-X, extends this to also encode the network structure in addition to document content. We evaluate our models on real-world document networks quantitatively and qualitatively, outperforming comparable baselines comprehensively.
2

Noel, Steven, Chee-Hung Henry Chu, and Vijay Raghavan. "Co-Citation Count vs Correlation for Influence Network Visualization." Information Visualization 2, no. 3 (September 2003): 160–70. http://dx.doi.org/10.1057/palgrave.ivs.9500049.

Abstract:
Visualization of author or document influence networks as a two-dimensional image can provide key insights into the direct influence of authors or documents on each other in a document collection. The influence network is constructed based on the minimum spanning tree, in which the nodes are documents and an edge is the most direct influence between two documents. Influence network visualizations have typically relied on co-citation correlation as a measure of document similarity. That is, the similarity between two documents is computed by correlating the sets of citations to each of the two documents. In a different line of research, co-citation count (the number of times two documents are jointly cited) has been applied as a document similarity measure. In this work, we demonstrate the impact of each of these similarity measures on the document influence network. We provide examples, and analyze the significance of the choice of similarity measure. We show that correlation-based visualizations exhibit chaining effects (low average vertex degree), a manifestation of multiple minor variations in document similarities. These minor similarity variations are absent in count-based visualizations. The result is that count-based influence network visualizations are more consistent with the intuitive expectation of authoritative documents being hubs that directly influence large numbers of documents.
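The two similarity measures contrasted in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical binary matrix `citations` in which rows are citing papers and columns are cited documents; the function names and the toy data are illustrative and are not taken from the paper.

```python
import numpy as np

def cocitation_count(citations: np.ndarray) -> np.ndarray:
    """Number of citing papers that cite both document i and document j.

    citations[p, d] is 1 if citing paper p cites document d, else 0
    (an assumed layout, chosen for this sketch).
    """
    counts = citations.T @ citations
    np.fill_diagonal(counts, 0)  # a document is not co-cited with itself
    return counts

def cocitation_correlation(citations: np.ndarray) -> np.ndarray:
    """Pearson correlation between the citation columns of two documents.

    Columns with no variation (cited by all or by no papers) give NaN.
    """
    return np.corrcoef(citations, rowvar=False)

# Toy data: 4 citing papers (rows), 3 cited documents (columns).
citations = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
])
counts = cocitation_count(citations)
corr = cocitation_correlation(citations)
```

In this toy example documents 0 and 1 have the highest co-citation count, yet their correlation is negative, which shows how the two measures can rank the same pair differently.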
3

Yerokhin, A. L., and O. V. Zolotukhin. "Fuzzy probabilistic neural network in document classification tasks." Information extraction and processing 2018, no. 46 (December 27, 2018): 68–71. http://dx.doi.org/10.15407/vidbir2018.46.068.

4

Bai, Juho, Inwook Shim, and Seog Park. "MEXN: Multi-Stage Extraction Network for Patent Document Classification." Applied Sciences 10, no. 18 (September 8, 2020): 6229. http://dx.doi.org/10.3390/app10186229.

Abstract:
Each paragraph of a patent document has different content, and the documents themselves are very long. Moreover, patent documents are classified hierarchically with multiple labels. Many works have employed deep neural architectures to classify patent documents. Traditional document classification methods have not represented the characteristics of entire patent documents well because they usually require a fixed input length. To address this issue, we propose a neural network-based classification method for patent documents built on a novel multi-stage feature extraction network (MEXN), which comprises a paragraph encoder and a summarizer for all paragraphs. MEXN analyzes whole documents hierarchically and provides multi-label outputs. Furthermore, MEXN does so with only a marginal increase in computational cost. We demonstrate that the proposed method outperforms current state-of-the-art models in patent document classification tasks with multi-label classification experiments on USPD datasets.
5

Zheng, Jianming, Yupu Guo, Chong Feng, and Honghui Chen. "A Hierarchical Neural-Network-Based Document Representation Approach for Text Classification." Mathematical Problems in Engineering 2018 (2018): 1–10. http://dx.doi.org/10.1155/2018/7987691.

Abstract:
Document representation is widely used in practical application, for example, sentiment classification, text retrieval, and text classification. Previous work is mainly based on the statistics and the neural networks, which suffer from data sparsity and model interpretability, respectively. In this paper, we propose a general framework for document representation with a hierarchical architecture. In particular, we incorporate the hierarchical architecture into three traditional neural-network models for document representation, resulting in three hierarchical neural representation models for document classification, that is, TextHFT, TextHRNN, and TextHCNN. Our comprehensive experimental results on two public datasets, that is, Yelp 2016 and Amazon Reviews (Electronics), show that our proposals with hierarchical architecture outperform the corresponding neural-network models for document classification, resulting in a significant improvement ranging from 4.65% to 35.08% in terms of accuracy with a comparable (or substantially less) expense of time consumption. In addition, we find that the long documents benefit more from the hierarchical architecture than the short ones as the improvement in terms of accuracy on long documents is greater than that on short documents.
6

Chen, Yanping, Qinghua Zheng, Feng Tian, Huan Liu, Yazhou Hao, and Nazaraf Shah. "Exploring open information via event network." Natural Language Engineering 24, no. 2 (October 26, 2017): 199–220. http://dx.doi.org/10.1017/s1351324917000390.

Abstract:
It is a challenging task to discover information from a large amount of data in an open domain. In this paper, an event network framework is proposed to address this challenge. It is in fact an empirical construct for exploring open information, composed of three steps: document event detection, event network construction and event network analysis. First, documents are clustered into document events for reducing the impact of noisy and heterogeneous resources. Secondly, linguistic units (e.g., named entities or entity relations) are extracted from each document event and combined into an event network, which enables content-oriented retrieval. Then, in the final step, techniques such as social network or complex network can be applied to analyze the event network for exploring open information. In the implementation section, we provide examples of exploring open information via event network.
7

Gupta, Akanksha, Ravindra Pratap Narwaria, and Madhav Singh. "Review on Deep Learning Handwritten Digit Recognition using Convolutional Neural Network." International Journal of Recent Technology and Engineering 9, no. 5 (January 30, 2021): 245–47. http://dx.doi.org/10.35940/ijrte.e5287.019521.

Abstract:
In this digital world, everything, including documents and notes, is kept in digital form. Converting these digital documents into processed information is in demand. This process is called handwritten digit recognition (HDR). The scanned document is processed and classified to convert handwritten words into digital text in a computerized font so that everybody can read it properly. This paper discusses classifiers such as KNN, SVM, and CNN that are used for HDR. These classifiers are trained on a predefined dataset and then used to convert any scanned document into a computer document format. The scanned document is passed through four different stages for recognition, in which the image is preprocessed, segmented, and then recognized by the classifier. The MNIST dataset is used for training. The complete CNN classifier is discussed in this paper. It is found that CNN is very accurate for HDR, but there is still scope to improve performance in terms of accuracy, complexity, and timing.
8

Yang, Ji Ying, Bei Zhang, and Yu Mao. "Study on Information Retrieval Sorting Algorithm in Network-Based Manufacturing Environment." Applied Mechanics and Materials 484-485 (January 2014): 183–86. http://dx.doi.org/10.4028/www.scientific.net/amm.484-485.183.

Abstract:
The core problem of information retrieval is to retrieve, from a document collection, the subset of documents most relevant to the user. This relies on a sorting algorithm that ranks the search results by relevance and returns the sorted results as the response to the user's query. Information retrieval performance is determined by many factors, such as the quality of the query expression, indexing, stemming, stop-word removal, and query expansion technology, but fundamentally it is determined by the sort function. The sort function measures, by some standard, how well a document matches the user's query and accordingly judges the document's relevance to the user; the documents are then arranged in descending order of relevance, and the ordered list is returned as the retrieval result. The quality of the sorting algorithm directly affects the efficiency of retrieval.
9

Baranyi, Peter, Laszlo T. Koczy, and Tamas D. Gedeon. "Improved Fuzzy and Neural Network Algorithms for Word Frequency Prediction in Document Filtering." Journal of Advanced Computational Intelligence and Intelligent Informatics 2, no. 3 (June 20, 1998): 88–95. http://dx.doi.org/10.20965/jaciii.1998.p0088.

Abstract:
With very large document collections or high-volume document streams, finding relevant documents is a major information filtering problem. An aid to information retrieval systems produces a word frequency measure estimated from important parts of the document using neural network approaches. In this paper, a fuzzy logic technique and, as its simplified case, a neural network algorithm are proposed for this task. The comparison of these two and an alternative neural network algorithm is discussed.
10

Tripathi, Kshitij, Rajendra G. Vyas, and Anil K. Gupta. "Document Classification Using Artificial Neural Network." Asian Journal of Computer Science and Technology 8, no. 2 (May 5, 2019): 55–58. http://dx.doi.org/10.51983/ajcst-2019.8.2.2140.

Abstract:
Document classification is a field of data mining in which the data are represented using the bag-of-words (BoW) or document vector model, and the task is to build a machine that, after successfully learning the characteristics of a given data set, predicts the category of the document to which a word vector belongs. In this approach, a document is represented by BoW, where every word that occurs in the document is used as a feature. This article presents an artificial neural network approach that is a hybrid of n-fold cross-validation and the training-validation-test approach for classification of data.
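The bag-of-words representation mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming whitespace tokenisation and lowercasing; the helper names `build_vocabulary` and `bow_vector` are hypothetical and do not come from the article.

```python
from collections import Counter

def build_vocabulary(documents):
    """Map every distinct word in the corpus to a feature index (sorted order)."""
    words = sorted({w for doc in documents for w in doc.lower().split()})
    return {word: index for index, word in enumerate(words)}

def bow_vector(document, vocabulary):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(document.lower().split())
    # Dict iteration follows insertion order, which matches the feature indices.
    return [counts.get(word, 0) for word in vocabulary]

documents = ["the cat sat", "the dog sat on the mat"]
vocabulary = build_vocabulary(documents)
vectors = [bow_vector(doc, vocabulary) for doc in documents]
```

Each resulting vector has one entry per vocabulary word, so it can be fed to a classifier such as a neural network as a fixed-length input.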
11

Lee, Kangwook, Sanggyu Han, and Sung-Hyon Myaeng. "A discourse-aware neural network-based text model for document-level text classification." Journal of Information Science 44, no. 6 (December 4, 2017): 715–35. http://dx.doi.org/10.1177/0165551517743644.

Abstract:
Capturing semantics scattered across an entire text is one of the important issues for Natural Language Processing (NLP) tasks. It would be particularly critical with long text embodying a flow of themes. This article proposes a new text modelling method that can handle thematic flows of text with Deep Neural Networks (DNNs) in such a way that discourse information and distributed representations of text are incorporated. Unlike previous DNN-based document models, the proposed model enables discourse-aware analysis of text and composition of sentence-level distributed representations guided by the discourse structure. More specifically, our method identifies Elementary Discourse Units (EDUs) and their discourse relations in a given document by applying Rhetorical Structure Theory (RST)-based discourse analysis. The result is fed into a tree-structured neural network that reflects the discourse information including the structure of the document and the discourse roles and relation types. We evaluate the document model for two document-level text classification tasks, sentiment analysis and sarcasm detection, with comparisons against the reference systems that also utilise discourse information. In addition, we conduct additional experiments to evaluate the impact of neural network types and adopted discourse factors on modelling documents vis-à-vis the two classification tasks. Furthermore, we investigate the effects of various learning methods and input units on the quality of the proposed discourse-aware document model.
12

Khashman, Adnan, and Boran Sekeroglu. "Document Image Binarisation Using a Supervised Neural Network." International Journal of Neural Systems 18, no. 05 (October 2008): 405–18. http://dx.doi.org/10.1142/s0129065708001671.

Abstract:
Advances in digital technologies have allowed us to generate more images than ever. Images of scanned documents are examples of these images and form a vital part of digital libraries and archives. Scanned degraded documents contain background noise and varying contrast and illumination; therefore, document image binarisation must be performed in order to separate foreground from background layers. Image binarisation is performed using either local adaptive thresholding or global thresholding, with local thresholding generally considered more successful. This paper presents a novel approach to global thresholding, where a neural network is trained using local threshold values of an image in order to determine an optimum global threshold value, which is used to binarise the whole image. The proposed method is compared with five local thresholding methods, and the experimental results indicate that our method is computationally cost-effective and capable of binarising scanned degraded documents with superior results.
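The relationship between local threshold values and a single global threshold can be sketched roughly as below. This is not the paper's method: the trained neural network is replaced here by a plain median of block-mean thresholds, purely as a stand-in, and the function names, the block size, and the assumption of dark ink on a light background are all illustrative.

```python
import numpy as np

def local_block_thresholds(image: np.ndarray, block: int) -> np.ndarray:
    """Mean intensity of each non-overlapping block x block tile.

    Rows and columns that do not fill a whole tile are ignored.
    """
    h, w = image.shape
    trimmed = image[: h - h % block, : w - w % block]
    tiles = trimmed.reshape(h // block, block, w // block, block)
    return tiles.mean(axis=(1, 3))

def global_threshold_from_local(local_thresholds: np.ndarray) -> float:
    """Stand-in for the learned mapping: median of the local thresholds."""
    return float(np.median(local_thresholds))

def binarise_global(image: np.ndarray, threshold: float) -> np.ndarray:
    """Foreground mask: True where a pixel is at or below the threshold (ink)."""
    return image <= threshold

# Toy 4x4 greyscale image with dark (ink) and bright (paper) regions.
image = np.array([
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [220, 220, 30, 30],
    [220, 220, 30, 30],
], dtype=float)
local = local_block_thresholds(image, block=2)
threshold = global_threshold_from_local(local)
mask = binarise_global(image, threshold)
```

Applying one global value keeps binarisation to a single comparison per pixel, which is the computational appeal of global thresholding that the abstract points to.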
13

Cheng, Wen Zhi, Yi Yang, Liao Zhang, and Lian Li. "Optimization for Web-Based Online Document Management." Advanced Materials Research 756-759 (September 2013): 1135–40. http://dx.doi.org/10.4028/www.scientific.net/amr.756-759.1135.

Abstract:
In this paper, we construct a web-based document life-cycle management model. The model manages the documents that make up the institute library, from their creation to the archive state. For an online office system, we aim at solving three issues: network delay, version storage problems and deletion strategy. To solve network delay, we propose a synchronized editing model for both local and online documents. In addition, we combine the longest recursive chain with recursive chain time to optimize the system response time. To optimize the selection of documents to be deleted, we propose a two-step optimization method. In the performance test, the method is confirmed to be effective in solving these document management issues.
14

Chakravarty, Aniv, and Jagadish S. Kallimani. "Unsupervised Multi-Document Abstractive Summarization Using Recursive Neural Network with Attention Mechanism." Journal of Computational and Theoretical Nanoscience 17, no. 9 (July 1, 2020): 3867–72. http://dx.doi.org/10.1166/jctn.2020.8976.

Abstract:
Text summarization is an active field of research with the goal of providing short and meaningful gists from large amounts of text. Extractive text summarization methods, in which text is extracted from the documents to build summaries, have been extensively studied. Multi-document collections come in various types, ranging across formats, domains, and topics. With the recent advancement in technology and the use of neural networks for text generation, research interest in abstractive text summarization has increased significantly. The use of graph-based methods that handle semantic information has shown significant results. Given a set of English text documents, we use an abstractive method and predicate argument structures to retrieve the necessary text information and pass it through a neural network for text generation. Recurrent neural networks are a subtype of recursive neural networks which try to predict the next sequence based on the current state and the information from previous states. The use of neural networks allows generation of summaries for long text sentences as well. This paper implements a semantic-based filtering approach using a similarity matrix while keeping all stop-words. The similarity is calculated using semantic concepts and Jiang–Conrath similarity, and a recurrent neural network with an attention mechanism is used to generate the summary. The ROUGE score is used to measure accuracy, precision, and recall.
15

Kreesuradej, Worapoj, and Apinya Suwanlamai. "Document Clustering with Pairwise Constraints." International Journal of Pattern Recognition and Artificial Intelligence 20, no. 02 (March 2006): 241–54. http://dx.doi.org/10.1142/s0218001406004636.

Abstract:
This paper proposes document clustering using a Kohonen neural network with pairwise constraints. The algorithm works directly on textual information without mapping a document into a representation that has quantitative features. The input level of the proposed neural network can directly receive a qualitative value without mapping it to a numerical value. The proposed neural network is based on the architecture of the text-processing Kohonen neural network, the concept of a dissimilarity measure for symbolic objects, and the concept of pairwise constraints. As a result, the model can successfully assign cluster labels to the objects.
16

Shevtsov, Vadim, and Evgeny Abramov. "Comparison of Requirements of Regulators of the Russian Federation and the United States of America to Automatic Control Systems of Technological Processes of Critical Objects." NBI Technologies, no. 2 (October 2019): 29–34. http://dx.doi.org/10.15688/nbit.jvolsu.2019.2.5.

Abstract:
There are requirements for AMS TP CO based on federal laws, the presidential decree, the FSTEC decree and documents, other information security conceptions, and state standards in the Russian Federation, and requirements for ICS based on other security strategies, the FISMA Act, NIST, IEEE, GAO documents and others in the United States of America. The main FSTEC document for information security of AMS TP CO is the decree of March 14, 2014 no. 31. The main FIPS document for information security of ICS is Special Publication 800-82 Rev 2. The FIPS document Special Publication 800-82 Rev 2 is more detailed than the FSTEC document in its treatment of control levels, components, and objects of ICS protection. The Special Publication includes base operations in ICS main components and the enterprise network, an ICS system review, and a description of the features of critical objects in the USA. Special Publication 800-82 Rev 2 shows higher quality than the FSTEC document. It has more pages, more system descriptions, recommendations for the life cycle of the ICS security system, and a lot of network content.
17

Ashari, Ahmad, and Mardhani Riasetiawan. "Document Summarization using TextRank and Semantic Network." International Journal of Intelligent Systems and Applications 9, no. 11 (November 8, 2017): 26–33. http://dx.doi.org/10.5815/ijisa.2017.11.04.

18

Sun, Xingping, Yibing Li, Hongwei Kang, and Yong Shen. "Automatic Document Classification Using Convolutional Neural Network." Journal of Physics: Conference Series 1176 (March 2019): 032029. http://dx.doi.org/10.1088/1742-6596/1176/3/032029.

19

Sebestyén, Viktor, Endre Domokos, and János Abonyi. "Multilayer network based comparative document analysis (MUNCoDA)." MethodsX 7 (2020): 100902. http://dx.doi.org/10.1016/j.mex.2020.100902.

20

Heatwole, E. "Processing document images on the telco network." IEEE Communications Magazine 31, no. 1 (January 1993): 40–44. http://dx.doi.org/10.1109/35.180072.

21

Arnia, Fitri, Khairun Saddami, and Khairul Munadi. "DCNet: Noise-Robust Convolutional Neural Networks for Degradation Classification on Ancient Documents." Journal of Imaging 7, no. 7 (July 12, 2021): 114. http://dx.doi.org/10.3390/jimaging7070114.

Abstract:
Analysis of degraded ancient documents is challenging due to the severity and combination of degradation present in a single image. Ancient documents also suffer from additional noise during the digitalization process, particularly when digitalization is done using low-specification devices and/or under poor illumination conditions. Noise on degraded ancient documents makes document analysis more difficult. In this paper, we propose a new noise-robust convolutional neural network (CNN) architecture for degradation classification of noisy ancient documents, which is called a degradation classification network (DCNet). DCNet was constructed based on the ResNet101, MobileNetV2, and ShuffleNet architectures. Furthermore, we propose a new self-transition layer following DCNet. We trained the DCNet using (1) noise-free document images and (2) heavy-noise (zero mean Gaussian noise (ZMGN) and speckle) document images. Then, we tested the resulting models with document images containing different levels of ZMGN and speckle noise. We compared our results to three CNN benchmarking architectures, namely MobileNet, ShuffleNet, and ResNet101. In general, the proposed architecture performed better than MobileNet, ShuffleNet, ResNet101, and conventional machine learning (support vector machine and random forest), particularly for documents with heavy noise.
22

Cheng, Yan, Z. Ye, M. Wang, and Q. Zhang. "Document Classification Based on Convolutional Neural Network and Hierarchical Attention Network." Neural Network World 29, no. 2 (2019): 83–98. http://dx.doi.org/10.14311/nnw.2019.29.007.

23

Yoon, Yeo Chan, Hyung Kuen Gee, and Heuiseok Lim. "Network-Based Document Clustering Using External Ranking Loss for Network Embedding." IEEE Access 7 (2019): 155412–23. http://dx.doi.org/10.1109/access.2019.2948662.

24

Ping, Deng Li, Guo Bing, and Zheng Wen. "Web Service Clustering Approach Based on Network and Fused Document-Based and Tag-Based Topics Similarity." International Journal of Web Services Research 18, no. 3 (July 2021): 63–81. http://dx.doi.org/10.4018/ijwsr.2021070104.

Abstract:
Producing a web service clustering that satisfies many requirements is challenging. In this article, the authors propose a new approach with two models for the service clustering problem. First, a document-tag LDA model (DTag-LDA) is proposed that considers the tag information of web services, since tags can accurately describe the useful information in documents. Building on the first model, the article further proposes an efficient document weight and tag weight-LDA model (DTw-LDA), which fuses a multi-modal data network. To further improve clustering accuracy, the model constructs separate networks describing text and tags and then merges the two networks to generate the web service network that is clustered. In addition, the article designs experiments to verify, through service classification, that the auxiliary information helps to extract more accurate semantics. The proposed method shows clear advantages in precision, recall, purity, and other performance measures.
25

Nikolentzos, Giannis, Antoine Tixier, and Michalis Vazirgiannis. "Message Passing Attention Networks for Document Understanding." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 8544–51. http://dx.doi.org/10.1609/aaai.v34i05.6376.

Abstract:
Graph neural networks have recently emerged as a very effective framework for processing graph-structured data. These models have achieved state-of-the-art performance in many tasks. Most graph neural networks can be described in terms of message passing, vertex update, and readout functions. In this paper, we represent documents as word co-occurrence networks and propose an application of the message passing framework to NLP, the Message Passing Attention network for Document understanding (MPAD). We also propose several hierarchical variants of MPAD. Experiments conducted on 10 standard text classification datasets show that our architectures are competitive with the state-of-the-art. Ablation studies reveal further insights about the impact of the different components on performance. Code is publicly available at: https://github.com/giannisnik/mpad.
26

Abdul Rauf, Siti Hajar, Siti Hajar Abu Bakar Ah, and Adi Fahrudin. "Social Change PostCOVID-19 in Malaysia: The Density of Social Network." Asian Social Work Journal 5, no. 2 (July 27, 2020): 1–5. http://dx.doi.org/10.47405/aswj.v5i2.136.

Abstract:
The COVID-19 pandemic is a global health problem that has posed the greatest challenge to humanity today. This paper discusses the social changes that have taken place in social network density after COVID-19 hit the world. The social network density discussed is based on Social Network Theory, applied to the current situation in Malaysia. The methods used are document analysis and case analysis of official documents issued by the government. The analysis revealed that the COVID-19 pandemic has had a devastating impact on human health, society, and the economy. However, seen from the standpoint of social networks, the COVID-19 pandemic has led to the emergence of densities in social networks due to increased informal sector involvement in the formation of social networks. This means that, as more social networks are formed, the density of the social network will increase, as defined by Social Network Theory.
27

Meng, Zu Qiang, Shi Mo Shen, and Qiu Lian Chen. "A Network Decomposition-Based Text Clustering Algorithm for Topic Detection." Applied Mechanics and Materials 239-240 (December 2012): 1318–23. http://dx.doi.org/10.4028/www.scientific.net/amm.239-240.1318.

Abstract:
Text clustering is one of the most popular topic detection techniques. However, existing text clustering approaches require each document to be assigned to one and only one cluster. This is not reasonable in some cases, because some documents should not be used to constitute topics. This paper first models a text document set as a network and designs a method for decomposing such a network, and then proposes an original text clustering algorithm for topic detection, called the network decomposition-based text clustering algorithm for topic detection (NDTCATD). The proposed algorithm ensures that meaningless documents cannot be used to constitute topics. Experimental results show that NDTCATD is much better than the bisecting k-means algorithm in terms of overall similarity and average cluster similarity. The proposed algorithm is therefore reasonable and effective and is especially suitable for topic detection.
28

Yuliana, Dyan, Purwanto, and Catur Supriyanto. "Klasifikasi Teks Pengaduan Masyarakat Dengan Menggunakan Algoritma Neural Network." Jurnal KomtekInfo 5, no. 3 (April 1, 2019): 92–116. http://dx.doi.org/10.35134/komtekinfo.v5i3.35.

Abstract:
The development of the Internet has led to a flood of digital information. Various information can be obtained easily just by clicking or pressing 'enter'. The dissemination of information in the form of digital documents has experienced unprecedented growth. One website intended for the general public in the province of Central Java is named 'Lapor Gub!'. This site was created to accommodate aspirations, concerns, and complaints about public services, as well as dissatisfaction with the performance of local government and the provincial government of Central Java. The recent significant increase in the number of documents in text format has made the process of grouping documents (document classification) important. Using text classification, this overwhelming number of documents can be organized so as to facilitate and accelerate the search for needed information. The experiments in this study aimed to classify Indonesian-language text documents using a neural network algorithm. The test used a sample of text documents taken from the web-based electronic mass medium 'Lapor Gub!'. The experimental results show that the neural network method can be used effectively to classify the texts of public complaints: the neural network algorithm produced high accuracy, 43.00%, with a running time of 3 hours 45 minutes 14 seconds on the Indonesian-language public complaint texts.
29

Nasir, Inzamam Mashood, Muhammad Attique Khan, Mussarat Yasmin, Jamal Hussain Shah, Marcin Gabryel, Rafał Scherer, and Robertas Damaševičius. "Pearson Correlation-Based Feature Selection for Document Classification Using Balanced Training." Sensors 20, no. 23 (November 27, 2020): 6793. http://dx.doi.org/10.3390/s20236793.

Abstract:
Documents are stored in a digital form across several organizations. Printing this amount of data and placing it into folders instead of storing digitally is against the practical, economical, and ecological perspective. An efficient way of retrieving data from digitally stored documents is also required. This article presents a real-time supervised learning technique for document classification based on deep convolutional neural network (DCNN), which aims to reduce the impact of adverse document image issues such as signatures, marks, logo, and handwritten notes. The proposed technique’s major steps include data augmentation, feature extraction using pre-trained neural network models, feature fusion, and feature selection. We propose a novel data augmentation technique, which normalizes the imbalanced dataset using the secondary dataset RVL-CDIP. The DCNN features are extracted using the VGG19 and AlexNet networks. The extracted features are fused, and the fused feature vector is optimized by applying a Pearson correlation coefficient-based technique to select the optimized features while removing the redundant features. The proposed technique is tested on the Tobacco3482 dataset, which gives a classification accuracy of 93.1% using a cubic support vector machine classifier, proving the validity of the proposed technique.
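The idea of using Pearson correlation to drop redundant features from a fused feature vector can be sketched as follows. This is only one plausible reading, not the article's procedure: the greedy keep-or-drop rule, the function name `select_uncorrelated`, and the `max_abs_corr` cutoff are assumptions made for illustration.

```python
import numpy as np

def select_uncorrelated(features: np.ndarray, max_abs_corr: float = 0.9) -> list:
    """Greedily keep feature columns that are not highly correlated with kept ones.

    features has shape (n_samples, n_features). A column is dropped when its
    absolute Pearson correlation with any already kept column exceeds
    max_abs_corr (an assumed cutoff).
    """
    corr = np.abs(np.corrcoef(features, rowvar=False))
    kept = []
    for j in range(features.shape[1]):
        if all(corr[j, k] <= max_abs_corr for k in kept):
            kept.append(j)
    return kept

# Toy fused features: column 1 is exactly twice column 0, so it is redundant.
fused = np.array([
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 3.0],
    [3.0, 6.0, 8.0],
    [4.0, 8.0, 1.0],
])
kept_columns = select_uncorrelated(fused)
selected = fused[:, kept_columns]
```

The reduced matrix `selected` could then be passed to a classifier such as a support vector machine; constant columns would need to be removed beforehand, since their correlation is undefined.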
30

Tang, Pingjie, Meng Jiang, Bryan (Ning) Xia, Jed W. Pitera, Jeffrey Welser, and Nitesh V. Chawla. "Multi-Label Patent Categorization with Non-Local Attention-Based Graph Convolutional Network." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 9024–31. http://dx.doi.org/10.1609/aaai.v34i05.6435.

Abstract:
Patent categorization, which assigns multiple International Patent Classification (IPC) codes to a patent document, relies heavily on expert effort, as it requires substantial domain knowledge. When formulated as a multi-label text classification (MTC) problem, it poses two challenges to existing models: one is to learn effective document representations from text content; the other is to model the cross-section behavior of the label set. In this work, we propose a label attention model based on a graph convolutional network. It jointly learns the document-word associations and word-word co-occurrences to generate rich semantic embeddings of documents. It employs a non-local attention mechanism to learn label representations in the same space as the document representations for multi-label classification. On a large CIRCA patent database, we evaluate the performance of our model and as many as seven competitive baselines. We find that our model outperforms all of these prior state-of-the-art baselines by a large margin and achieves high performance on P@k and nDCG@k.
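For readers unfamiliar with the two metrics named at the end of this abstract, here is a small sketch of P@k and nDCG@k for a single document with binary relevance, using the standard textbook definitions. The IPC codes and the ranking are made up for illustration, and the paper's exact evaluation protocol may differ.

```python
import math


def precision_at_k(ranked_labels, relevant, k):
    """Fraction of the top-k predicted labels that are truly relevant."""
    top_k = ranked_labels[:k]
    return sum(1 for label in top_k if label in relevant) / k


def ndcg_at_k(ranked_labels, relevant, k):
    """Binary-relevance nDCG@k: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, label in enumerate(ranked_labels[:k])
        if label in relevant
    )
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical model output: labels sorted by predicted score, best first.
ranked = ["G06F", "H04L", "G06N", "A61K"]
# Hypothetical ground-truth label set for the same patent.
relevant = {"G06F", "G06N"}

print(precision_at_k(ranked, relevant, 1))
print(precision_at_k(ranked, relevant, 3))
print(ndcg_at_k(ranked, relevant, 3))
```

P@k only counts hits in the top k, whereas nDCG@k also rewards placing the correct labels nearer the top of the ranking.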
31

Hu, Yu. "Neural Network Computing Based Text Mining and its Application." Advanced Materials Research 542-543 (June 2012): 1443–46. http://dx.doi.org/10.4028/www.scientific.net/amr.542-543.1443.

Abstract:
With the extensive number of documents that firms must process and with the extensive amounts of information available on the Internet, an automated intelligent method is needed to sort through all available documents or sites. It would be beneficial to start with grouping similar documents or sites together based on similarities. Classifying documents and sorting them into categories could be beneficial since in most cases, no one user would be interested in all the different categories of documents at the same time. Classifying a large number of documents would also make it easier to locate a specific document. In this paper, neural network computing based text mining (TM) and its application will be discussed.
32

Zhou, Guo Hong. "Application of Text Mining Based on Neural Network Computing." Applied Mechanics and Materials 278-280 (January 2013): 1972–75. http://dx.doi.org/10.4028/www.scientific.net/amm.278-280.1972.

Abstract:
Today, with the number of documents that firms must process and the extensive amount of information available on the Internet, an automated intelligent method is needed to sort through all available documents or sites. It would be beneficial to start with grouping similar documents or sites together based on similarities. Classifying documents and sorting them into categories could be beneficial since in most cases, no one user would be interested in all the different categories of documents at the same time. Classifying a large number of documents would also make it easier to locate a specific document. In this paper, neural network computing based text mining (TM) and its application will be discussed.
33

Jung, Won-Chi, and Namje Park. "A Safe Web in Network Separation Environment." Journal of Computational and Theoretical Nanoscience 17, no. 7 (July 1, 2020): 3243–49. http://dx.doi.org/10.1166/jctn.2020.9168.

Abstract:
This paper describes a technique for constructing network separation and proposes a method that uses it to provide convenience to users. The paper analyzes the background of the Korean government’s adoption of the network separation policy and reviews the various techniques for configuring network separation. NAC, a media control solution, and a file transfer system are essential for network separation, because simply running two networks does not guarantee security. The paper considers ways to provide efficiency and convenience to the extent allowed by the network separation policy. Using a headless browser, web pages are transformed into electronic documents, and the converted electronic document does not contain malicious code. Several headless browser packages were used for implementation and testing under an assumed virtual network separation. The web scraping technology worked reliably, and almost all browsers can read and display PDF files. In this paper, we have defined a safe web prototype in a network separation environment.
34

Wan, Xing. "Simple Encryption Research Based on Heterogeneous System." Applied Mechanics and Materials 519-520 (February 2014): 202–5. http://dx.doi.org/10.4028/www.scientific.net/amm.519-520.202.

Abstract:
With the widespread application of XML technology in networks, the security of XML documents has become particularly prominent. XML encryption technology can effectively ensure the confidentiality and integrity of an XML document and provides important technical support for the safe transmission of XML documents. This paper first introduces XML encryption technology, the encryption standard, and three models of XML document encryption, and then gives three simple implementations of XML document encryption on the .NET Framework platform. These provide a preliminary guarantee of XML document safety and show the flexibility and simplicity of XML encryption technology.
35

Bergmann, David. "Predicting Document Accesses with A Self-Enforcing Network." Journal of Computers 14, no. 7 (2019): 438–50. http://dx.doi.org/10.17706/jcp.14.7.438-450.

36

Bergmann, David. "Predicting Document Accesses with a Self-enforcing Network." Journal of Computers 14, no. 8 (2019): 528–40. http://dx.doi.org/10.17706/jcp.14.8.528-540.

37

Hung, Chihli, and Stefan Wermter. "Neural Network Based Document Clustering Using WordNet Ontologies." International Journal of Hybrid Intelligent Systems 1, no. 3-4 (January 19, 2005): 127–42. http://dx.doi.org/10.3233/his-2004-13-402.

38

Subba, Sanjeev, Nawaraj Paudel, and Tej Bahadur Shahi. "Nepali Text Document Classification Using Deep Neural Network." Tribhuvan University Journal 33, no. 1 (June 30, 2019): 11–22. http://dx.doi.org/10.3126/tuj.v33i1.28677.

Abstract:
Automated text classification is a well-studied problem in text mining that generally demands the automatic assignment of a label or class to a text document on the basis of its content. To design a computer program that learns a model from training data in order to assign a specific label to an unseen text document, many researchers have applied deep learning technologies. For the Nepali language, this is the first attempt to use deep learning, specifically a Recurrent Neural Network (RNN), and to compare its performance with that of a traditional Multilayer Neural Network (MNN). In this study, Nepali texts were collected from online news portals and then pre-processed and vectorized. Finally, a deep learning classification framework was designed and evaluated in ten experiments: five for the Recurrent Neural Network and five for the Multilayer Neural Network. Comparing the results of the MNN and the RNN, it can be concluded that the RNN outperformed the MNN: the highest accuracy achieved by the MNN is 48%, and the highest accuracy achieved by the RNN is 63%.
39

Schönhofen, Peter. "Identifying document topics using the Wikipedia category network." Web Intelligence and Agent Systems: An International Journal 7, no. 2 (2009): 195–207. http://dx.doi.org/10.3233/wia-2009-0162.

40

Lee, Sang Yup. "Document vectorization method using network information of words." PLOS ONE 14, no. 7 (July 18, 2019): e0219389. http://dx.doi.org/10.1371/journal.pone.0219389.

41

Dang, Quang-Vinh, and GueeSang Lee. "Document Image Binarization Using Multi-scale Fusion Network." Journal of KIISE 46, no. 12 (December 31, 2019): 1314–21. http://dx.doi.org/10.5626/jok.2019.46.12.1314.

42

Khan, M. Shamim, and Sebastian W. Khor. "Web document clustering using a hybrid neural network." Applied Soft Computing 4, no. 4 (September 2004): 423–32. http://dx.doi.org/10.1016/j.asoc.2004.02.003.

43

Ito, Tomoki, Kota Tsubouchi, Hiroki Sakaji, Tatsuo Yamashita, and Kiyoshi Izumi. "Contextual Sentiment Neural Network for Document Sentiment Analysis." Data Science and Engineering 5, no. 2 (May 20, 2020): 180–92. http://dx.doi.org/10.1007/s41019-020-00122-4.

44

Weng, Sung-Shun, and Hui-Ling Chang. "Using ontology network analysis for research document recommendation." Expert Systems with Applications 34, no. 3 (April 2008): 1857–69. http://dx.doi.org/10.1016/j.eswa.2007.02.023.

45

Denoyer, Ludovic, and Patrick Gallinari. "Bayesian network model for semi-structured document classification." Information Processing & Management 40, no. 5 (September 2004): 807–27. http://dx.doi.org/10.1016/j.ipm.2004.04.009.

46

Martinčić-Ipšić, Sanda, Tanja Miličić, and Todorovski. "The Influence of Feature Representation of Text on the Performance of Document Classification." Applied Sciences 9, no. 4 (February 20, 2019): 743. http://dx.doi.org/10.3390/app9040743.

Abstract:
In this paper we perform a comparative analysis of three models for a feature representation of text documents in the context of document classification. In particular, we consider the most often used family of bag-of-words models, the recently proposed continuous space models word2vec and doc2vec, and the model based on the representation of text documents as language networks. While the bag-of-words models have been extensively used for the document classification task, the performance of the other two models for the same task has not been well understood. This is especially true for the network-based models that have been rarely considered for the representation of text documents for classification. In this study, we measure the performance of the document classifiers trained using the method of random forests for features generated with the three models and their variants. Multi-objective rankings are proposed as the framework for multi-criteria comparative analysis of the results. Finally, the results of the empirical comparison show that the commonly used bag-of-words model has a performance comparable to the one obtained by the emerging continuous-space model of doc2vec. In particular, the low-dimensional variants of doc2vec generating up to 75 features are among the top-performing document representation models. The results finally point out that doc2vec shows a superior performance in the tasks of classifying large documents.
47

Chen, Wang, Yifan Gao, Jiani Zhang, Irwin King, and Michael R. Lyu. "Title-Guided Encoding for Keyphrase Generation." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6268–75. http://dx.doi.org/10.1609/aaai.v33i01.33016268.

Abstract:
Keyphrase generation (KG) aims to generate a set of keyphrases given a document, which is a fundamental task in natural language processing (NLP). Most previous methods solve this problem in an extractive manner, while recently, several attempts are made under the generative setting using deep neural networks. However, the state-of-the-art generative methods simply treat the document title and the document main body equally, ignoring the leading role of the title to the overall document. To solve this problem, we introduce a new model called Title-Guided Network (TG-Net) for automatic keyphrase generation task based on the encoder-decoder architecture with two new features: (i) the title is additionally employed as a query-like input, and (ii) a title-guided encoder gathers the relevant information from the title to each word in the document. Experiments on a range of KG datasets demonstrate that our model outperforms the state-of-the-art models with a large margin, especially for documents with either very low or very high title length ratios.
48

Tawdar, A. P., M. S. Bewoor, and S. H. Patil. "Incremental Approach of Neural Network in Back Propagation Algorithms for Web Data Mining." IAES International Journal of Artificial Intelligence (IJ-AI) 6, no. 2 (June 1, 2017): 74. http://dx.doi.org/10.11591/ijai.v6.i2.pp74-78.

Abstract:
Text Classification, also called Text Categorization (TC), is the task of automatically classifying a set of text documents into categories from a predefined set. If a text document belongs to exactly one category, the task is called single-label classification; otherwise, it is called multi-label classification. TC draws on several tools from Information Retrieval (IR) and Machine Learning (ML) and has received much attention in recent decades. This paper first classifies text documents using an MLP-based machine learning approach (BPP) and then returns the most relevant documents. It also describes a proposed back propagation neural network classifier that performs cross validation for the original Neural Network in order to optimize classification accuracy and training time. The proposed web content mining methodology is explored with the aid of BPP. The main objective of this investigation is web document extraction using different grouping algorithms. This work extracts the data from the web URL.
49

Hu, Yang, and Lin Su. "Electronic Document Safe Exchange Technology Based on Identity Cryptography." Advanced Materials Research 433-440 (January 2012): 3353–56. http://dx.doi.org/10.4028/www.scientific.net/amr.433-440.3353.

Abstract:
In this paper, in accordance with the current difficulties in the document and data exchange between different departments of an enterprise, the authors carry on analysis on the identity authentication and authorization problems in the document transmission system at the present time, and design a scheme for the electronic document exchange platform which is based on the cryptography algorithm of identity, and finally put forward an electronic document safe exchange platform based on the identity, which provides a security guarantee for the implementation of the free flow and conversion of the electronic documents among different intranets. In addition, the identity authentication and authorization access control with safety and higher reliability can be realized for the electronic documents transmission system in the applications of network as well.
50

Chen, Yuan, and Guo Biao Ren. "Computer Network Fault Detection Based on Neural Network." Advanced Materials Research 889-890 (February 2014): 1279–83. http://dx.doi.org/10.4028/www.scientific.net/amr.889-890.1279.

Abstract:
This document explains that the neural network has good nonlinear mapping and adaptive capacity and is becoming more and more widely applied in computer network fault detection. The article takes the BP neural network as an example to illustrate how to detect computer network faults.