Academic literature on the topic 'Abstract syntax tree (AST)'

Author: Grafiati

Published: 4 June 2021

Last updated: 1 February 2022

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Abstract syntax tree (AST).'

Journal articles on the topic "Abstract syntax tree (AST)"

1

Shen, Victor R. L. "Novel Code Plagiarism Detection Based on Abstract Syntax Tree and Fuzzy Petri Nets." International Journal of Engineering Education 1, no. 1 (2019): 46–56. http://dx.doi.org/10.14710/ijee.1.1.46-56.

Abstract:

Students who major in computer science and/or engineering are required to write programs in a variety of programming languages. However, many students submit source code obtained from the Internet or from friends with few or no modifications. Detecting code plagiarism by students is very time-consuming, and plagiarism leads to unfair evaluation of learning performance. This paper proposes a novel method to detect source code plagiarism by using a high-level fuzzy Petri net (HLFPN) based on the abstract syntax tree (AST). First, the AST of each source code is generated after the lexical and syntactic analyses have been done. Second, a token sequence is generated based on the AST. Using the AST makes it possible to detect plagiarism that is disguised by changing identifiers or the order of program statements. Finally, the generated token sequences are compared with one another using an HLFPN to determine the code plagiarism. The experimental results indicate that the method improves the detection of code plagiarism.
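The first two steps of this abstract (build an AST, then derive a sequence from it) can be illustrated with a minimal sketch. It is not the paper's method: it assumes Python's standard `ast` module as the parser, uses node type names as the "tokens", and swaps in `difflib.SequenceMatcher` for the HLFPN comparison. The function names `node_type_sequence` and `similarity` are hypothetical. The sketch only shows why identifier renaming does not change an AST-derived sequence; it does not handle reordered statements.

```python
import ast
from difflib import SequenceMatcher


def node_type_sequence(source: str) -> list[str]:
    """Return AST node type names in ast.walk order (identifier names are dropped)."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


def similarity(a: str, b: str) -> float:
    """Compare two programs by their node type sequences.

    SequenceMatcher stands in for the HLFPN comparison, which is not reproduced here.
    """
    return SequenceMatcher(None, node_type_sequence(a), node_type_sequence(b)).ratio()


original = "def area(w, h):\n    return w * h\n"
renamed = "def size(x, y):\n    return x * y\n"   # same structure, new identifiers
different = "for i in range(3):\n    print(i)\n"

print(similarity(original, renamed))
print(similarity(original, different))
```

Because the sequence records node types rather than names, the renamed copy is indistinguishable from the original, while the structurally different program scores lower.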

2

Kaur, Amandeep, and Munish Saini. "Enhancing the Software Clone Detection in BigCloneBench." International Journal of Open Source Software and Processes 12, no. 3 (2021): 17–31. http://dx.doi.org/10.4018/ijossp.2021070102.

Abstract:

In a software system, code snippets that are copied and pasted within the same software or into other software result in cloning. The basic cause of cloning is either a programmer's constraints or language constraints. An increase in the maintenance cost of software is the major drawback of code clones. So, clone detection techniques are required to remove or refactor code clones. Recent studies show that the abstract syntax tree (AST) captures the structural information of source code appropriately. Many researchers have used tree-based convolution to identify clones, but this technique has certain drawbacks. Therefore, in this paper, the authors propose an approach that finds semantic clones through square-based convolution over an abstract syntax representation of the source code. Experimental results show the effectiveness of the approach on the popular BigCloneBench benchmark.

3

Li, Zhiming, Qing Wu, and Kun Qian. "Adabot: Fault-Tolerant Java Decompiler (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 10 (2020): 13861–62. http://dx.doi.org/10.1609/aaai.v34i10.7203.

Abstract:

Reverse engineering has been an extremely important field in software engineering; it helps us to better understand and analyze the internal architecture and interrelations of executables. Classical Java reverse engineering tasks include disassembly and decompilation. Traditional Abstract Syntax Tree (AST) based disassemblers and decompilers are strictly rule-defined and thus highly fault-intolerant when bytecode obfuscation is introduced for security reasons. In this work, we view decompilation as a statistical machine translation task and propose a decompilation framework that is fully based on the self-attention mechanism. Through better adaptation to the linguistic uniqueness of bytecode, our model outperforms rule-based models and previous works based on the recurrence mechanism.

4

Ponomarenko, G. S., and P. G. Klyucharev. "Detection of Obfuscated Javascript Code Based on Abstract Syntax Trees Coloring." Mathematics and Mathematical Modeling, no. 2 (June 9, 2020): 1–24. http://dx.doi.org/10.24108/mathm.0220.0000218.

Abstract:

The paper deals with the problem of obfuscated JavaScript code detection and classification based on Abstract Syntax Tree (AST) coloring. Colors of the AST vertices and edges are assigned with regard to the types of the AST vertices specified by the program's lexical and syntactic structure and the programming language standard. The research involved a few stages. First, a dataset of non-obfuscated JavaScript programs was collected by evaluating public repositories. Second, obfuscated samples were created using eight open-source obfuscators. Classifier models were built using gradient boosting on decision trees (GBDT). We built two types of classifiers. The first is a model that classifies the program according to the type of obfuscator used, i.e. based on which obfuscator created the sample. The second tries to detect samples obfuscated by an obfuscator whose samples are not observed during training. The quality of the obtained models is on par with the known published results. The feature engineering method proposed in the paper does not require a preliminary analysis of the obfuscators and obfuscating transformations. In the final part of the paper we analyze the quality of the estimated models, discussing certain statistical properties of the obfuscated and non-obfuscated samples obtained and the corresponding colored ASTs. Analysis of the generated samples of obfuscated programs has shown that the method proposed in the paper has some limitations. In particular, it is difficult to recognize minifiers or other obfuscating programs that change the lexical structure to a greater extent and the syntax to a lesser extent. To improve the quality of detection of this kind of obscuring transformation, one can build combined classifiers using both the method based on AST coloring and additional information about lexemes and punctuation, for example, the entropy of identifiers and strings, the proportion of characters in upper and lower case, the usage frequency of certain characters, etc.

5

Hosseinpour, Sahereh, Mir Mohammad Reza Alavi Milani, and Hüseyin Pehlivan. "A Step-by-Step Solution Methodology for Mathematical Expressions." Symmetry 10, no. 7 (2018): 285. http://dx.doi.org/10.3390/sym10070285.

Abstract:

In this paper, we propose a methodology for the step-by-step solution of problems, which can be incorporated into a computer algebra system. Our main aim is to show all the intermediate evaluation steps of mathematical expressions from the start to the end of the solution. The first stage of the methodology covers the development of a formal grammar that describes the syntax and semantics of mathematical expressions. Using a compiler generation tool, the second stage produces a parser from the grammar description. The parser is used to convert a particular mathematical expression into an Abstract Syntax Tree (AST), which is evaluated in the third stage by traversing all its nodes. After every evaluation of some nodes, which corresponds to an intermediate solution step of the related expression, the resulting AST is transformed into the corresponding mathematical expression and then displayed. Many other algebra-related issues such as simplification, factorization, distribution and substitution can be covered by the solution methodology. We currently focus on the solutions of various problems associated with the subjects of derivatives, equations, single-variable polynomials, and operations on functions. However, the methodology can easily be extended to cover other subjects of general mathematics.
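The parse, evaluate-some-nodes, display-again loop described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' system: it uses Python's built-in `ast` parser in place of a generated parser, supports only `+`, `-` and `*` on numeric constants, and relies on `ast.unparse` (Python 3.9 or later) to turn the tree back into text. The names `ReduceOnce` and `steps` are hypothetical.

```python
import ast
import operator

# Supported binary operators (division is left out to keep the sketch short).
OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ReduceOnce(ast.NodeTransformer):
    """Fold exactly one BinOp whose two operands are already constants."""

    def __init__(self):
        self.done = False

    def visit_BinOp(self, node):
        self.generic_visit(node)  # try to reduce inside the operands first
        if (not self.done
                and type(node.op) in OPS
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            self.done = True
            value = OPS[type(node.op)](node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value=value, kind=None), node)
        return node


def steps(expr: str) -> list[str]:
    """Return the expression after each single reduction, starting with the input."""
    tree = ast.parse(expr, mode="eval")
    out = [ast.unparse(tree.body)]
    while True:
        reducer = ReduceOnce()
        tree = reducer.visit(tree)
        if not reducer.done:      # nothing left to reduce
            return out
        out.append(ast.unparse(tree.body))


for line in steps("(1 + 2) * (3 + 4)"):
    print(line)
```

Each pass rewrites one subtree and converts the whole AST back to text, which mirrors the idea of displaying an intermediate solution step after every partial evaluation.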

6

Han, KyungHyun, and Seong Oun Hwang. "Lightweight Detection Method of Obfuscated Landing Sites Based on the AST Structure and Tokens." Applied Sciences 10, no. 17 (2020): 6116. http://dx.doi.org/10.3390/app10176116.

Abstract:

Attackers use a variety of techniques to insert redirection JavaScript that leads a user to a malicious webpage, where a drive-by-download attack is executed. In particular, the redirection JavaScript in the landing site is obfuscated to avoid detection systems. In this paper, we propose a lightweight detection system based on static analysis to classify the obfuscation type and to promptly detect the obfuscated redirection JavaScript. The proposed model detects the obfuscated redirection JavaScript by converting the JavaScript into an abstract syntax tree (AST). Then, the structure and token information are extracted. Specifically, we propose a lightweight AST to identify the obfuscation type and the revised term frequency-inverse document frequency to efficiently detect the malicious redirection JavaScript. This approach enables rapid identification of the obfuscated redirection JavaScript and proactive blocking of the webpages that are used in drive-by-download attacks.

7

Xu, Yingjie, Gengran Hu, Lin You, and Chengtang Cao. "A Novel Machine Learning-Based Analysis Model for Smart Contract Vulnerability." Security and Communication Networks 2021 (August 10, 2021): 1–12. http://dx.doi.org/10.1155/2021/5798033.

Abstract:

In recent years, many vulnerabilities in smart contracts have been found. Hackers have used these vulnerabilities to attack the corresponding contracts developed in blockchain systems such as Ethereum, which has caused large economic losses. Therefore, it is very important to find the potential problems of smart contracts and develop more secure smart contracts. As blockchain security events have raised more important issues, more and more smart contract security analysis methods have been developed. Most of these methods are based on traditional static analysis or dynamic analysis methods. There are only a few methods that use emerging technologies, such as machine learning. Some models that use machine learning to detect smart contract vulnerabilities spend much time extracting features manually. In this paper, we introduce a novel machine learning-based analysis model by introducing the shared child nodes for smart contract vulnerabilities. We build the Abstract-Syntax-Tree (AST) for smart contracts with some vulnerabilities from two data sets including SmartBugs and SolidiFI-benchmark. Then, we build the Abstract-Syntax-Tree (AST) of the labeled smart contract for data sets named Smartbugs-wilds. Next, we get the shared child nodes from both of the ASTs to obtain the structural similarity, and then, we construct a feature vector composed of the values that measure structural similarity automatically to build our machine learning model. Finally, we get a KNN model that can predict eight types of vulnerabilities including Re-entrancy, Arithmetic, Access Control, Denial of Service, Unchecked Low Level Calls, Bad Randomness, Front Running, and Denial of Service. The accuracy, recall, and precision of our KNN model are all higher than 90%. In addition, compared with some other analysis tools including Oyente and SmartCheck, our model has higher accuracy. In addition, we spent less time on training.
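The abstract does not define how shared child nodes are turned into a similarity value, so the following is only a rough sketch under explicit assumptions: it parses Python source with the standard `ast` module instead of Solidity contracts, treats each (parent type, child type) pair as a "child node" feature, and uses the Jaccard index of the two pair sets as the structural similarity. The names `parent_child_pairs` and `structural_similarity` are hypothetical.

```python
import ast


def parent_child_pairs(source: str) -> set[tuple[str, str]]:
    """Collect (parent type, child type) pairs for every edge in the AST."""
    pairs = set()
    for parent in ast.walk(ast.parse(source)):
        for child in ast.iter_child_nodes(parent):
            pairs.add((type(parent).__name__, type(child).__name__))
    return pairs


def structural_similarity(a: str, b: str) -> float:
    """Jaccard index over shared parent-child pairs (an assumed measure)."""
    pa, pb = parent_child_pairs(a), parent_child_pairs(b)
    union = pa | pb
    return len(pa & pb) / len(union) if union else 1.0


known_pattern = "def withdraw(balance, amount):\n    send(amount)\n    balance -= amount\n"
candidate = "def pay(total, fee):\n    send(fee)\n    total -= fee\n"
unrelated = "x = [i * i for i in range(10)]\n"

# One similarity value per known pattern could form one entry of a feature vector.
features = [structural_similarity(candidate, known_pattern),
            structural_similarity(candidate, unrelated)]
print(features)
```

A vector of such values, one per reference AST, is the kind of automatically derived feature that could then be passed to a classifier such as KNN.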

8

Yang, Kang, Huiqun Yu, Guisheng Fan, and Xingguang Yang. "Graph embedding code prediction model integrating semantic features." Computer Science and Information Systems 17, no. 3 (2020): 907–26. http://dx.doi.org/10.2298/csis190908027y.

Abstract:

With the advent of Big Code, code prediction has received widespread attention. However, the state-of-the-art code prediction techniques are inadequate in terms of accuracy, interpretability and efficiency. Therefore, in this paper, we propose a graph embedding model that integrates code semantic features. The model extracts the structural paths between the nodes in the source code file's Abstract Syntax Tree (AST). Then, we convert the paths into a training graph and extract interdependent semantic structural features from the context of the AST. Semantic structure features can filter predicted candidate values and effectively solve the Out-of-Vocabulary (OoV) problem. The graph embedding model converts the structural features of nodes into vectors, which facilitates quantitative calculations. Finally, the vector similarity of the nodes is used to complete the prediction tasks of TYPE and VALUE. Experimental results show that, compared with the existing state-of-the-art method, our method has higher prediction accuracy and lower time consumption.

9

Zhou, Zhimin, and Zhongwen Chen. "Split Attention Pointer Network for Source Code Language Modeling." International Journal of Software Engineering and Knowledge Engineering 30, no. 09 (2020): 1221–44. http://dx.doi.org/10.1142/s0218194020500321.

Abstract:

There is a growing interest in leveraging Deep Learning (DL) for automating Software Engineering tasks such as program completion. In this paper, we leverage Recurrent Neural Networks (RNNs) for Abstract Syntax Tree (AST)-based code completion. Our approach converts source code into AST nodes, and a language model predicts the type and value attributes of the next tokens. Our work demonstrates that attention-augmented RNN-based language models are able to understand local context and copy from recent past tokens that have never appeared in the training data set. We observed a drop in performance of both type and value predictions when using a traditional pointer network architecture for out-of-vocabulary (OoV) copying and context understanding, which we call multi-task conflict. To address this challenge, we have devised a new structure of self-attention called Split Attention, where two separate dot-product layers are applied to different parts of the history cache. Based on this structure, we propose a new network called Split Attention Pointer Network (SAPN), which is efficient and flexible in both learning local context and copying OoV tokens from history. The empirical results suggest that our model is superior in syntax-aware generation and OoV token prediction by demonstrating attention behavior similar to human programmers. The results also indicate that our model outperforms previous state-of-the-art approaches by more than 6% on widely recognized program completion benchmarks.

10

Ji, Xiujuan, Lei Liu, and Jingwen Zhu. "Code Clone Detection with Hierarchical Attentive Graph Embedding." International Journal of Software Engineering and Knowledge Engineering 31, no. 06 (2021): 837–61. http://dx.doi.org/10.1142/s021819402150025x.

Abstract:

Code cloning is a typical programming practice that reuses existing code to solve similar programming problems, which greatly facilitates software development but incurs program bugs and maintenance costs. Recently, deep learning-based detection approaches have gradually shown their effectiveness in feature representation and detection performance. Among them, deep learning approaches based on the abstract syntax tree (AST) construct models relying on the node embedding technique. In an AST, the semantics of nodes are clearly hierarchical, and nodes differ considerably in their importance for determining whether two code fragments are clones. However, some approaches do not fully consider the hierarchical structure information of source code. Second, some approaches ignore the different importance of nodes when generating the features of source code. Third, when the tree is very large and deep, many approaches are vulnerable to the vanishing gradient problem during training. In order to properly address these challenges, we propose a hierarchical attentive graph neural network embedding model, HAG, for code clone detection. First, the attention mechanism is applied to nodes in the AST to distinguish the importance of different nodes during model training. In addition, HAG adopts a graph convolutional network (GCN) to propagate the code message on the AST graph and then exploits a hierarchical differential pooling GCN to sufficiently capture the code semantics at different structural levels. To evaluate the effectiveness of HAG, we conducted extensive experiments on a public clone dataset and compared it with seven state-of-the-art clone detection models. The experimental results demonstrate that HAG achieves superior detection performance compared with the baseline models. In particular, in the detection of moderately Type-3 or Type-4 clones, HAG outperforms the baselines, indicating the strong detection capability of HAG for semantic clones. Apart from that, the impacts of the hierarchical pooling, attention mechanism and critical model parameters are systematically discussed.
