Publications

You can also find my articles on my Google Scholar profile.

Journal Articles

Do pre-trained language models indeed understand software engineering tasks?

Published in IEEE Transactions on Software Engineering, 2023

Artificial intelligence (AI) for software engineering (SE) tasks has recently achieved promising performance. In this article, we investigate to what extent pre-trained language models truly understand SE tasks such as code search and code summarization. We conduct a comprehensive empirical study on a broad set of AI for SE (AI4SE) tasks by feeding the models variant inputs: 1) inputs with various masking rates and 2) inputs produced by the sufficient input subset method. Then, the trained models are evaluated on different SE tasks, including code search, code summarization, and duplicate bug report detection. Our experimental results show that pre-trained language models are insensitive to the given input, and thus achieve similar performance on these three SE tasks. We refer to this phenomenon as overinterpretation, where a model confidently makes a decision without salient features, or where a model finds some irrelevant relationships between the final decision and the dataset. Our study investigates two approaches to mitigate the overinterpretation phenomenon: the whole word mask strategy and ensembling. To the best of our knowledge, we are the first to reveal this overinterpretation phenomenon to the AI4SE community, which is an important reminder for researchers to design the input for the models carefully and calls for necessary future work in understanding and implementing AI4SE tasks.
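The masking-rate perturbation mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `mask_tokens`, the `[MASK]` placeholder, and the whitespace tokenization are assumptions made for illustration only.

```python
import random


def mask_tokens(tokens, rate, mask_token="[MASK]", seed=0):
    """Replace roughly a fraction `rate` of tokens with `mask_token`.

    Hypothetical helper: the paper's exact masking procedure is not
    described on this page, so positions are simply chosen uniformly
    at random.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be in [0, 1]")
    rng = random.Random(seed)
    n_mask = round(len(tokens) * rate)
    masked_positions = set(rng.sample(range(len(tokens)), n_mask))
    return [
        mask_token if i in masked_positions else tok
        for i, tok in enumerate(tokens)
    ]


# Example: mask half of a whitespace-tokenized code snippet.
code_tokens = "def add ( a , b ) : return a + b".split()
half_masked = mask_tokens(code_tokens, rate=0.5)
```

Feeding a model inputs masked at increasing rates and comparing its task performance is one way to probe how much it actually depends on the input.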

Recommended citation: Yao Li, Tao Zhang, Xiapu Luo, Haipeng Cai, Sen Fang, and Dawei Yuan. 2023. Do Pretrained Language Models Indeed Understand Software Engineering Tasks? IEEE Trans. Softw. Eng. 49, 10 (Oct. 2023), 4639–4655.
Download Paper

Meta-Learning for Multi-Family Android Malware Classification

Published in ACM Transactions on Software Engineering and Methodology, 2024

With the emergence of smartphones, Android has become a widely used mobile operating system. However, it is vulnerable to various types of attacks. Every day, new malware threatens the security of users’ devices and private data. Many methods have been proposed to classify malicious applications, utilizing static or dynamic analysis for classification. However, previous methods still suffer from unsatisfactory performance due to two challenges. First, they are unable to address the imbalanced data distribution problem, leading to poor performance for malware families with few members. Second, they are unable to address the classification of zero-day malware, i.e., malicious applications that exploit unknown vulnerabilities. In this article, we introduce an innovative meta-learning approach for multi-family Android malware classification named Meta-MAMC, which uses meta-learning technology to learn meta-knowledge (i.e., the similarities and differences among different malware families) of few-family samples and combines new sampling algorithms to solve the above challenges. Meta-MAMC integrates (i) the meta-knowledge contained within the dataset to guide models in learning to identify unknown malware; and (ii) more accurate and diverse tasks based on novel sampling strategies, as well as directly adapting meta-learning to a new few-sample and zero-sample task to classify families. We have evaluated Meta-MAMC on two popular datasets and a corpus of real-world Android applications. The results demonstrate its efficacy in accurately classifying malicious applications belonging to certain malware families, even achieving 100% classification accuracy in some families.
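Meta-learning approaches are commonly trained on small "episodes" sampled from the dataset. The sketch below shows generic N-way K-shot episode sampling; it is not Meta-MAMC's sampling algorithm, which this page does not specify. The function name `sample_episode` and the `(item, family)` data layout are assumptions.

```python
import random
from collections import defaultdict


def sample_episode(samples, n_way, k_shot, q_query, seed=None):
    """Sample one N-way K-shot episode from (item, family) pairs.

    Generic illustration only: families with fewer than
    k_shot + q_query samples are skipped.
    """
    rng = random.Random(seed)
    by_family = defaultdict(list)
    for item, family in samples:
        by_family[family].append(item)

    eligible = [f for f, items in by_family.items()
                if len(items) >= k_shot + q_query]
    if len(eligible) < n_way:
        raise ValueError("not enough families with sufficient samples")

    support, query = [], []
    for family in rng.sample(sorted(eligible), n_way):
        picked = rng.sample(by_family[family], k_shot + q_query)
        support += [(x, family) for x in picked[:k_shot]]
        query += [(x, family) for x in picked[k_shot:]]
    return support, query
```

Each episode yields a support set for adaptation and a disjoint query set for evaluation, which is how few-sample families can still contribute to training.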

Recommended citation: Yao Li, Dawei Yuan, Tao Zhang, Haipeng Cai, David Lo, Cuiyun Gao, Xiapu Luo, and He Jiang. 2024. Meta-Learning for Multi-Family Android Malware Classification. ACM Trans. Softw. Eng. Methodol. 33, 7, Article 174 (September 2024), 27 pages.
Download Paper

JOWMDroid: Android malware detection based on feature weighting with joint optimization of weight-mapping and classifier parameters

Published in Computers & Security, 2021

Android malware detection is an important problem that must be urgently studied and solved. Machine learning-based methods first extract features from applications and then build a classifier using machine learning algorithms to distinguish malicious and benign applications. In most of the existing work, the difference in feature importance has been ignored, or the calculation of feature weights is irrelevant to the classification model. To address these issues, this paper proposes a novel Android malware detection scheme based on feature weighting with the joint optimization of weight-mapping and classifier parameters, called JOWMDroid. First, features of eight categories are extracted from the Android application package and then a certain number of the most important features are selected using information gain for malware detection. Next, an initial weight is calculated for each selected feature via three machine learning models and then five weight-mapping functions are designed to map the initial weights to the final weights. Finally, the parameters of the weight-mapping function and classifier are jointly optimized by the differential evolution algorithm. The experimental results reveal that the proposed method outperforms four state-of-the-art feature weighting methods and makes the weight-aware classifiers more competitive.
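The information-gain feature selection step mentioned in the abstract can be sketched as follows. This is a minimal illustration and not the JOWMDroid code: the function names, the dictionary-of-columns layout, and the example feature names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """H(labels) minus the expected entropy after splitting on the feature."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        conditional += len(subset) / total * entropy(subset)
    return entropy(labels) - conditional


def select_top_k(features, labels, k):
    """Return the names of the k features with the highest information gain.

    `features` maps a feature name to its per-application values.
    """
    scores = {name: information_gain(col, labels)
              for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: 1 = malicious, 0 = benign.
labels = [1, 1, 0, 0]
features = {
    "uses_sms_permission": [1, 1, 0, 0],  # perfectly separates the classes
    "has_launcher_icon": [1, 0, 1, 0],    # carries no information here
}
top = select_top_k(features, labels, k=1)
```

The selected features would then be passed to the weighting and classification stages that the abstract describes.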

Recommended citation: Cai, Lingru & Li, Yao & Xiong, Zhi. (2021). JOWMDroid: Android malware detection based on feature weighting with joint optimization of weight-mapping and classifier parameters. Computers & Security.
Download Paper

Conference Papers

StructureTester: Automatic Machine Translation Testing Based on Variation Feature Vector

Published in 2023 IEEE 23rd International Conference on Software Quality, Reliability, and Security (QRS), 2023

In recent years, the performance of machine translation systems has made remarkable progress, primarily due to the rapid advancements in neural network language models. These state-of-the-art models enable the swift translation of vast amounts of text, leading to considerable time and cost savings. In pursuit of enhancing machine translation accuracy, researchers have devoted attention to developing automated translation testing tools. A prominent approach in this context involves comparing the translation results of “similar” source sentences, anticipating the correctness of translation by similarities in sentence structure. However, despite the potential of this approach, the current studies still face certain challenges. Notably, false negatives and false positives persist as issues. Moreover, achieving high detection accuracy for all types of translation errors remains an ongoing challenge. To address these challenges, we propose the StructureTester, a novel approach that not only leverages the differences between the structure trees of two sentences but also employs changes in sentence purpose as crucial judgmental features. Our proposed method yields significant improvements, elevating the overall detection accuracy to an impressive 98.17%. Furthermore, StructureTester effectively identifies various types of translation errors.

Recommended citation: W. Luo, Y. Luo, Y. Li and T. Zhang, "StructureTester: Automatic Machine Translation Testing Based on Variation Feature Vector," 2023 IEEE 23rd International Conference on Software Quality, Reliability, and Security (QRS), Chiang Mai, Thailand, 2023, pp. 301-312.
Download Paper