
Journal of Software (Ruan Jian Xue Bao) ISSN 1000-9825, CODEN RUXUEW E-mail: jos@iscas.ac.cn
Journal of Software,2019,30(5):1342−1358 [doi: 10.13328/j.cnki.jos.005722] http://www.jos.org.cn
© Institute of Software, Chinese Academy of Sciences. All rights reserved. Tel: +86-10-62562563
API Misuse Bug Detection Based on Deep Learning ∗
WANG Xin¹,², CHEN Chi¹,², ZHAO Yi-Fan¹,², PENG Xin¹,², ZHAO Wen-Yun¹,²
¹ (Software School, Fudan University, Shanghai 201203, China)
² (Shanghai Key Laboratory of Data Science (Fudan University), Shanghai 201203, China)
Corresponding author: PENG Xin, E-mail: pengxin@fudan.edu.cn
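Abstract: Developers often need to use various application programming interfaces (APIs) to reuse existing software frameworks, class libraries, and the like. Because of the complexity of the APIs themselves, missing documentation, and other reasons, developers often misuse APIs, which leads to code defects. Automatically detecting API misuse defects requires obtaining API usage specifications and checking API usage code against them. However, API specifications suitable for automatic detection are hard to obtain, and writing and maintaining them by hand is costly. To address these problems, the recurrent neural network model from deep learning is applied to learning API usage specifications and detecting API misuse defects. Based on a large body of open-source Java code, training samples of API usage specifications are constructed through static analysis, and these samples are used to build a recurrent neural network that learns the specifications. On this basis, context-based statement prediction is performed on API usage code, and potential API misuse defects are found by comparing the predictions with the actual code. The proposed method is implemented and evaluated experimentally on Java cryptography-related APIs and their usage code. The results show that the method can, to a certain extent, discover API misuse defects automatically.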
Key words: API misuse; usage specification; defect detection; deep learning
Chinese Library Classification (CLC) number: TP311
Citation format (Chinese): Wang X, Chen C, Zhao YF, Peng X, Zhao WY. API misuse bug detection based on deep learning. Journal of Software, 2019,30(5):1342−1358. http://www.jos.org.cn/1000-9825/5722.htm
Citation format (English): Wang X, Chen C, Zhao YF, Peng X, Zhao WY. API misuse bug detection based on deep learning. Ruan Jian Xue Bao/Journal of Software, 2019,30(5):1342−1358 (in Chinese). http://www.jos.org.cn/1000-9825/5722.htm
API Misuse Bug Detection Based on Deep Learning
WANG Xin¹,², CHEN Chi¹,², ZHAO Yi-Fan¹,², PENG Xin¹,², ZHAO Wen-Yun¹,²
¹ (Software School, Fudan University, Shanghai 201203, China)
² (Shanghai Key Laboratory of Data Science (Fudan University), Shanghai 201203, China)
Abstract: Developers often need to use various application programming interfaces (APIs) to reuse existing software frameworks, class libraries, and so on. Because of the complexity of the APIs themselves or the lack of documentation, developers often misuse APIs, which can lead to code defects. To detect API misuse defects automatically, API usage specifications are required, and API usage code must be checked against them. However, API specifications that can be used for automatic detection are difficult to obtain, and the cost of writing and maintaining them manually is high. To address this issue, this study applies the recurrent neural network model of deep learning to the tasks of learning API usage specifications and detecting API misuse defects. Based on a large amount of open-source Java code, training samples of API usage specifications are extracted by static analysis and then used to build a recurrent neural network that learns the specifications. On this basis, the study makes context-based statement predictions on API usage code and finds potential API misuse defects by comparing the predictions with the actual code. The method is implemented and evaluated experimentally on Java cryptography-related APIs and the code that uses them. The results show that the proposed approach is able, to a certain extent, to detect API misuse defects automatically.
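The predict-and-compare idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the recurrent neural network is replaced here by a simple bigram frequency model so that the example stays self-contained, and the class name `ApiMisuseSketch`, the toy call sequences, and the threshold value are all assumptions made for illustration. Each training sequence stands for the API calls that static analysis might extract from one method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in: a bigram model replaces the paper's RNN.
class ApiMisuseSketch {
    // counts.get(prev).get(next) = how often `next` followed `prev` in training data
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    // Learn call-to-call transition counts from API call sequences.
    void train(List<List<String>> sequences) {
        for (List<String> seq : sequences) {
            String prev = "<START>";
            for (String call : seq) {
                counts.computeIfAbsent(prev, k -> new HashMap<>()).merge(call, 1, Integer::sum);
                prev = call;
            }
            counts.computeIfAbsent(prev, k -> new HashMap<>()).merge("<END>", 1, Integer::sum);
        }
    }

    // Estimated probability that `next` follows `prev` (0.0 for unseen contexts).
    double probability(String prev, String next) {
        Map<String, Integer> following = counts.get(prev);
        if (following == null) {
            return 0.0;
        }
        int total = 0;
        for (int c : following.values()) {
            total += c;
        }
        return following.getOrDefault(next, 0) / (double) total;
    }

    // Compare each actual call with the model's expectation; report unlikely steps.
    List<String> detect(List<String> sequence, double threshold) {
        List<String> warnings = new ArrayList<>();
        List<String> padded = new ArrayList<>(sequence);
        padded.add("<END>");
        String prev = "<START>";
        for (String actual : padded) {
            if (probability(prev, actual) < threshold) {
                warnings.add(prev + " -> " + actual);
            }
            prev = actual;
        }
        return warnings;
    }

    // Hypothetical toy corpus of javax.crypto.Cipher usage (not data from the paper).
    static ApiMisuseSketch trainedOnDemoCorpus() {
        List<String> common = Arrays.asList("Cipher.getInstance", "Cipher.init", "Cipher.doFinal");
        List<String> withUpdate = Arrays.asList(
                "Cipher.getInstance", "Cipher.init", "Cipher.update", "Cipher.doFinal");
        ApiMisuseSketch model = new ApiMisuseSketch();
        model.train(Arrays.asList(common, common, common, withUpdate));
        return model;
    }

    public static void main(String[] args) {
        ApiMisuseSketch model = trainedOnDemoCorpus();
        // Missing Cipher.init: the model never saw doFinal directly after getInstance.
        System.out.println(model.detect(Arrays.asList("Cipher.getInstance", "Cipher.doFinal"), 0.1));
        // A usage pattern that matches the training data yields no warnings.
        System.out.println(model.detect(
                Arrays.asList("Cipher.getInstance", "Cipher.init", "Cipher.doFinal"), 0.1));
    }
}
```

In this sketch, the sequence that omits `Cipher.init` is flagged at the step `Cipher.getInstance -> Cipher.doFinal`, while the sequence that matches the training data produces an empty warning list. A recurrent network would condition on the whole preceding context rather than only the previous call, which a bigram model cannot capture.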
∗ Foundation item: National Key Research and Development Program of China (2016YFB1000801)
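This paper was recommended by Prof. SHEN Fu-Rao and Assoc. Prof. LI Ge, guest editors of the special issue on new techniques for intelligent software.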
Received 2018-08-31; Revised 2018-10-31, 2018-12-14; Accepted 2019-02-03