Push the Boundaries of AI – Select References on Deep Learning and Machine Learning

Deep learning and machine learning have transformed the tech industry, with AI models like ChatGPT leading the way. At Hengtian, our alliance with Zhejiang University, a top-three university in China with strengths in IT, engineering, and other fields, has kept us at the forefront of this innovation.

The university's research in deep learning and machine learning has led to groundbreaking approaches for recognizing user actions, automating software maintenance, and improving code search and retrieval, among other tasks. Here are some select papers for your reference.

Select References on Deep Learning and Machine Learning

# Dehai Zhao, Zhenchang Xing, Xin Xia, Deheng Ye, Xiwei Xu, Liming Zhu. SeeHow: Workflow Extraction from Programming Screencasts through Action-Aware Video Analytics. 45th ACM/IEEE International Conference on Software Engineering (ICSE 2023). The paper proposes a deep learning model that recognizes human-understandable, structured user actions from screencasts, which is useful for UI testing, bug reproduction, and robotic process automation. Extensive experiments on a large dataset of video-action pairs from various applications indicate that the model is effective and generalizable.

# Sen Fang, Tao Zhang, Youshuai Tan, He Jiang, Xin Xia, Xiaobing Sun. Represent Them All: A Universal Learning Representation of Bug Reports. 45th ACM/IEEE International Conference on Software Engineering (ICSE 2023). Existing deep learning techniques for automated software maintenance each need a customized bug report representation, which leads to complexity, cost, and compatibility issues. The paper proposes RepresentThemAll, a pre-trained approach that uses a universal bug report framework with carefully designed learning objectives, and demonstrates superior performance on four downstream tasks.

# Chen Zeng, Yue Yu, Shanshan Li, Xin Xia, Zhiming Wang, Mingyang Geng, Linxiao Bai, Wei Dong, and Xiangke Liao. deGraphCS: Embedding Variable-based Flow Graph for Neural Code Search. In ACM Transactions on Software Engineering and Methodology (TOSEM). Accepted. The paper proposes deGraphCS, a code search approach built on a learnable deep graph model that transforms source code into variable-based flow graphs, which model code semantics more precisely than existing methods. It achieves state-of-the-art performance in accurately retrieving code snippets from a large-scale dataset of C language code.

# Fang Liu, Ge Li, Bolin Wei, Xin Xia, Zhiyi Fu, Zhi Jin. A Unified Multi-task Learning Model for AST-level and Token-level Code Completion. Empirical Software Engineering (EMSE). Accepted. The paper improves code completion in Integrated Development Environments (IDEs) with a unified multi-task learning approach that considers both the type and the value of tokens and uses a self-attentional architecture to model long-term dependencies. The approach performs better than existing methods.

# Shuzheng Gao, Cuiyun Gao, Yulan He, Jichuan Zeng, Lun Yiu Nie, Xin Xia, Michael R. Lyu. Code Structure Guided Transformer for Source Code Summarization. In ACM Transactions on Software Engineering and Methodology (TOSEM). The paper proposes SG-Trans, an approach that integrates code structure information into a Transformer-based deep learning model to generate accurate code summaries. It achieves superior performance compared to state-of-the-art approaches on two benchmark datasets.

# Fangcheng Qiu, Zhipeng Gao, Xin Xia, David Lo, John Grundy, Xinyu Wang. Deep Just-In-Time Defect Localization. IEEE Transactions on Software Engineering (TSE). DEEPDL is a just-in-time (JIT) defect localization approach that uses a deep learning-based neural language model to locate defective code lines within a defect-introducing change. It assigns a suspiciousness score to each code line and sorts the lines in descending order of score, so that developers can check likely buggy lines efficiently at an early stage and reduce the risk of introducing bugs. In experiments on 14 open source Java projects, it outperforms state-of-the-art techniques by a substantial margin.
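
The ranking idea in the last entry, scoring each changed line and sorting by descending score, can be illustrated with a toy sketch. This is not the paper's model: it swaps in a simple unigram language model with add-one smoothing in place of the neural language model, and the function names (`train_unigram`, `suspiciousness`, `rank_lines`) and the sample lines are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def train_unigram(clean_lines):
    """Count token frequencies over lines assumed to be defect-free."""
    counts = Counter(tok for line in clean_lines for tok in line.split())
    return counts, sum(counts.values())


def suspiciousness(line, counts, total):
    """Average negative log-probability per token, with add-one smoothing.

    Lines made of tokens the model has rarely seen get a higher score.
    """
    tokens = line.split()
    if not tokens:
        return 0.0
    vocab = len(counts) + 1  # one extra slot for unseen tokens
    nll = -sum(math.log((counts[t] + 1) / (total + vocab)) for t in tokens)
    return nll / len(tokens)


def rank_lines(changed_lines, counts, total):
    """Return (score, line) pairs sorted from most to least suspicious."""
    scored = [(suspiciousness(line, counts, total), line) for line in changed_lines]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical data: pre-tokenized "clean" code and the lines of a new change.
clean = ["int i = 0 ;", "i = i + 1 ;", "return i ;"]
change = ["i = i + 1 ;", "foo . bar ( qux ) ;"]

counts, total = train_unigram(clean)
ranking = rank_lines(change, counts, total)
for score, line in ranking:
    print(f"{score:.3f}  {line}")
```

In this sketch the line built from unfamiliar tokens is ranked first, which mirrors the intuition that code a language model finds surprising deserves earlier review. A real system would need a far stronger model and a proper tokenizer.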