
hejy47/HDLParser


1. HDLParser

HDLParser is a tool for collecting patch-related commits and extracting real bug fixes in hardware description languages (HDLs). It automatically collects bug-fixing commits from HDL repositories and parses the code changes of patches using hdlConvertor and GumTree. It can also measure the redundancy of bug-fixing commits.

1.1. Introduction

An important problem is the lack of knowledge about the characteristics of bug fixes in HDLs. Such knowledge would improve hardware developers' understanding and provide useful insights for new research directions in automated bug fixing for HDLs.

However, few studies focus on bug fixes in HDLs, which hinders the proposal of automated program repair (APR) techniques targeting HDLs. There are two main barriers. On the one hand, there is a lack of research on the characteristics of bug fixes in HDLs. On the other hand, whether the redundancy assumption still holds in HDLs has not yet been validated.

With this motivation, we propose an automated technique named HDLParser for analyzing bug fixes in HDLs. We run HDLParser to perform a fine-grained analysis of patches and to validate the redundancy assumption on bug-fixing commits, and we obtain some interesting findings. All the relevant artifacts are available in this repository.

1.2. Environment setup

1.2.1. Requirements

1.2.2. Configuration

1.2.2.1. HDL AST parsing script

The parsing script uses hdlConvertor to obtain the AST of HDL files, and then transforms the AST into the XML format that GumTree takes as input. The steps to use the parsing script are as follows:

  • Add hdlparser/hdlparser to the system path.
  • hdlparser can then be used as a standalone tool like this: hdlparser /path/to/HDLfile
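
The AST-to-XML step described above can be sketched as follows. This is a minimal illustration rather than the actual hdlparser script: the nested-dict AST, the node type names (`Assign`, `Id`), and the XML attribute names (`type`, `label`, `pos`, `length`) are all assumptions made for this example, and the real script works on the AST objects produced by hdlConvertor.

```python
import xml.etree.ElementTree as ET


def ast_to_xml(node):
    """Recursively convert a nested-dict AST node into an XML element.

    The attribute names below are assumed for illustration only.
    """
    elem = ET.Element("tree", {
        "type": node["type"],
        "label": node.get("label", ""),
        "pos": str(node["pos"]),
        "length": str(node["length"]),
    })
    for child in node.get("children", []):
        elem.append(ast_to_xml(child))
    return elem


# Hypothetical toy AST for the Verilog statement: assign y = a;
toy_ast = {
    "type": "Assign", "pos": 0, "length": 13,
    "children": [
        {"type": "Id", "label": "y", "pos": 7, "length": 1},
        {"type": "Id", "label": "a", "pos": 11, "length": 1},
    ],
}

xml_text = ET.tostring(ast_to_xml(toy_ast), encoding="unicode")
print(xml_text)
```

Recording a position and length for each node is what would let a tree-differencing tool map a changed node back to a span of the source file.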

1.2.2.2. Configuring GumTree to support HDLs

Support for HDLs can be configured by following the way GumTree supports Python. The configured files are placed in gumtree-3.0.0-SNAPSHOT.

1.2.3. Execution

1.2.3.1. Collecting HDL projects from GitHub to create the subjects

Repositories in HDLs

Commands

  • ./collect_subjects.sh
  • After running it, ten repositories each for Verilog, VHDL and SystemVerilog are cloned into subjects, subjects2, subject3 respectively.

1.2.3.2. Collecting patch-related commits, parsing code changes of patches and measuring commit redundancy

  • ./run.sh
  • If it executes successfully, it performs the following steps:
    • The first step collects project LOC statistics, i.e. the number of lines of code in each project.
    • The second step collects bug-fix-related commits from the project repositories by matching bug-related keywords. It also filters out changes to test code. Its output consists of three kinds of files. The results for Verilog, VHDL and SystemVerilog are stored in data, data2, data3 respectively.
      • The buggy version of an HDL code file containing a bug, stored in the directory "<HDLdata>/PatchCommits/Keyword/<ProjectName>/prevFiles/".
      • The fixed version of the HDL code file, stored in the directory "<HDLdata>/PatchCommits/Keyword/<ProjectName>/revFiles/".
      • The diff hunk of the code changes that fix the bug, stored in the directory "<HDLdata>/PatchCommits/Keyword/<ProjectName>/DiffEntries/".
    • The third step further filters out HDL code files whose changes are entirely non-HDL-code changes (e.g. comments).
    • The fourth step collects statistics on the diff hunk sizes of the code changes. The results are stored in the directory "<HDLdata>/DiffentrySizes/".
    • The fifth step parses the code changes of patches and collects statistics on the fine-grained code entities impacted by patches. The results are stored in the directory "<HDLdata>/ParseResults/". The fix patterns are also collected and stored in the directory "<HDLdata>/ParseResults/".
    • The sixth step measures the redundancy of the patch-related commits. The results are stored in the directory "<HDLdata>/ParseResults/".
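
To make the second step more concrete, here is a minimal sketch of keyword-based commit selection and test-code filtering. It is not the code HDLParser runs: the keyword list, the file-extension set, the "path contains test" heuristic, and all function names are assumptions introduced for this example. Only the output directory layout is taken from the description above.

```python
import re
from pathlib import PurePosixPath

# Hypothetical keyword list; the list used by the real tool may differ.
BUG_KEYWORDS = ("bug", "fix", "error", "fault", "defect")
KEYWORD_RE = re.compile(r"\b(" + "|".join(BUG_KEYWORDS) + r")\w*", re.IGNORECASE)

# Extensions commonly used for Verilog, SystemVerilog and VHDL sources (assumed).
HDL_SUFFIXES = {".v", ".sv", ".vhd", ".vhdl"}


def is_bug_fix_message(message):
    """True if the commit message contains a word starting with a bug keyword."""
    return KEYWORD_RE.search(message) is not None


def is_test_path(path):
    """Assumed heuristic: a path component containing 'test' marks test code."""
    return any("test" in part.lower() for part in PurePosixPath(path).parts)


def select_hdl_files(changed_paths):
    """Keep changed HDL source files that are not test code."""
    return [
        p for p in changed_paths
        if PurePosixPath(p).suffix.lower() in HDL_SUFFIXES and not is_test_path(p)
    ]


def output_dirs(hdl_data, project):
    """Build the three output directories described for the second step."""
    base = PurePosixPath(hdl_data) / "PatchCommits" / "Keyword" / project
    return {name: str(base / name) + "/"
            for name in ("prevFiles", "revFiles", "DiffEntries")}


print(is_bug_fix_message("Fixed reset polarity bug in uart_rx"))
print(select_hdl_files(["rtl/uart.v", "test/uart_tb.v", "README.md"]))
print(output_dirs("data", "my_project")["prevFiles"])
```

Anchoring the pattern at a word boundary keeps words such as "prefix" from matching "fix", at the cost of missing keywords that appear mid-word.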

1.3. Scenarios to use HDLParser

1.3.1. For hardware developers

If correctly executed, HDLParser can provide detailed information (e.g. occurrences of buggy code and the corresponding repair actions) that helps hardware developers deepen their understanding of real bug fixes. This knowledge can help developers repair programs more effectively.

1.3.2. For researchers who want to explore APR towards HDLs

Based on the repair actions parsed from collected patches, HDLParser can provide the most frequent fix patterns that facilitate the design of pattern-based APR towards HDLs.

HDLParser validates the redundancy assumption for bug-fixing commits in HDLs, which is a fundamental assumption of various APR techniques: the code needed for a fix already exists elsewhere in the codebase. When the assumption holds, it gives APR tools a smaller search space for donor code.

These two areas of knowledge can support the development of APR in HDLs.
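
As an illustration of what measuring redundancy can mean, the sketch below computes the fraction of lines added by a fix that already occur in the pre-fix codebase. The line-level granularity, the whitespace normalization, and the function names are assumptions made for this example. HDLParser's own measurement may use a different granularity, such as tokens or AST nodes.

```python
def normalize(line):
    """Collapse whitespace so formatting differences do not hide matches."""
    return " ".join(line.split())


def line_redundancy(added_lines, codebase_lines):
    """Fraction of non-empty added lines that already exist in the codebase."""
    pool = {normalize(l) for l in codebase_lines if normalize(l)}
    added = [normalize(l) for l in added_lines if normalize(l)]
    if not added:
        return 0.0
    found = sum(1 for l in added if l in pool)
    return found / len(added)


# Toy pre-fix codebase and the lines a hypothetical patch adds.
codebase = [
    "always @(posedge clk) begin",
    "  if (rst) count <= 0;",
    "  else count <= count + 1;",
    "end",
]
patch_added = [
    "if (rst)   count <= 0;",
    "else if (en) count <= count + 1;",
]

print(line_redundancy(patch_added, codebase))  # 0.5
```

A value near 1 would mean most fix ingredients can be borrowed from existing code, which is the situation in which donor-code search is most useful.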


We will continue to develop and maintain this project to make it a better tool for the community. All contributions are welcome.
