Revolutionize Design Verification with AI

What you will learn:
- The benefits of using artificial intelligence in design verification.
- How Shapley values can help engineers optimize debugging.
- How to achieve low-latency SoC configurations using optimization techniques.
- How to use a cluster-based recommendation engine for automated testing.
Design verification is a time-consuming, human-resource-intensive process, and the ever-increasing complexity of designs leads to further delays. In the hardware industry, the ratio of design engineers to verification engineers is typically 1:4, reflecting the scale of the challenge.
Applying artificial-intelligence and machine-learning models at scale could revitalize the hardware design and verification industry. This article introduces applications of AI in VLSI that lead to significant cost and time savings and help scale design and verification processes.
SHAP for optimizing debugging
SHAP (SHapley Additive exPlanations) is a game-theory-inspired method used to explain the contribution of each feature to an ML model's outcome. It provides two types of explanations: local and global. Local explanations account for each individual prediction, while global explanations describe the model's behavior as a whole.
The hardware design and verification process is still largely manual, and significant parts of it are often redundant. This is where artificial intelligence finds essential use cases. In this section, we use a machine learning model trained on historical data to predict errors, and then use SHAP explanations to optimize the debugging process. These explainable-AI concepts show verification engineers which factors lead to failures and errors.
The simulated local explanation in Figure 1 shows the magnitude and direction of each input feature's impact on the model outcome. Attributes such as design_category, number_of_power_domain, and gpu_voltage have a large negative impact on the error prediction, while transistor_size has the largest positive impact. SHAP makes clear how the combined impacts of all features for this specific design lead to a prediction of zero failures.
In this application, local explanations clarify which individual factors, at which values, caused errors, helping engineers perform root-cause analysis in real time. Global explanations are generated by aggregating local explanations across all training data points; the output is shown in Figure 2. The simulated global explanations indicate that memory_interface, cache_size, and number_of_cores typically have the largest negative impact on errors, while factors such as design_category and clock_ghz have the largest positive impact, pushing predictions toward errors.
Engineers can use these explanations to optimize debugging by better understanding the factors that contribute to errors in a particular design. Once the machine learning model and its SHAP explanations are deployed, verification engineers receive suggestions and warnings about potential errors even before starting the debugging process. This makes debugging more efficient and improves the overall productivity and utilization of the human resources planned for a project.
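The sketch below shows what such a pipeline could look like, assuming recent versions of the shap and scikit-learn packages. The feature names mirror those in Figures 1 and 2, and the synthetic data merely stands in for real verification logs:

```python
# Sketch: predict design errors and explain the predictions with SHAP.
# Feature names and synthetic data are illustrative stand-ins for
# real verification logs.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
n = 500
X = pd.DataFrame({
    "design_category": rng.integers(0, 5, n),
    "number_of_power_domain": rng.integers(1, 8, n),
    "gpu_voltage": rng.uniform(0.7, 1.2, n),
    "transistor_size": rng.uniform(3.0, 28.0, n),
})
# Synthetic label: 1 = error observed, 0 = clean run.
y = (X["transistor_size"] + rng.normal(0.0, 3.0, n) > 15.0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer(X)  # shape: (samples, features, classes)

# Local explanation for a single design, error class (cf. Figure 1).
shap.plots.waterfall(shap_values[0, :, 1])

# Global explanation aggregated over all designs (cf. Figure 2).
shap.plots.beeswarm(shap_values[:, :, 1])
```

The waterfall plot corresponds to the per-design view of Figure 1, while the beeswarm plot aggregates all designs, as in Figure 2.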
MIP optimization for low-latency SoC configurations
All hardware engineers strive to design a system-on-chip (SoC) that operates with extremely low latency. Parameters in the register-transfer-level (RTL) description (e.g., buffer size, cache size, cache-placement policy, and cache-replacement policy) are adjusted and simulations are run to measure latency under different test scenarios. This time-consuming process takes many iterations to find optimal parameter configurations, and engineers must wait a long time to arrive at them.
Mixed-integer programming (MIP) is an optimization technique that can be applied to solve large, complex problems. It minimizes or maximizes an objective within defined constraints.
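As a generic sketch (the standard textbook form, not tied to any particular solver), a MIP over a decision vector $x$ with cost coefficients $c$ can be written as:

$$
\begin{aligned}
\min_{x}\quad & c^{\top}x \\
\text{subject to}\quad & Ax \le b, \\
& x_j \in \mathbb{Z} \quad \text{for } j \in I, \\
& x \ge 0,
\end{aligned}
$$

where $Ax \le b$ encodes the design constraints and $I$ is the index set of variables restricted to integer values.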
In SoC design, MIP can be used to find optimal parameters, with constraints set around parameters such as buffer size, cache size, cache-placement policy, and cache-replacement policy. The solver returns the set of parameter values that minimizes latency. Different scenarios can also be encoded in the MIP to identify a separate parameter set that achieves lower latency for each scenario.
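A minimal sketch of such a model follows, using the open-source PuLP package. The linear latency model, its coefficients, and the parameter bounds are hypothetical assumptions; in practice they would be fit, for example by regression on past simulation results:

```python
# Sketch: choose SoC parameters minimizing a hypothetical linear
# latency model. All coefficients and bounds are illustrative.
import pulp

prob = pulp.LpProblem("soc_latency", pulp.LpMinimize)

# Integer decision variables (sizes in KB; ranges are assumptions).
buffer_kb = pulp.LpVariable("buffer_kb", lowBound=16, upBound=512, cat="Integer")
cache_kb = pulp.LpVariable("cache_kb", lowBound=128, upBound=4096, cat="Integer")
# Binary choice between two cache-placement policies.
direct_mapped = pulp.LpVariable("direct_mapped", cat="Binary")

# Hypothetical latency model: larger caches and buffers reduce latency,
# while a direct-mapped placement adds a fixed penalty.
prob += 100 - 0.01 * cache_kb - 0.05 * buffer_kb + 5 * direct_mapped

# Constraint, e.g., a total on-chip SRAM budget shared by cache and buffer.
prob += buffer_kb + cache_kb <= 4096, "sram_budget"

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[prob.status])
for v in (buffer_kb, cache_kb, direct_mapped):
    print(v.name, "=", v.varValue)
print("predicted latency =", pulp.value(prob.objective))
```

Categorical choices such as placement and replacement policies map naturally onto binary variables, and one such model can be solved per test scenario.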
Effective parameter configurations can then be selected from the MIP's recommendations, saving the time that would otherwise be spent running many unnecessary simulations. A test simulation at the SoC level typically takes a few days to run, while solving the MIP is a matter of minutes to a few hours.
Recommendation engine for automated testing
Test development is another time-consuming process for verification engineers. Verifying a design requires numerous tests, and engineers often write them from scratch when developing a new design. Devising all of the tests needed to verify the entire design also demands substantial thought and time. Furthermore, when a design change is made, verification engineers run a whole battery of tests to validate the change, which is highly inefficient.
A recommendation system filters data to recommend the most relevant action, based on patterns it detects in historical training data. There are many ways to build a recommender; here we'll use clustering.
In design verification, years of designs and the tests performed on them are collected and grouped by how similar they are to one another. When a new design is created, the system recommends, based on its similarity to historical designs and verification patterns, the specific tests that are likely to be needed to verify it. The system will not be effective at first, but through multiple iterations of human-in-the-loop (HITL) feedback, the recommendation engine learns over time (Fig. 3) and becomes able to recommend highly relevant tests, as shown in Figure 4.
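A minimal sketch of the clustering step is shown below, assuming scikit-learn. The feature encoding, cluster count, and design-to-test mapping are illustrative assumptions: historical designs are clustered with k-means, and the tests most frequently run within the new design's cluster are recommended:

```python
# Sketch: cluster historical designs, then recommend the tests most
# frequently run on designs in the new design's cluster.
# Features, cluster count, and test names are illustrative.
from collections import Counter
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Historical designs: numeric feature vectors (e.g., cache_size,
# number_of_cores, clock_ghz) and the tests run on each design.
design_features = rng.uniform(size=(200, 3))
tests_per_design = [
    [f"test_{t}" for t in rng.choice(30, size=5, replace=False)]
    for _ in range(200)
]

scaler = StandardScaler().fit(design_features)
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0)
labels = kmeans.fit_predict(scaler.transform(design_features))

def recommend_tests(new_design, top_k=5):
    """Recommend the top_k most common tests in the nearest cluster."""
    cluster = kmeans.predict(scaler.transform([new_design]))[0]
    counts = Counter(
        t
        for tests, label in zip(tests_per_design, labels)
        if label == cluster
        for t in tests
    )
    return [test for test, _ in counts.most_common(top_k)]

print(recommend_tests(rng.uniform(size=3)))
```

HITL feedback could then be folded in by re-weighting or re-clustering as engineers accept or reject the recommended tests.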
The recommendation system is a disruptive machine-learning application that significantly reduces redundant effort. Verification engineers receive recommendations for tests with the same intent as tests written historically, and choosing tests that effectively verify a design, or a change to it, adds sophistication to the verification process.
Conclusion
This article barely scratches the surface of the applications and benefits of AI/ML in VLSI; a huge untapped potential to revolutionize the field is still waiting to be discovered. ML can greatly reduce the major roadblocks that limit scalability and remain painfully manual and time-consuming. In the future, most of these tasks could be automated through ML, although more research is needed, and industry leaders must weigh the risks and costs associated with such a major change.
References
- SHAP documentation: https://shap.readthedocs.io/en/latest/index.html
- Imperva, "Clustering and Dimensionality Reduction: Understanding the Magic Behind Machine Learning": https://www.imperva.com/blog/clustering-and-dimensionality-reduction-understanding-the-magic-behind-machine-learning/
- Gurobi, "MIP Basics": https://www.gurobi.com/resource/mip-basics/