Wednesday 11 December 2013

Using a Relational Database for Functional Coverage Collection Part One

Motivation

Why would one want to use a relational database in silicon verification? My previous blog posts advocated the use of a database for recording the status of tests. This next series of posts advocates using an RDMS to store and collate the results of functional coverage instrumentation. Most of the reasons are the same or similar:
  • Asynchronous upload from multiple client simulations. The database will ensure data is stored quickly and coherently with minimal effort required from the client.
  • Asynchronous read during uploads. We can look (through the web interface or otherwise) at partial results on the fly while the test set is still running and uploading data. This also allows a periodic check that "turns off" coverage instrumentation of points that have already hit their target metrics, reducing execution time and keeping stored data to a minimum (if either is an issue).
  • Performance. An RDMS will outperform a home-brew solution. I have been continually surprised at the speed of MySQL queries, even with large coverage data sets. You will not be able to write anything faster yourself in an acceptable time frame (the units may be lifetimes).
  • Merging. Merging coverage is not the tree of file merges that a naive file-system-based approach would require. Don't worry about how the SQL is executed; let the tool optimize the query (caveat - you still need to use EXPLAIN and understand its output). We can merge coverage dynamically part way through a regression, and the web interface can give bucket coverage in real time too (i.e. which tests hit this combination of dimensions?).
  • Profiling and optimization of the test set - which tests hit which buckets? Which types of test should I run more of to increase coverage? And which tests need I not bother running? If we store the data simulation-wise we can mine out this sort of information to better optimize the test set and reduce run time.
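To make the merging and profiling points above concrete, here is a minimal sketch using SQLite via Python's standard library. The `hits` table and its columns are purely illustrative - the real schema is the subject of the next post - but the queries show why an RDMS makes "merge" and "who hit this bucket?" one-liners rather than file-tree operations.

```python
import sqlite3

# Hypothetical, deliberately simplified schema for illustration only:
# one row per (simulation, coverage bucket) pair with a hit count.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE hits (test TEXT, bucket TEXT, count INTEGER)")
db.executemany("INSERT INTO hits VALUES (?, ?, ?)", [
    ("test_a", "fifo_full",  3),
    ("test_b", "fifo_full",  1),
    ("test_b", "fifo_empty", 7),
])

# Merging coverage across simulations is a single aggregate query,
# not a tree of file merges - and it can run mid-regression.
merged = dict(db.execute(
    "SELECT bucket, SUM(count) FROM hits GROUP BY bucket"))
print(merged["fifo_full"])  # 4

# Profiling: which tests hit this bucket?
who = [row[0] for row in db.execute(
    "SELECT test FROM hits WHERE bucket = ? ORDER BY test",
    ("fifo_full",))]
print(who)  # ['test_a', 'test_b']
```

Because the data is stored simulation-wise, the same table also answers "which tests can I drop?" by looking for tests whose buckets are all covered elsewhere.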
I am personally also interested in using highly scalable simulators such as Verilator when running with functional coverage. Verilator is highly scalable because I am not limited in the number of simulations I can run in parallel: I can run as many as I have CPUs. This is not true of a paid-for simulator, unless I am very canny or rich and can negotiate a very large number of licenses. But I also want this infrastructure to run on a paid-for simulator, to demonstrate that an event-driven, four-state simulator does exactly the same thing as Verilator.
So although paid-for simulators may ship with similar utilities, I cannot use these with Verilator, nor can I add custom reports or export the coverage data for further analysis.
I may also still want to use Verilator simply because it is faster than an event-driven simulator; it is always an advantage if I can simulate more quickly, be it a single test or a whole regression set. Additionally, cross-simulator infrastructure allows easy porting of code from one simulator to another, reducing vendor lock-in.

Why a RDMS?

The NoSQL movement has produced a plethora of non-relational databases that offer the same advantages listed above. They often claim superior performance to, say, MySQL. Whilst this may be true for "big data" and "web scale" applications, I think an RDMS should still perform suitably at silicon-project scale. I hope to look at some NoSQL databases in the near future and evaluate their performance against an RDMS in this application space, but in the meantime I am best versed in RDMS and SQL, so that is what I will use for this example.
Moving data to and from the database is only one part of this example, and it is also possible to abstract this activity so it is conceivable we could produce code that could optionally use any type of storage back end.
See also SQLAlchemy for Python, which can be used to abstract the database layer, making the same code usable with SQLite, PostgreSQL, MySQL and others. I haven't used it here, though, and I can't vouch for the performance of the generated queries versus hand-coded queries.
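As a sketch of what "any type of storage back end" might look like, the interface below separates what the coverage code needs (record a hit, merge results) from how it is stored. None of these class or method names exist in the example code - they are illustrative assumptions; an SQL, NoSQL or flat-file back end would each implement the same two methods.

```python
from abc import ABC, abstractmethod

class CoverageStore(ABC):
    """Hypothetical minimal interface a coverage back end would provide."""
    @abstractmethod
    def record(self, test, bucket, count):
        """Record `count` hits on `bucket` by simulation `test`."""
    @abstractmethod
    def merged(self):
        """Return total hits per bucket, merged across all simulations."""

class MemoryStore(CoverageStore):
    """Trivial in-memory back end; an RDMS-backed store would implement
    the same interface with INSERTs and a GROUP BY query instead."""
    def __init__(self):
        self.totals = {}
    def record(self, test, bucket, count):
        self.totals[bucket] = self.totals.get(bucket, 0) + count
    def merged(self):
        return dict(self.totals)

store = MemoryStore()
store.record("test_a", "fifo_full", 3)
store.record("test_b", "fifo_full", 1)
print(store.merged()["fifo_full"])  # 4
```

The coverage collection code only ever sees `CoverageStore`, so swapping MySQL for SQLite, or an RDMS for a NoSQL store, becomes a one-line change at construction time.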

Outline of post series

So how do we achieve our aim of storing functional coverage in a relational database? We'll need to design a database schema to hold the coverage data, which will be the subject of the next post. But we'll also require a methodology to extract functional coverage from a DUV. Here I'll show an example building on the VPI abstraction from the previous blog posts; although the implementation knowingly suffers from performance issues and fails to address the requirements of all coverage queries, it will suffice as an example. Once we can upload meaningful data to the database we can view it with a web based coverage viewing tool, which will present coverage data in a browser and allow some interactive visualisation. Following on from this is a profiling and optimizing tool to give feedback on which tests did what, coverage-wise.

Example Code

As with the previous blog post series I will release the code in a working example, this time as an addition to the original code. The sample code is available in verilog_integration at github.

The example requires the Boost and Python development libraries to be installed, and you'll need a simulator for most of the test examples (but not all - see later). Verilator is the primary target; you'll need 3.851 or later (because it has the required VPI support, although 3.851 is still missing some VPI functionality that will affect at least one of the functional coverage tests). However, everything should run fine on Icarus (indeed, part of the motivation here is to make everything run the same on all simulation platforms). Set VERILATOR_ROOT and IVERILOG_ROOT in the environment or in test/make.inc.
If you're just interested in running multiple coverage runs and then optimizing the result, you can do this without a simulator - see coverage.readme.

  % git clone https://github.com/rporter/verilog_integration
  % cd verilog_integration
  % cat README
  % make -C test
  % test/regress

Please also note that this is proof-of-concept code. It is not meant to be used in anger, as it has not been tested for completeness, correctness or scalability. It also has a number of shortcomings as presented, including the performance of the functional coverage collection code. It does, however, show what can be done, and how.
The next post will describe the database schema.

2 comments:

  1. Just discovered that RDMS is a valid alternative abbreviation for RDBMS...

    BTW the link to coverage.readme returns Error 404.
