Thursday, 27 June 2013

Test Bench Control Part Seven

Data Presentation

The web browser became the de facto GUI platform a number of years ago. It has the performance and features required, augmented by a rich set of 3rd party libraries to aid implementation.
We have already shown how, when running a regression, the results of individual simulations can be asynchronously uploaded to the database as they run; now we can read those results back out in real time from another process - also asynchronously.
A huge amount of work has gone into web technologies, many of which are available as open source. Most of this tooling is of high quality and well documented. The web also provides a large body of tutorials, examples and solutions to previously asked questions.
We can pick up some of these tools and build a lightweight web server to display the results of simulations and regressions. We will do this in Python (again, for the same reasons as before), using bottle as a WSGI-enabled server. As with other parts of this example it is not necessarily optimized for scale, since here it serves static as well as dynamic content from a single thread (check out e.g. uWSGI, which can simply be layered on top of bottle, for more parallel workloads), but it suits very well here because it is simple and easy to get started with.

Technologies


In this example we use the following tools and libraries:
  • Python, with bottle as a WSGI-enabled micro web framework and the standard json library for serialisation
  • JavaScript and jQuery in the client browser
The layout of the pertinent parts of the example repository:
  • www/ - root of web server
Data is mainly served in JSON format. Previous experience has shown me that this is a useful alternative to serving pre-rendered HTML, as we can reuse the JSON data to build interactive pages that do not require further data requests from the server. Note how the tree browser and table views are generated in the regression hierarchy viewer of the example using the initially requested JSON of the entire regression. Additionally, it is trivial to serialize SQL query results to JSON. Modern browser performance is more than sufficient to render JSON to HTML and to store potentially large data structures in the client. If, however, you wish to serve pre-rendered HTML, be sure to use a server-side templating engine instead of emitting ad hoc HTML with print or write commands.
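As an aside on serializing SQL query results, a minimal illustration of the point (this is not the project code, and it assumes an sqlite3 database purely for the sake of example) might look like this:

  import json
  import sqlite3

  def query_to_json(db_file, sql, args=()):
    "Run a SELECT and return the result serialised as a JSON string."
    with sqlite3.connect(db_file) as conn:
      conn.row_factory = sqlite3.Row   # rows become addressable by column name
      rows = conn.execute(sql, args).fetchall()
    return json.dumps([dict(row) for row in rows])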

Starting the Server


The micro web server can be executed from the command line thus

  % www/report

and serves on localhost:8080 by default (check out the options with -h for details on how to change this).
Point your browser at localhost:8080 to view the dashboard landing page.


How it Works


www/report.py uses bottle.py in its capacity as a micro web framework. Requests come in from a client browser and are served. Bottle decorators are used to associate URLs with Python functions; for example, static files in www/static are routed straight through:

  # 'static' holds the path to the directory of static files (www/static)
  @bottle.get('/static/:filename#.*#')
  def server_static(filename):
    return bottle.static_file(filename, root=static)

We route the other URLs using the bottle.route function, but not as a decorator, which is the usage usually shown in the documentation. Instead we pass in a function that creates an instance of a handler class and executes its GET method with any arguments passed to it. This allows us to define the URLs to be served as a list of paths and associated classes.

  urls = (
    ('/index/:variant', index,),
    ('/msgs/:log_id', msgs,),
    ('/rgr/:log_id', rgr,),
  )

  # serve() creates a fresh closure for each handler class; routing straight
  # to a lambda inside the loop would capture only the last value of cls
  for path, cls in urls:
    def serve(_cls):
      def fn(**args):
        return _cls().GET(**args)
      return fn
    bottle.route(path, name='route_'+cls.__name__)(serve(cls))

The classes that handle the requests are simple wrappers around library functions that execute the database queries themselves. Serialisation to JSON is done by the Python json library. We ensure by design that the query result is serialisable to JSON, and that's it on the server side; rendering to HTML is done by JavaScript in the client browser using the JSON data.
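As an illustration only (the exact API differs; database.messages here is a stand-in for the real query class in database.py), one of these handlers might look something like this:

  import json
  import bottle
  import database

  class msgs(object):
    "Wrap a database query and serve its result as JSON."
    def GET(self, log_id):
      result = database.messages(log_id)   # stand-in for the real query call
      bottle.response.content_type = 'application/json'
      return json.dumps(result)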
The query-containing classes are declared in database.py and make heavy use of itertools-style groupby. However, I've borrowed some code from the Python documentation and updated it to provide custom factory functions for the objects returned. The motivation is to be able to perform an SQL JOIN where there may be multiple matches in the second table, so that the columns from the first table are repeated across those rows. The groupby function allows this to be returned as a data structure containing, firstly, the repeated columns just once and, secondly, a list of the multiple matches. The objects returned are instances of a class that inherits from dict but adds an attribute getter based upon the column names returned from the query, so object.log_id will return the log_id column if it exists - more readable than object[0] or object['log_id'].
For example, take the following SELECT result

  log_id  msg_id  msg
  1       10      alpha
  1       11      beta
  2       20      gamma
  2       21      delta
  3       30      epsilon
  3       31      zeta

When grouped by log_id it could be returned thus

 [
  {log_id:1, msgs:[{msg_id:10, msg:alpha}, {msg_id:11, msg:beta}]},
  {log_id:2, msgs:[{msg_id:20, msg:gamma}, {msg_id:21, msg:delta}]},
  {log_id:3, msgs:[{msg_id:30, msg:epsilon}, {msg_id:31, msg:zeta}]}
 ]

Such that result[0].msgs[0].msg is alpha or result[-1].msgs[-1].msg_id is 31.
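A minimal sketch of the idea (the real database.py is more general than this): itertools.groupby does the grouping and a small dict subclass provides the attribute access.

  import itertools
  import json

  class record(dict):
    "A dict whose keys can also be read as attributes, e.g. row.log_id"
    def __getattr__(self, name):
      try:
        return self[name]
      except KeyError:
        raise AttributeError(name)

  def group_msgs(rows):
    "rows is an iterable of (log_id, msg_id, msg) tuples, ordered by log_id"
    return [record(log_id=log_id,
                   msgs=[record(msg_id=m, msg=s) for _, m, s in group])
            for log_id, group in itertools.groupby(rows, key=lambda r: r[0])]

  rows = [(1, 10, 'alpha'),   (1, 11, 'beta'),
          (2, 20, 'gamma'),   (2, 21, 'delta'),
          (3, 30, 'epsilon'), (3, 31, 'zeta')]
  result = group_msgs(rows)
  assert result[0].msgs[0].msg == 'alpha'
  assert result[-1].msgs[-1].msg_id == 31
  print(json.dumps(result))   # the same structure serialises directly to JSON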
From here mapping to JSON is easy, and a table can readily be created dynamically in the client with some simple JavaScript; we also retain this data structure in the browser for creating other abstractions of the data. For example, in the log tab we have a dynamic verbosity-adjusting slider and a severity browser created from the same JSON structure that created the message listing itself.

All the rendering to HTML is done in the client browser. As already mentioned, I find that the data can be reused client side, so we don't need to keep querying the database or add a cache (e.g. memcached) or session-management middleware.
Cross-browser compatibility is provided by jQuery, although I developed this using Chromium (so try that first if anything seems to be broken).
JavaScript can be an interesting learning exercise for those with time. It has a functional flavour, with functions as first-class objects, so there is a lot of passing of function references. Scoping is interesting too; I find myself using closures a lot.
The code is here - I'm not going to go through it in any detail.

Layout


The initial state of the dashboard is three tabs. These each contain a table listing, respectively,
  1. all the log invocations
  2. all regression log invocations
  3. all singleton log invocations (those without children)
Clicking on a row will open a new tab:
  • If the log invocation was a singleton, a new tab will be opened with the log file
  • If the log invocation was a regression, a new tab will be opened containing an embedded set of two tabs:
    • The first containing two panes:
      • A tree-like hierarchy browser
      • A table of the log invocations relating to the activated tree branch in the hierarchy browser
    • The second containing the log file of the top level regression log invocation

The log files are colourized to highlight the severity of each message and each message also has a tooltip associated with it that provides further details of the message (identifier if applicable, full date and time, filename and line number).


As part of this demonstration the logs are presented with some controls that hover in the top right of the window. These allow the conditional display of the emit time of the message, and the identifier of the message if any is given. They also allow the message verbosity to be altered by changing the severity threshold of displayed messages. Additionally there is a message index that allows the first few instances of each message severity to be easily located - reducing the time required to find the first error message in the log file. It can even be useful when the log file is trivially short as the message is highlighted when the mouse is run over the severity entry in the hierarchy browser.

Further Functionality


The given code is just a demonstration and there is much more useful functionality that could easily be added.
  • Test result history : mine the database for all previous test occurrences and their status
  • Regression triage : group failing tests by failure mode (error message, source filename, line)
  • Regression history : graph regression status history. Filter by user, scheduled regression, continuous integration runs.
  • Block orientated dashboard : A collection of graphs with click throughs detailing information pertaining to a particular block. Think regressions, coverage, synthesis, layout.

Command Line


We may also want to get the same data in text format from the command line, especially at the termination of a regression command. We can reuse the libraries from the web presentation to gather the information before serializing it to text.

  % db/summary 668
  (        NOTE) [668, PASS] should be success
  ( INFORMATION) 1 test, 1 pass, 0 fail

We could also generate trees of sub-nodes by recursively selecting the children of logs until no more were added, or generate the whole tree by determining the root node and then cutting the required sub tree out. This is left as an exercise for the reader.
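As a starting point for that exercise, here is one possible sketch of the recursive approach, assuming a hypothetical helper children(log_id) that returns the immediate child log invocations of a log:

  def subtree(log_id, children):
    "Return log_id and all of its descendants as a nested structure."
    return {'log_id'   : log_id,
            'children' : [subtree(child, children) for child in children(log_id)]}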
It would also be possible to request the JSON from a running web server. We could allow a client to request data in a particular format with the Accept HTTP header, e.g. Accept: text/XML

  % wget --header "Accept: text/XML" \
      localhost:8080/rgr/530 -O 530.xml

But many libraries are available to use JSON, so this is also left as an exercise for the reader.
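For those who want to try it, one possible shape for honouring that header in a handler (a sketch only: to_xml and database.regression are hypothetical stand-ins, while bottle.request and bottle.response are standard bottle objects):

  import json
  import bottle
  import database

  class rgr(object):
    def GET(self, log_id):
      result = database.regression(log_id)   # stand-in for the real query call
      # honour an Accept header asking for XML, otherwise default to JSON
      if 'xml' in bottle.request.headers.get('Accept', '').lower():
        bottle.response.content_type = 'text/xml'
        return to_xml(result)                # hypothetical XML serialiser
      bottle.response.content_type = 'application/json'
      return json.dumps(result)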
