Gaia/Email/LoggestTestFramework
Overview
gaia-email-libs-and-more currently uses a largely custom testing framework right now that was built for the deuxdrop project based on experiences trying to test Thunderbird.
Files Referenced
Comments do exist in the source files that are worth checking out:
- log.js Logger/actor/etc. implementation and infrastructure
- testcontext.js: Test definition support.
- testdriver.js: Runtime support
Concepts
Asynchronous Steps
Each test is organized into one or more asynchronous steps. Each step runs until all of the asynchronous operations in the step are completed. Each step has a timeout; if the timeout fires before the asynchronous operations complete, the step fails and the test case fails and the test driver framework just tries to run the cleanup code for a test.
There are a few different types of test step; the delineation was just made to try and provide meta-data that could help automated tooling or those reading the tests. The TestContext class defines the following methods:
- setup(): A test step that is just trying to set up the test; these kinds of steps should usually pass unless something is very broken in the system.
- convenienceSetup(): A setup step that is automatically generated by testing support code rather than explicitly written to be part of the test. For example, when our test involves talking to an IMAP server, setting up the account happens automatically.
- action(): A test step that's trying to do something. Currently there is no concept of a 'convenienceAction', although action steps can be marked as "boring" so the UI displayed them with less contrast.
- check(): A test step that's just trying to check the state of the system and not perform any actions. Currently there is no concept of a 'convenienceCheck', although check steps can be marked as "boring".
- cleanup(): Performing shutdown operations where we either care to make sure that the shutdown happens for correctness reasons or where we are trying to clean-up after resources that won't be automatically destroyed. These steps currently will be performed if we abort the test to provide them a chance to perform necessary cleanup, but we might want to stop doing that.
- convenienceCleanup(): like convenienceSetup, a cleanup step that is automatically generated.
- convenienceDeferredCleanup(): a convenienceCleanup that is added to the list of steps after the defining function finishes running. Usually a function that calls convenienceSetup() will also call convenienceDeferredCleanup() to ensure its cleanup function is invoked at the right time.
Test steps are defined/declared as 'test definition' time. Let's look at the boilerplate for a test:
// (this is global code that is run when the JS file is evaluated; but we use // AMD-style modules, so the body of our module is not actually evaluated // until our module is require()d) define(['rdcommon/testcontext', './resources/th_main', 'exports'], function($tc, $th_main, exports) { // This is module-level code that is run when the module is evaluated var TD = exports.TD = $tc.defineTestsFor( { id: 'test_blah' }, null, [$th_main.TESTHELPER], ['app']); TD.commonCase('I am a test', function(T, RT) { // This is test case definition code, it is run when we want to set up our // test. Our calls to T.check/T.action/etc. produce test steps as a // byproduct of our being run. T.action('action 1', function() { // I run asynchronously. I will not run until after the defining function // has fully completed running. }); T.action('action 2', function() { // I do not run until 'action 1' has completed in its entirety, including // any asynchronous operations it was waiting for. }); // actions can also be created in a loop, or by other functions. The calls // to T.action() just need to complete before this function returns. }); // (end 'I am a test') }); // (end module 'define' call)
Code is run like so:
- The 'I am a test' function runs, defining one or more test steps. In this case, it defines 'action 1' and 'action 2'.
- 'action 1' is run. We do not advance to the next test step until all tracked asynchronous operations complete or we time out.
- 'action 2' runs after 'action 1' and all of its asynchronous operations have completed.
- the test is over!
Loggers, Expectations, and Actors
The idea at the heart of this test framework is that logging helps us understand complicated asynchronous processes that are operating in parallel.
The e-mail back-end is instrumented throughout with named loggers. Typically, each notable class in the system has its own structured logging definition. A logger instance is instantiated for each instance of the class and has the same lifetime as that class. Whenever something notable happens, we log a named log entry along with data that provides context.
The testing framework is built around specifying expectations about the log messages that our system (and our test infrastructure) will log while it is run.
"Actors" in the test framework are the term given to the test-framework's counterpart to a logger.
In a unit test, the following code creats an actor that we have dubbed 'A' for UI display purposes:
var protoActor = T.actor('FooProtocol', 'A');
The actor will correspond to a logger that would be created something like this:
var protoLogger = LOGFAB.FooProtocol(this, parentLogger, 'name');
The definition for the logger would look something like this:
var LOGFAB = exports.LOGFAB = $log.register($module, { FooProtocol: { type: $log.CONNECTION, subtype: $log.CLIENT, events: { connected: { hostname: true } disconnected: {} }, TEST_ONLY_events: { connected: { username: false } }, }); // end LOGFAB
The key life-cycle issues related to actors and expectations are:
- How do we know when an actor should be created? This is especially important when multiple loggers of the same type are going to be created.
- How do we know when we care about what happens on a logger/actor versus when we don't care?
This is primarily handled by having the test step definition explicitly reference the actors of interest during a step. If log entries are made against an actor that are not expected or log entries don't match what was expected, an error is generated and the test step fails.
Example:
T.action(actorClient, 'talks to', actorServer, 'using', actorProtocol, function() { actorClient.expect_open('server.nul'); actorServer.expect_accept('client.nul'); actorProtocol.expect_connected(); // and then get the ball rolling. realClientObject.connect('server.nul'); });
testhelpers: Extending actors via mix-ins
In gaia-email-libs-and-more, there is a lot of complex state that we want to generate expectations about, but which would be an absolute nightmare to deal with manually. Also, sometimes you need to do the same thing in multiple tests and don't want the negative impacts of copy-and-paste.
So we need helper code. The implementation that has evolved is to create 'test helpers'. These are actors that define their own test loggers. They are supported by an infrastructure that allows 'TESTHELPER' definitions that specify mix-ins to include in actors. The infrastructure supports a '__constructor' that allows the actor to create its own logger and perform other setup.
The main testhelper file in gaia-email-libs-and-more is th_main.js. It defines the 'testUniverse' and 'testAccount' test helpers. The 'testUniverse' takes care of setting up most of our e-mail back-end test environment; it instantiates a front-end MailAPI, a back-end MailUniverse with MailBridge, etc. All kinds of complex stuff. 'testAccount' automates the creation of an e-mail account.
By convention, they expose a number of different types of methods with some naming conventions:
- do_BLAH(): The do_ prefix indicates that the function will add one-or-more test steps. The function should be called at test definition scope.
- _do_BLAH(): The leading underscore indicates that this is mainly supposed to be a helper method for other methods on this testhelper and should not be directly used by tests.
- expect_BLAH(): The expect_ prefix indicates that the function will add expectations inside the implied current step. It is intended to be executed inside of a test step to help set expectations. This naming convention is misleading since it is also the same convention used by actual log entries, so it can be hard to tell what's a helper method doing something complex and what is actually just an expectation on a logger.
Things
A 'thing' in our testing infrastructure is a named object that the test framework is aware of that isn't going to have its own logger in the implementation. Something that is acted upon rather than acts.
For example, mail folders or mail messages are represented in our tests as things. The support from our test infrastructure is that things support our testhelper mix-in mechanism, and they are understood by the test framework for logging purposes; you can include them in the description of test steps and they will be serialized and have UI support in the log viewer.
There is some special functionality in the UI and the system to support assigning a 'digital name' to things. This was used in the deuxdrop project's unit tests to make the logging system aware of the public keys in use by the unit test. The logging framework would log the public keys without any greater meta-data, but then the logging system could see the keys and map them back to human readable names like "C root" to indicate it was test user C's root signing key. This has not been used in the e-mail app yet, but might be useful to map IMAP UIDs or the like.
Lazy Loggers
The potentially annoying thing about this strategy is that you need to have loggers to log and expect things on. And logger definitions are a bit of a hassle to create; it would be annoying and provide very little benefit if you had to create a whole bunch of custom loggers and entries for each test.
So, for this reason we have the lazy logger. Create one like so:
var eLazy = T.lazyLogger('some-name');
(The 'e' prefix is an unimportant naming convention. The original name of 'actor' was 'entity', but entity was deemed even more confusing than actor. And then switching to an 'a' prefix would be weird because that's the mozilla convention for prefixing argument variables. Now it can be thought of as indicating "something you can set expectations on". Really, just name things whatever you want.)
The lazy logger is a fairly straightforward actor with self-creating logger. It provides the following expectation methods:
- expect_event(name): Expect a named event without a payload that we want to equivalence-check. If there's a payload to check, use expect_namedValue().
- expect_namedValue(name, value): Expect some a value that you want to describe so that someone looking at the unit test has some idea what they're looking at. Because of how our equivalence testing works, you could just use expect_value(value) and forget about the label, but your test will make less sense to people looking at it.
- expect_value(value): Just expect a value. Think of this like an unlabeled assertion.
- expect_eventD(name): See .eventD below.
- expect_namedValueD(name value): See .namedValueD below.
And these are the corresponding logging methods:
- event(name): Log the named thing
- namedValue(name, value): Log the name and value.
- value(Value): Log the value
- eventD(name, detailToShowButNotCompare): This variant lets us log an additional piece of information that will be displayed in the log but will not be equivalence tested. For example, your DOM event object may be full of all kinds of useful data, but a lot of that may not matter to your test or be full of random things like timestamps that is hard to predict.
- namedValueD(name, value, detailToShowButNotCompare): Like eventD, you can provide an extra argument that's not equivalence tested. 'name' and 'value', however, do get equivalence tested.
So here's an example of how you might have a simple (dumb) test step work:
var eLazy = T.lazyLogger('check'); T.step(eLazy, 'my math', function() { eLazy.expect_namedValue('addition', 8); eLazy.expect_namedValue('subtraction', 4); eLazy.namedValue('addition', 6 + 2); eLazy.namedValue('subtraction', 6 - 2); });
All that really matters is that we set the expectations before we log the results.
You can create more than one lazy logger. And if you have multiple asynchronous things happening, you should.
Single-Step tests via TD.commonSimple
If you are just writing unit tests and don't need a test case with multiple steps, you can use TD.commonSimple instead of TD.commonCase.
It looks like this:
TD.commonSimple('test my math', function(eLazy) { eLazy.expect_namedValue('zero', 0); eLazy.namedValue('zero', 0); });
You get a free lazy logger.
Implementation Details
Object equivalence
We have comparison helper functions 'smartCompareEquiv' and 'boundedCmpObjs' in log.js at line 847(currently).
The basic properties this are:
- === fast-path and for things that aren't objects
- arrays and objects use our object comparison function
- we have a finite recursion depth on arrays and objects in order to avoid dying horribly on circular object structures. The current depth is 6, which was arbitrarily chosen based on the depth of data structures in deuxdrop.
- objects that implement toJSON() have toJSON called and we compare that
- we iterate over the contents of an object using for...in
Neat testhelper things
testUniverse in th_main.js will automatically thunk "console" for you so that it shows up in the unit test logs. If you aren't creating a testUniverse, you might want to use its thunkConsoleForNonTestUniverse helper.
The fruits of all of this
Here's an example of what our unit test output looks like: https://clicky.visophyte.org/examples/arbpl-loggest/20120621/unit-test-level/
And here's an example of what our runtime log output looks like (at 'general' logging level): https://clicky.visophyte.org/examples/arbpl-loggest/20130521/
Learn more about enabling runtime logging at Gaia/Email/SecretDebugMode