Tuesday, 7 August 2012

Testing Chain of Inference: A Model

Pass? What's a pass?

I've heard these questions recently and wonder how many people think about this. Consider automated test suites: isn't a pass, a pass, a pass? Consider also regression testing - especially where tests are repeated, or repeated in a scripted fashion.

The following is an exploration of traps in thinking about test observations and test results. To help think about them, I tried to model the process.

Gigerenzer described a chain-of-inference model, ref [1], which he used to illustrate potential mistakes in thinking when considering DNA evidence in legal trials.

Observed Pass -> Confirmed Pass Model
I applied the chain-of-inference idea to a "test pass":

Elements in the Model

Reported Pass. This is one piece of information on the road to making a decision about the test. It could be the output of an automated suite, or a scripted result. 

Product Consistency. Think: What product are you testing and on what level? This influences how much you will know about the product (or feature, or component, etc). Which features or services are blocked or available? Third-party products in the build - are their versions correct? Is any part of the system or product being simulated or emulated, and how does this reflect what you know about the product? What provisioned data is being used, and does that affect what you know about the product? What risk picture do you have of the product, and has the test information changed that?

Test Consistency. Think: Has my test framework or harness changed, and does this affect what I know about the product? Has the product changed in a way that may affect the testing I do? Has there been any change in typical user behavior, and if so, does this affect the information I think I'm getting from the test (the test observations)? Is any behavior simulated, or not known, in the test?

Confirmed Pass. Think: Taking into account the test observations, the information about the product and test consistency, can the result be deemed "good enough"? If not, is more information needed - maybe testing with different data, behavior or product components? (Note, "good enough" here means that the information from the test is sufficient - the test/s with the data, format, components and environment. It is not a template for an absolute guide (oracle) to a future "pass" - that would be falling for the representation error…)

Representation Error. A short cut in the evaluation of the result is made, specifically excluding information about product or test consistency. Think: This is equivalent to telling only a partial story about the product or testing.

Ludic Fallacy. Ref [2]. This is the simplification of the decision analysis (about the reported pass) to "a pass is a pass is a pass". Think: An automated suite, or a script, that produces a pass - that's a pass, right? The assumption that an observed "pass" is a real "pass" simplifies away the judgement about whether a test result might be good enough or not.

Proxy Stakeholder Trap. Think: "Pass? That means the product is ok (to deliver, release, ship), right?" It is quite fine for a stakeholder to set their own "gate" for how to use a "pass". The trap is for the tester who makes the jump and says, "the pass means the product is ok to…" - this is reading too much into the observation/s and turning into a proxy stakeholder (whether by desire or accident).

This model helps visualize some significant ideas.
  1. A reported pass is not the same as a confirmed pass.
  2. The labelled lines show traps in thinking about a test - shortcuts in thinking that may be fallible.
  3. There is no direct/immediate link between a test being "OK" and a product being "OK".
  4. Observing an automated/scripted "pass" implies that a decision or judgement about the "pass" is still needed - to decide whether it is good enough or not.
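The distinction between a reported pass and a confirmed pass can be sketched in code. This is a minimal illustrative sketch with hypothetical names (`TestReport`, `confirm` are mine, not from any real framework): the consistency questions above become explicit checks that sit between a green tick and a confirmed result.

```python
# Illustrative sketch: a reported pass only becomes a confirmed pass
# after the product- and test-consistency questions have been answered.
from dataclasses import dataclass, field

@dataclass
class TestReport:
    reported_pass: bool
    product_consistent: bool   # e.g. right build, third-party versions, data
    test_consistent: bool      # e.g. unchanged harness, valid assumptions
    notes: list = field(default_factory=list)

def confirm(report: TestReport) -> bool:
    """A reported pass is confirmed only when consistency is established;
    otherwise it needs further judgement or investigation."""
    if not report.reported_pass:
        return False
    if not (report.product_consistent and report.test_consistent):
        report.notes.append("Reported pass, but consistency unchecked: investigate")
        return False
    return True

# The harness reported green, but the harness itself changed:
r = TestReport(reported_pass=True, product_consistent=True, test_consistent=False)
print(confirm(r))  # → False: a green tick alone is not enough
```

Treating the reported pass as the confirmed pass is exactly the shortcut (the ludic fallacy) the model warns about: the code collapses to `return report.reported_pass`.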

Q: Are all reported passes confirmed in your environment? 

Test Observation -> Test Result Model
The above model can be used as a basis for thinking about the chain of inference in test results and reports, and for extending this thinking to exploratory approaches.

Elements in the Model

Test Observations. These are the notes, videos, memories or log files made during the testing of the product. They may cover the product, the test environment, the test data used, the procedures and routines applied, feelings about the product and the testing, anomalies and questions.

Product Elements. Think: Putting the components of the product into perspective. This is the product lens (view or frame, ref [3]) of the test observations. Are the test observations consistent with the product needs / mission / objectives?

Testing Elements. Think: The parts of the testing story come together - this is the test lens (frame) on the test observations. Are the test observations consistent with any of the models of test coverage? Do they raise new questions to investigate? Are there parts that can't be tested, or tested in the 

Project Elements. The aims of the project should ultimately align with those of the product, but that is not always the case. Therefore, the project lens (frame) needs to be applied to the test observations. Are they consistent with the immediate needs of the project?

Test Results. Think: Pulling the test observations together - both raw and with the perspectives of the project, product and testing elements - compiling and reflecting on these. Is there consensus on the results and their meaning or implication? Has the silent evidence, ref [4], been highlighted? What has been learnt? Are the results to be presented in a way tailored for the receivers?

Context-free Reporting. Think: When the thinking about the test observations and results is not consistent with the project, product or testing elements, the context of the result is lost. The result becomes context-agnostic - a context-free report.
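The lens idea can be made concrete with a small sketch (the structure and names here are my own illustration, not a real tool): an observation carries the product, testing and project lenses with it, and a report built from observations with no lens applied is context-free.

```python
# Illustrative sketch: raw observations tagged with the three lenses
# (product, testing, project) before being compiled into a result.
observation = {
    "note": "checkout response slowed to 4s under load",
    "lenses": {
        "product": "conflicts with the responsiveness objective",
        "testing": "not predicted by the current coverage model",
        "project": "relevant to this iteration's performance goal",
    },
}

def is_context_free(obs: dict) -> bool:
    """An observation reported with no lens applied is context-free."""
    return not obs.get("lenses")

print(is_context_free(observation))              # → False: context travels with it
print(is_context_free({"note": "it passed"}))    # → True: a context-free report
```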

Individual -> Multiple Test Observation Model
Combining the above two models gives:

This has been a look at some traps that can occur when thinking about test observations and test results, and when implicitly making decisions about them.
  1. A test observation and a judgement about a test observation are different.
  2. A test result and a decision about a test result are different.
  3. A test result and a feeling about a test result are different.

[1] Gerd Gigerenzer, Calculated Risks: How to Know When Numbers Deceive You (Simon & Schuster, 2002)
[2] Wikipedia: Ludic Fallacy: http://en.wikipedia.org/wiki/Ludic_fallacy
[4] The Tester's Headache: Silent Evidence in Testing