Tuesday, July 21, 2009

Front Loading Testing… For teams with a high Dev to Tester Ratio

I was talking to part of a team at lunch today, and it was evident they were in pain.
They had one tester, and a large number of devs.
The tester was not able to keep up with dev productivity, and as a result the debt in test was such that the devs were now ceasing new development and instead helping with the testing debt.
While it's awesome they did that, the fact that it was necessary was, and is, a problem.

The problem with this approach, as John Sullivan @Sensis once pointed out to me, is that the practice is “Ping Ponging”: you throw resources at one problem (which more than likely won't solve the core problem, so it will happen again), and then you take them back and use them elsewhere once the immediate pain has been alleviated.

The most obvious and common perceived solution is simply to get more testers.
Of course, the words “easier said than done” don’t quite cover the futility of this. Even if you had a Stork that somehow magically delivered more testers, you probably still would not be working efficiently, and something would need to change to make good use of said Stork-delivered product.

What I usually propose in this situation is to front load testing, that is- to bring the preparation earlier into the life cycle.
Usually, when the tester is slammed, so is the Business Analyst. They’re the two frequent pain points aka bottlenecks.

I suggest you get your Tester and your BA to work together on the stories and do the acceptance criteria and test cases up front as part of your normal planning phase.
What this means is, your tester and your BA have brainstormed, asked questions, written down all the test cases they can think of. Any questions that needed the BA to go ask the business, they’ve most likely done before a dev started producing code in anger. When the story hits “Ready For Development”, it really means that the story requirements are there, the test cases are there, and you could go write your automation, or if you decided to skip that step and go write the dev code you could (but as a TDD proponent I’d advise against skipping it).
It also decreases the chance that the story will “bounce” (god I hate that term, but anyway…). Bouncing is when testing shows the functionality was not implemented as the requirements suggested, and the story subsequently goes back into development. Sometimes it's ok, but frequent occurrence may suggest a bit of a problem. Be careful not to place too much value on 'bounce counts', though; it can really kill team morale.

If we do this, what is the real benefit?
It means that the tester has performed the most important service that a professional tester can, the one that is specialised to them. The rest of the test work should be able to be implemented by developers, with some follow up by the tester.
The refinement of the story is a very important step; when it occurs late, especially after development, it is very painful.
The list of test cases is important.
Some test cases are quite appropriate for automation; some may not be, and may need hands-on testing, or working with integration points and data preparation that you may not have the ability to control.
It should be a warning sign if the tester never does anything later in the lifecycle. At some point they should be checking that all the boxes have essentially been ticked for the story to be complete. Sometimes it's as simple as loading up a web page in a browser for a basic check, then executing the automation against an integration environment and seeing it pass, and bingo: complete story.

Be careful about roles. They are a guide, not a limiting definition.
Tester and BA, whoops I did it.
When using the approach I’m describing, the line between the two should be blurred- sometimes completely broken. I expect it; in fact I count on it.
In many ways the tester behaves like a BA when they write test cases, because test cases are requirements.
I’m sure some of you devs out there have had a tester come up to you about a problem for which you couldn’t see a requirement in the story. e.g. They tried to submit a form without entering anything in. The smart play by the tester is to go to the BA (or business in absence of a BA) and ask what the behaviour should be. When they agree with the tester, the test case is now a requirement. It’s that simple, end of debate, it was missing in the original story, and now it isn’t.
Using the approach I’m describing, the dev would never have seen the story without that requirement.
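
To make the empty-form example concrete, here is a minimal sketch of a test case written down as a requirement before any dev code exists. Everything in it is an assumption for illustration: the `AcceptanceCase` record, the `validate_form` rule and the "Contact form" story are hypothetical names, not taken from a real project.

```python
from dataclasses import dataclass


@dataclass
class AcceptanceCase:
    """One agreed test case; once the BA signs off, it is a requirement."""
    story: str
    given: dict
    expected_errors: list


def validate_form(fields: dict) -> list:
    """Hypothetical form rule: every field must be non-blank."""
    return [f"{name} is required" for name, value in fields.items()
            if not str(value).strip()]


# The empty-submission case, captured before the story is "Ready For Development".
EMPTY_SUBMIT = AcceptanceCase(
    story="Contact form",
    given={"name": "", "email": ""},
    expected_errors=["name is required", "email is required"],
)


def check(case: AcceptanceCase) -> bool:
    """Return True when the implementation meets the agreed test case."""
    return validate_form(case.given) == case.expected_errors
```

The point of the sketch is the ordering: `EMPTY_SUBMIT` exists as data the BA and tester agreed on, and the dev's job is to make `check(EMPTY_SUBMIT)` come back true.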

“but if you move a tester forward, you still have one resource, and they’re still constrained, how does that help?”
The answer is this: they are no longer the sole resource doing the tasks they previously did, and those tasks can now be done by others more efficiently (devs are experts at writing code; testers are not, no matter how much we try to be).
Additionally the tasks they do up front can be:
  • Estimated and tracked easily, they’re just brainstorming and writing test cases (acceptance, functional, non-functional etc.)
  • Have substantially less overhead, as the tester won't be bogged down in one story; they'll move through them rather more quickly, without compromising on coverage.
  • The tester will become an expert on the functionality, reducing your co-dependence on your BA. When your BA is in a meeting with business, your tester will be around to answer questions.
  • Can be done while developers are busy working on the current iteration.
    While devs are implementing an iteration, the tester and the BA are preparing the next.
    It will be painful during the first time through, consider an Iteration 0 for technical debt to assist if possible.
  • The test case list can be discussed as to who should/could do each item, and then distributed. The automation may be done by devs, and the manual tests may be run by business Subject Matter Experts (SMEs). If you're still short, you can talk to the business about getting more help in, and it will be a quantifiable task list. Should they decide not to, then quality is what they're sacrificing, but at least they're given the choice.

Why is it that Testers don’t build communities, but developers do?

Ok so I decided to do a bit of a search for testing communities- found this link several levels deep from a google search:
http://www.getzephyr.com/resources/qa_communities.php

Let me say off the bat, I’m already a member of the Software Testing Club.
In my home city, Melbourne (Australia), there is MAST (Melbourne Association of Software Testers). In Sydney where I’m currently at- there is no such group (that I'm aware of).

There are many “Communities” out there but I have trouble calling them that.
Many are centered around a given testing tool and are often provided by the tool's manufacturer, i.e. implementation communities, less practice oriented. HP/Mercury have one; even Selenium, from my company ThoughtWorks, has a community, the majority of which I would say are lurkers or problem posters. (I must confess, I was such a person, and now that I know enough about Selenium, I seldom hang out on openqa.org.)

It made me wonder why testing communities, especially those that meet in person and discuss testing and testing challenges, tend not to be more common...
Developers, on the other hand, revel in getting together and talking/arguing.

Why don’t testers do this?
Is it that we can go ask developers for technical problems, whereas developers only have other developers to go to?
Are we self isolating- and by and large- happy that way?
Are we consumers rather than contributors?

When I reflect back on the time I've spent as a tester, I got to work with other testers very frequently, so I can't attribute it to forced isolation, at least not in my case.
I want to see communities be created, and I want to see them become creative, self-evolving entities. The absence of such organizations bothers me, makes me feel that sometimes testing has become rather stale.

Tuesday, July 14, 2009

Automation: we hit a wall of creativity.. and nobody seems to have realised it.

If you ask a person on the street (not a programmer/tester) what they think automation means- they may well say it means this:



or perhaps even:



another possibility:




And technically, they're right.. they are all forms of automation.

Ask a developer what automation is to them- it will be probably something like this:
It starts with this:


and then becomes...



or even another derivative output:


There are some patterns...

1) the former examples create something..
a car..
a ticket you can use to catch a train with..
a machine that sorts mail..

2) They are all examples of something humans used to do.
humans were a bottleneck when producing cars, and quality was variable which is not desirable..
humans selling tickets are less cost effective than a machine that works 24/7/365 days a year (presuming they're not vandalized ;))...
mail sorting by hand is slow, and prone to human error...

When we look at the computer version, we do not create a tangible object so much, as we create information.
Programming automation, specifically, test automation- tells us whether something met a certain specification; an expectation.
There are also other forms of automation in common existence that lay people may be aware of:
  1. automatic debiting of bank accounts
  2. spam mail
  3. search engine bots/traversers
  4. TiVo/PVRs
The thing about each of these, is they have more in common with the earlier industrial/ticketing examples, than they do with test automation.

Debiting results in money being moved from one place to another... while they're just 1's and 0's and nothing physical is involved, still- there is a metaphor that most people understand that the transaction is akin to moving physical money.

Spam mail is an annoyance, and one that is difficult to stop. The automation involved is brought to our attention through its promiscuous nature; if it were quiet and subtle (like legitimate email distribution), it would most likely go unnoticed.

Search engines return results for a query from an index built by robots searching and caching. It's information that they return, but it is tangible information that a user can view and interact with. Probably the closest product to Test Automation so far.

TiVos and PVRs take a schedule, or a user's viewing habits, and record programs for them; they manage their own space, and purge old recordings automatically. An example of a more complex automation existing in the consumer space.

What is interesting to me, is that Testing automation has pretty much stopped progression at producing a report, a green, a pass- or alternatively a fail.
We produce a report, which may be transformed into HTML, or even sent via email- but that's about it.
While other forms of automation in the outside world have moved forward, getting faster, better, taking complex actions and bundling them up into accessible packages- we still are using, and indeed re-implementing the same old concepts.
We seem stuck.
Reluctant to take forward steps...
We seem comfortable, not seeking more.
You're probably asking- what more? What is he talking about? Isn't green/red enough?

Here are some examples of directions we could go further down...

  1. Automatic Analysis of a test result.
    We currently collect the minimum of information during a test.
    Something met what we were expecting, or it didn't.
    Boolean, yes, no- that's it.
    In some cases, we normalize our comparison with tolerances, just so it is less likely to fail. Big deal.
    What I'd expect, is that when a fail occurs, the test should interrogate the app, understand what went wrong and record this information.
  2. Automatic Defect Creation / Updating
    Tied with #1, upon analysis, we check our bug database to see if this issue has occurred before.
    Bug bug = findBugInBugDatabase(testResultPayload);
    if (bug.exists)
    {
        // Known defect: if it had been closed, this failure re-opens it.
        if (bug.status == "closed")
        {
            bug.status = "re-opened";
            bug.incidence_count += 1;
            bug.addNewReason(testResultPayload);
        }
    }
    else
    {
        // First sighting: create the defect and record its failure pattern
        // so later failures can be matched against it.
        bug.status = "new";
        bug.recordDefectPattern(testResultPayload);
        bug.incidence_count += 1;
        bug.addNewReason(testResultPayload);
    }

    ok, you get the gist.
    I've done this before on a big product, but never in a Continuous Integration/Agile environment, and the pattern matching was pretty crummy. There are plenty of ways of doing this, e.g.
    * disabling exception handling and using exceptions as a language to communicate to tests so the detail can be captured
    * using a web service to call when a test fails, passing in the component under test, the field being verified, and the particular error and using a server piece to examine the source code and return back a likely match.
  3. Automatically Close a Bug (reverse of above, but dependent on..)
    If we already know we have a failing test, and we know the pattern that caused it, then conversely we know the bug has been resolved when that test passes again. There's no reason why the return trip isn't possible too.
  4. Self Correction.
    For example, a test fails. The test targets Feature blah, and Feature blah is made up of functionality.cs. Developer/Tester x changed that file in the last check-in, and the execution before that check-in passed. The automatic action to take: revert developer/tester x's changes, and notify them you've done this to protect the codebase. Suggest they check/re-run the test before trying again.
  5. Short Cut to Fail
    TestNG does this well with its test dependencies: should a particular suite depend on a particular test passing, e.g. a smoke test, and the smoke test fails, it skips the dependent suite as it knows not to bother. (In TestNG's terms this is a hard dependency, the default; a soft dependency would run the dependents regardless.)
  6. Short Cut to Pass
    Conversely, if we know that one area of the build has not changed but another has, follow the functional and integration dependencies and run only the tests that are absolutely needed. (I believe Clover does this, but I haven't seen the functionality myself.)
  7. Code suggestions
    If a result is a fairly common one, suggest a fix. For example "you've gotten a null pointer exception, ensure the following variables/objects are set: " etc. We have access to the call stack, so there are things we can do to make the programmer's job easier. We're not using any of them right now.
  8. Tests for Free
    If we know that no functionality should take more than n number of seconds to run, irrespective of whether a test looks for this behaviour, test based on this behaviour anyway, so it can be dealt with.
    Another manifestation of this might be, we have a text field. We know that text fields are what we type into, and we submit them. Text fields are also where someone might want to inject SQL to attack a database behind the scenes. If we had a library that automatically applied pressure on this behind the scenes, that would be of value wouldn't it?
  9. Metrics
    They are something that we typically build on later, or rely on a CI tool to demonstrate for us. Most frequently, they depend on the unit testing framework's results to produce metrics from. Why do we use CI tools this way? Couldn't we do this better in the test frameworks themselves? Possibly with a more flexible method that allows us to track various different factors and essentially data mine them for later benefit? Point out trends to us, perhaps automatically?
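
Items 2 and 3 in the list above (create or re-open a defect on failure, close it on the return trip) can be sketched roughly as follows. This is only an illustration under stated assumptions: `BugDatabase` is a hypothetical in-memory stand-in for a real defect tracker, the failure "pattern" is just a string key, and counting every occurrence (not only re-opens) is a choice made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Bug:
    pattern: str
    status: str = "new"
    incidence_count: int = 0
    reasons: list = field(default_factory=list)


class BugDatabase:
    """In-memory stand-in for a real defect tracker (hypothetical)."""

    def __init__(self):
        self._bugs = {}

    def record_failure(self, pattern: str, reason: str) -> Bug:
        """Create a bug for a new failure pattern, or re-open/update a known one."""
        bug = self._bugs.get(pattern)
        if bug is None:
            bug = Bug(pattern=pattern)
            self._bugs[pattern] = bug
        elif bug.status == "closed":
            bug.status = "re-opened"
        bug.incidence_count += 1
        bug.reasons.append(reason)
        return bug

    def record_pass(self, pattern: str):
        """Close the bug tied to a pattern once its test passes again."""
        bug = self._bugs.get(pattern)
        if bug is not None and bug.status != "closed":
            bug.status = "closed"
        return bug
```

The hard part in practice is not this bookkeeping but deriving a stable pattern from a failure, which is exactly where crude matching lets the idea down.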
While there are proprietary tools that do some of these, most of us wouldn't use them unless we had to, and when it comes to lightweight tools, there is not much out there. I have a side project working on some of these, but it distresses me that these aren't openly observed goals in the broader community.
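
As a taste of how lightweight some of these could be, "Short Cut to Pass" (item 6) reduces to a lookup once you have a map from tests to the files they exercise. The sketch below assumes such a map already exists; `TEST_DEPENDENCIES` and the file names are hypothetical, and building the map accurately is the real work.

```python
def select_tests(changed_files, test_dependencies):
    """Return the tests whose declared dependencies overlap the changed files."""
    changed = set(changed_files)
    return sorted(name for name, deps in test_dependencies.items()
                  if changed & set(deps))


# Hypothetical map from test name to the source files it exercises.
TEST_DEPENDENCIES = {
    "test_login": ["auth.py", "session.py"],
    "test_checkout": ["cart.py", "payment.py"],
    "test_profile": ["auth.py", "profile.py"],
}
```

With this map, a change to `auth.py` selects only the login and profile tests, and a change to a file no test depends on selects nothing.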

Anyway, I think I've droned on enough for this sitting, but I hope my thoughts were successfully transferred and add some value to those of you out there inclined to sit through to the end ;-)

Thursday, July 9, 2009

Continuous Integration without a CI server...

As an opening comment, I've borrowed extremely heavily from the work from C Stevenson on dealing with a Legacy project in an Agile way. I consider this paper a must read, I hope those guys in the UK forgive the amount I've poached ;-)

Having a server that does CI (Continuous Integration) for you, does not mean you understand CI, or are doing it well.
You should probably take the time to understand CI before just pressing that chubby digit on that mouse button and downloading your CI weapon of choice.

I believe, you should be able to do CI without them.

The goal of CI, is to provide rapid feedback after integrating many people's work.
CI server tools are one way of doing it. It can be done differently.

I think the view of CI server tools such as Cruise, CruiseControl/CruiseControl.NET and Hudson (to name just some popular ones) is to centralise and manage the pipeline of changes in a nice orderly fashion.
Centralisation has its advantages; I have no argument here.
However what I'm proposing is decentralising, by front-loading the responsibility of integration on the developer who is about to check in.
Source control systems handle change just fine.
You can do it without a CI server.

I think CI servers do a great job at what they do, I cannot fault them in their purpose or execution (some of them better than others of course), but I've noticed a few anti patterns spring up through some people's dependence on them.
I call it "leaning" on the CI server.
What I'm talking about is doing a very brief "Smoke" test and then checking in.
You know full well that the test isn't comprehensive, and that there is substantial risk, indeed likelihood, of a break occurring, but you lean on the server to deal with that instead of taking responsibility yourself.
It is an artificial speed that is gained through this anti pattern.
When (and I mean When) it fails, it then takes more than double the normal build time to fix it.
How?
1) The time it took to break in the first place
PLUS
2) Whatever time it took to fix the issue and check in the fix
PLUS
3) the time it took the build to run a second time to pass.
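
The three terms above add up as follows. This is a small sketch of the arithmetic only; it assumes the failing run takes a full build before it reports the break, and the function and parameter names are made up for illustration.

```python
def broken_build_cost(build_minutes: float, fix_minutes: float) -> float:
    """Minutes from the breaking check-in until the build is green again."""
    failed_run = build_minutes   # 1) the run that broke
    fix = fix_minutes            # 2) diagnosing, fixing and checking in the fix
    passing_run = build_minutes  # 3) the re-run that passes
    return failed_run + fix + passing_run
```

Because two full builds are involved, any fix time at all pushes the total past double the normal build time: a 10 minute build with a 15 minute fix costs 35 minutes.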

This presumes, for a moment, that nobody else has checked in on the broken build; checking in on a broken build is how you end up with a perpetually broken build.
If this presumption is correct, and people are abiding by it, it means there can be a considerable time during which an entire team cannot check in their work.
Checking in should be something they can do whenever they want, we should be enabling it, not finding ways and tools of disabling it or hampering it.

The more mature places I've seen have run their full tests as a pre-commit (thanks for reminding me of this Julio!). I think they've got the right idea, and it's where I'm going with this piece; but before I get into that, there are some other pre-requisites I'd like to address.

The method I'm going to propose is very much contingent on the quality of the code, and indeed how it is being tested before it is checked in..

This is a method I was only recently introduced to by other thoughtworkers at a recent coding dojo. I must say- I thought I understood TDD before this but I was indeed wrong. Once you follow this method, it truly changes your approach to development.

To me, when you say you use TDD to implement your code- this is what I think you mean..
  1. You write your unit test to what you expect a given method should return
  2. You write the bare minimum of functional code to get that test to pass
  3. You refactor/clean up.
  4. You re-test and ensure it still passes.
  5. You write your next test and repeat.
When you hit a wall and need to refactor (and you will), you refactor your implementation code, in both your worker methods and your unit tests. In the tests you change not the assertions, just the data being asserted on, e.g. an integer becomes an object that represents what was an integer.
The underlying point of the test should remain.
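
Here is a small illustration of that refactoring rule, assuming a made-up `total` function whose result started life as a plain integer. `Quantity` and `TotalTest` are hypothetical names; the only thing the sketch is meant to show is that the data changes type while the assertion keeps making the same point.

```python
import unittest
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """After the refactor: an object that represents what was a plain integer."""
    value: int


def total(quantities):
    """Bare-minimum implementation: sum the wrapped values."""
    return Quantity(sum(q.value for q in quantities))


class TotalTest(unittest.TestCase):
    def test_total_adds_all_quantities(self):
        # Before the refactor this read: self.assertEqual(total([2, 3]), 5)
        # Only the data changed type; the point of the assertion is the same.
        self.assertEqual(total([Quantity(2), Quantity(3)]), Quantity(5))
```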

If you follow this method- your code that you check in should always be tested. There should be extremely high trust as there should be no gaps, even just at a coding level.
If other devs were doing this too- you'd be more likely to trust their work (presuming they weren't exhibiting anti patterns such as James Carr speaks of).
If you all use the same method rigorously, you will begin to trust their work, and they yours. The only real risk that remains is in integration testing and functional testing, which require more than just unit testing.

If you had done just this, then updated to the top of the given trunk/branch you were working from, and then run your full functional, non-functional (e.g. performance, static analysis, dynamic analysis, security) and integration tests before checking in, would you say that would give you confidence you're not negatively impacting the code base?

I put it to you that it would, and indeed that you have made a CI server redundant.
Especially if you can turn around and talk to your team and tell them what you're checking in.
You've covered all the important functions that a CI server does for you, but you've done it by talking to other human beings who are directly impacted by what you have changed.

The point of this, is that if everyone is following this standard, and they're always only checking in on a passing build- coupled with communication- the build should never be "broken".
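
The pre-check-in routine described above (update, run everything, only then check in) can be sketched as a simple gate. This is a minimal sketch, not a real tool: the step names are illustrative and each check is a placeholder callable, where in practice it would shell out to your own source control and test commands.

```python
def precommit_gate(steps):
    """Run each (name, check) step in order; stop at the first failure.

    Returns (ok, name_of_failed_step_or_None).
    """
    for name, check in steps:
        if not check():
            return False, name
    return True, None


# Hypothetical steps; each lambda stands in for a real command.
STEPS = [
    ("update to head of trunk", lambda: True),
    ("unit tests", lambda: True),
    ("functional and integration tests", lambda: True),
    ("non-functional checks", lambda: True),
]
```

Stopping at the first failing step keeps the feedback quick, and the name that comes back is also what you would tell the team about if you needed help.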

Some of you at this point are probably saying "Nice theory", but you need to consider that being an agile project does NOT mean downloading Cruise Control, getting a build light, stand ups, retrospectives.
Those are activities and tools people do/use when they need to provide feedback or receive it, to encourage communication.
If you don't need a stand up, don't do it! (Jeff Rogers once pointed this out to me.) If you can communicate effectively amongst your team without it and everyone understands, it's all fine.
Same applies to the build.
You could write post it notes that say 6 words- as long as everyone understands and knows what they need to do- you are fine!
I think the biggest mistake we make is defaulting to rigorous tools and process, instead of just the process we need to produce working software often. Especially when we're breaking in new people, the wrong message can be sent.
The first time I worked on an 'agile' project, I completed it and went onto the next without ever understanding what the hell agile really even meant. It is all too easy to confuse tools and process with being agile.

I'm about to change tack here for a moment.
Up until this point I've been talking from a very high level "you should" position. I'm now going to switch to a more "how to" mode... keep in mind, some of these things won't work for you.
Use your own head (consider the context of your situation) after reading what I've just said- what I'm about to cover now is one possible interpretation of it borrowing once again heavily from Stevenson et al.

  1. Talk to your team about how they'll write TDD code. You all need to be on the same page, and you need to enforce it- equally, without fear or favour. Nobody has a get out of jail free card. You're part of a team of equals. Rogues need to be hauled in or moved on. You cannot afford to suffer fools with this approach.
  2. You need to treat your code base as you would your most prized possession, with its ownership split evenly among your team. If you treat your codebase as a rubbish dump, this won't work.
  3. When you perform architecture refactorings, you need to do it as a team, not as a pair.
    Get your whole team involved. No developer I've ever met likes seeing 325 changed files in an update and discovering "oh yeah, I refactored the architecture, want me to explain what I did?"
    Discuss it beforehand around a whiteboard, then all of you spend whatever time you need implementing the changes.
    Do it as a team, that way everyone understands the changes, and do it often throughout the project.
  4. Aim to keep your build to 5 mins. Push as much testing as you can into unit and low level functional tests (e.g. JavaScript tests such as Screw.Unit), and only use slow technology for tests that really need it (e.g. frameworks like Selenium RC; compared to unit tests and JavaScript tests, they're slow).
  5. Use parallelization to manage your execution of slower tests. Inventory them often, and merge tests that are expensive or test little. A method I've seen used successfully is to use a tool like Selenium Grid: set up runner agents on all dev machines (hide them away in a VM somewhere; MS VirtualPC is free, and there are other open source alternatives), and use the grid controller to point your tests at your new little bot net of grid agents on your developers' machines, to speed up your tests and allow your developers to maintain a 5 minute build target.
    If you exceed 5 minutes, understand why it's exceeding 5 minutes, and aim to get it back to 5 minutes. Add more agents, or re-examine your tests.
    Team morale is directly tied to build time (I agree Julio! have seen this on more than a few projects)
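
The idea in step 5 (spread the slow tests over several agents and keep an eye on the 5 minute target) looks roughly like this in miniature. It is a sketch under loose assumptions: local threads stand in for grid agents, each test is a plain callable returning pass/fail, and `run_parallel` and `within_budget` are names invented here, not part of Selenium Grid.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_parallel(tests, workers):
    """Run test callables on a pool of workers; return (results, elapsed_seconds)."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda test: test(), tests))
    return results, time.monotonic() - start


def within_budget(elapsed_seconds, budget_seconds=5 * 60):
    """True when the run stayed inside the build-time target (5 minutes by default)."""
    return elapsed_seconds <= budget_seconds
```

When `within_budget` comes back false, the two levers are the ones already named: raise `workers` (add agents), or go back through the tests themselves.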
Anyway, that's it for now. If you'd like me to go into more detail on any facet of this post, please let me know.