9: Keeping Usability Testing Simple
Krug mentions that the most common call to a usability consultant comes from a firm that is launching its site in two weeks - a last-ditch effort to tidy up some odds and ends, mostly superficial things. Which is to say: it's too little, too late, and for all the wrong reasons.
Focus Groups are Not Usability Tests
A common mistake is attempting to use a usability test as a focus group. The two are nothing alike.
- A focus group brings together a group of people to gather their opinions about products, whether about their past experiences or their reactions to proposed concepts.
- A usability test involves one person at a time, observing them as they try to use something to determine whether they can.
The main difference is that a focus group gathers opinion and speculation, whereas a usability test determines whether something can be used for its intended purpose. Ultimately, people may say that they like something whether or not they can figure out how to use it.
Focus groups should be used to generate ideas, and usability tests to determine whether the ideas are any good in practice.
Basic Advice on Usability
Krug has three bits of advice about usability testing:
Usable Sites Require Usability Testing
Usability testing is indispensable to building a site that people will be able to use.
- You know too much about their business. The user doesn't have the same level of knowledge, and what seems intuitive to you is completely opaque to them.
- You have your own agenda. The user is not interested in doing the things you want him to do; he has an agenda of his own.
- Design theory doesn't always bear out. While a designer is trained in what ought to work, only testing will determine if it actually does.
He draws an analogy between testing a site with users and talking to foreign visitors about your country. They tend to be puzzled by the things you take for granted, and fail to notice the things you think should be prominent.
Testing Always Produces Value
Testing with one person is better than no testing at all, and this holds true even if he is the "wrong" user (does not match the demographics of your target market).
(EN: He goes off on a tangent after stating that, and I'm not sure I can accept the breadth of this statement. There are some things anyone can test - generally sites meant for "anyone" - but there are other things that can only be tested with a qualified participant. It would be a grave mistake to redesign a car after testing it on a person who has never driven.)
Early is Better
When you're at the point where the code has been written and the site is ready to release, changes are more costly than if they are caught early.
It's no trouble to change the color of a button - but if the steps of the task are in the wrong order, that's a show-stopper, and many firms would rather plow forward than redesign the entire process.
However, if usability testing is done with paper prototypes before the first line of code is written, then reordering the steps in the task is a much less significant change to make.
Testing on the Cheap
Usability testing isn't that intense: the basic approach is to give a person a task to do, a tool to do it with, and see if they succeed.
To gain the kind of scientific accuracy required for testing a new medicine, months of experiments are run with thousands of test subjects to make the test as accurate as humanly possible, so that the result can be proven with 99.99% certainty.
Because usability came out of an academic environment, it brought with it the scientific method and an inordinate amount of rigor and an unreasonable demand for precision. A usability test could cost $20,000 to $50,000 or more.
In 1989, Jakob Nielsen's paper titled "Usability Engineering at a Discount" suggested a less rigorous process, and he demonstrated statistically that you can get "good enough" results with far fewer participants. This dropped the price for a test to under $10,000 per round.
Even this is a lot of money, particularly for smaller firms, and the price kept many from doing adequate usability testing. It can be done even cheaper than that.
Krug suggests that getting a professional involved and renting a lab is still a good idea if you can afford it. But if you can't, you don't have to dispense with usability testing altogether - you can give up a little accuracy and comprehensiveness and get it done on the cheap.
Here's how that can be done:
- Run a test that takes an hour or less, and you can easily do four rounds of testing in one morning or afternoon.
- Reward participants with a $25 gift card, and it costs $100 to get four people to test
- Bring participants into a regular office rather than a lab. Let observers watch through a video feed to a conference room
- Your "report" can be a bulleted list of items in an email, indicating the problems you witnessed
Test Early and Often
He suggests that every web development team should have one day a month when they do usability testing.
This approach keeps the test simple, and enables the team to test often. Testing is not a major event that requires weeks of preparation.
Limiting it to just once a month also enables you to focus on what is most important, so that you will get what you need, and it gives you a month to correct any issues you find.
A routine test schedule means there is going to be a test on a specific date - maybe the second Friday of every month. People expect it to happen, and you don't have to argue that something needs to be tested before you can schedule a test.
It is also easier for people to make time on their schedules to observe the test. It's not a special event that is announced on short notice, but something known far in advance that can be done as a matter of course.
How Many Repetitions Do You Need
Krug suggests that three is the magic number. In terms of the scientific method and statistical significance, people will argue that this is not enough, and that you need at least thirty repetitions to be statistically valid. That is true, but it does not matter.
The purpose of a usability test isn't to prove something. You are not a scientist, and you are not likely to publish your results in academic journals. You are only looking for indications of things that might be problematic. So all you need are insights, not ironclad proof.
You don't need to find all the problems - just the big ones. It's especially important because you don't generally have the resources to fix all the problems, so there's little sense in investing in a test that will find things that you don't have the time, money, and bodies to address. Three users is enough to find the major problems, and their results likely will keep your team busy for the rest of the month.
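The "a few users find most of the big problems" claim can be illustrated with the problem-discovery model published by Nielsen and Landauer, which estimates the share of usability problems found by n participants as 1 - (1 - L)^n, where L is the probability that a single user encounters a given problem (Nielsen's commonly cited average is about 0.31). This model comes from Nielsen's work, not from Krug's text; a minimal sketch:

```python
def problems_found(n, discovery_rate=0.31):
    """Estimated proportion of usability problems uncovered by n test
    participants, per the Nielsen/Landauer model: 1 - (1 - L)^n.
    The 0.31 default is Nielsen's published average discovery rate."""
    return 1 - (1 - discovery_rate) ** n

for n in (1, 3, 5):
    print(f"{n} users -> {problems_found(n):.0%} of problems")
# With L = 0.31, three users already surface roughly two-thirds
# of the problems; five users reach about 85%.
```

Under this model the marginal value of each additional participant falls off quickly, which is the statistical basis for Krug's "three is the magic number" advice.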
(EN: One of the ironies of this is that the people who argue for rigor are the same ones that refuse to pay for it. If you're testing on the cheap, it's because the project sponsor doesn't want to pay for a laboratory test. And if it's not important enough to pay for the rigor, then it's not important.)
Finding Participants
Participants can be found in many places - user groups, trade shows, even using Facebook to find them.
But much depends on the precise kind of user you're looking to attract. You can easily find college kids and get them to come in for a $25 gift card - but if you're looking to test doctors or people with high net worth, your cost and effort will both be higher.
In the end, Krug balks and suggests downloading the "How to Recruit Participants for Usability Studies" pamphlet from the Nielsen Norman Group website.
Test Locations
Formal usability testing often takes place in a laboratory environment, set up with the test device (computer or mobile phone) in a room rigged with sound and video recording equipment and a separate observation lounge.
Informal usability testing can be done in your own office or in a conference room. Anywhere you have a computer and two chairs.
However, it should be a rather quiet space so that the session is not interrupted and the participant doesn't feel intimidated by the observers. It's a good idea to put the observers in another room and let them view remotely, watching the participant through a camera and the screen through a screen-recording program. This can be done very cheaply with today's equipment and software.
Test Facilitator
The proctor or facilitator of a test can be anyone, provided they can control their own behavior.
You will need to refrain from interfering unnecessarily, so that the participant does what he is inclined to do rather than following your instructions. You need a lot of patience, calmness, empathy, and the ability to listen attentively.
Aside from providing basic guidance (not step-by-step instructions) and observing, the facilitator should encourage the participant to "think out loud" as much as possible, to get an idea of the reasons behind any difficulties he encounters.
Test Observers
Krug insists that as many people as possible should observe the usability test. There is nothing quite as convincing to people as seeing things for themselves - and for many, it can be a transformative experience that teaches them humility about their attempts to predict what will work.
These individuals will need to be set up in an observation room (which can be a conference room), wired in so that they can see the test in progress, but remote enough that they cannot interrupt the test.
He also suggests involving them in the evaluation: get the people in the room to write down the three most serious usability issues they observed, so that they feel included in the process.
(EN: I have had problems recently in the observation room, with people whose behavior is counterproductive - such as those who insist on discussing solutions while the test is going on. Some coaching is needed.)
Choosing What to Test
In a perfect world, you could test everything - but unless you have an inordinate amount of time and money, that likely won't be possible.
And so, choose the tasks that are most important to get right, so that the site fulfills its purpose.
An e-commerce site will want to test how users find a product, add it to a shopping basket, and go through the check-out process. If they have to log in to shop, you will need to test creating an account as well.
There are also tasks of secondary importance. If a site requires users to have accounts, creating an account is not the only task. People will need to log in when they return to the site, retrieve a forgotten password, retrieve a forgotten username, update their account information, etc.
Also, keep in mind that a participant will need more time in a test situation - about twice as much as an experienced user needs to complete the same tasks. They will struggle with some tasks, you will be talking to them about their impressions, etc.
As an aside: be careful about the way you describe tasks to a user so that your instructions do not coach him to do things he might otherwise not be inclined to do. Ask someone to find an article about something and he will zoom in on the word "articles" in your navigation - it would be better to ask him to find "information" and see if he goes for the articles on his own.
Sample Test Agenda
Krug provides a sample agenda for a one-hour usability test:
- 4 minutes are spent welcoming the participant and explaining the situation
- 2 minutes are used asking general questions about the participant to put them at ease and gauge how they match your audience
- 3 minutes can be spent on a "home page tour" to let them orient themselves before performing a task (EN: this seems odd to me - the tests I've seen let them do this on their own, rather than a separate task)
- 35 minutes are left on the task itself. The participant is told what he needs to do, along with essential details to do it, and let loose on the test model. The proctor asks questions such as "what are you thinking?" or "what do you expect?" from time to time.
- 5 minutes are spent in a debriefing, asking them to reflect on the task they just performed and asking questions about what was simple or difficult about it
- 5 minutes are the wrap-up, thanking them, paying them, and showing them out.
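Tallying the agenda above shows that it deliberately schedules only 54 of the 60 minutes, leaving a few minutes of slack for overruns. A quick sanity-check sketch (segment names are paraphrased from the list above):

```python
# Krug's sample one-hour agenda, in minutes per segment.
agenda = {
    "welcome and explanation": 4,
    "background questions": 2,
    "home page tour": 3,
    "tasks": 35,
    "debriefing": 5,
    "wrap-up": 5,
}

total = sum(agenda.values())
print(f"scheduled: {total} min, slack: {60 - total} min")
# -> scheduled: 54 min, slack: 6 min
```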
Sample Test Session
(EN: Krug spends several pages on a sample test session, showing illustrations of what is on the screen, describing how the user behaves, and modeling some of the dialogue between proctor and participant. It follows the pattern above and is too idiosyncratic to annotate.)
Typical Problems
Krug describes the three most typical problems discovered in usability testing:
- Confusion. The user simply doesn't "get it." They look at the site and don't know where to begin, or confidently set about doing the wrong thing.
- Language. The user knows what he is looking for and can name it, but the words on the page don't match the name he has in mind
- Perception. The user is unable to see what he needs because he is distracted from it, either by something specific that takes attention away or the collective confusion of the page
Deciding What to Fix
After the test, hold a debriefing - this should be done as soon as possible, while the experience is still fresh in everyone's mind. Testing in the morning and ordering lunch is often a great way to keep people around (and get more to show up).
Each round of usability testing provides a number of indications of problems on the page that get in the way of the user accomplishing the task. The debrief allows people to talk about what they witnessed, and to negotiate which findings are worth fixing.
One danger: people like to solve simple problems. They will totally go for changing the color of a button, and shy away from rearranging the content of the entire page. Krug advises that you should be ruthless in fixing the most serious problems first.
(EN: If you can keep the discussion focused on problems observed, rather than specific solutions, the important things are easier to agree upon. However, you also have to coach people to avoid jumping to solutions, as it seems to be the natural inclination to look for a quick fix, rather than considering the actual nature of the problem.)
He lists some of the techniques he uses:
- Ask each person to make a list as they observe, then compile them into a single list at the start of the meeting. Keep count when multiple people noticed the same thing.
- Vote on the ten most serious problems on the list, informally. Use that as well as the count to rank them.
- Do not throw away anything. Even the problems that aren't serious can go on a "wish list" for when time permits.
- Ensure all feedback is based on actual observations. A session such as this is an inviting opportunity to inject personal tastes and pet peeves.
- Take user suggestions with a grain of salt. Test participants will want to tell you what to do or suggest new features. The point of the test is to determine whether what exists is usable - not to allow users to add to scope based on personal preferences.
- Disregard temporary problems. There may be moments where a user hesitated or went astray but then discovered the right path. Take the attitude that "all's well that ends well."
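The compile-count-and-vote steps above amount to a simple tally. A minimal sketch of the mechanics, assuming each observer's notes are a list of issue descriptions (the observers and issue names here are invented for illustration):

```python
from collections import Counter

# Hypothetical issue lists written down by three observers during a session.
observer_lists = [
    ["checkout button hard to find", "login label confusing", "too many form fields"],
    ["checkout button hard to find", "too many form fields"],
    ["checkout button hard to find", "search results unranked"],
]

# Compile into a single list, counting how many observers noticed each issue.
tally = Counter(issue for issues in observer_lists for issue in issues)

# Rank by count. Nothing is thrown away: low-count items at the bottom
# become the "wish list" for when time permits.
for issue, count in tally.most_common():
    print(f"{count} observer(s): {issue}")
```

In practice the informal vote would then adjust this ranking, but the count alone already surfaces the issues that multiple people witnessed independently.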
(EN: Krug also suggests that he lets people talk solutions - "how are we going to fix this in one month?" but I find that to be a bad practice. Unless they are designers, they don't know how to fix a design problem, and the solution may be worse than the problem. Again, focus on identifying issues in this meeting, and defining solutions is a separate process.)
Testing Alternatives
Krug mentions a few new tricks in testing:
- Remote Testing. The user does the task in his own home or office while allowing you to observe remotely. This is even cheaper because you don't need equipment and don't have to travel. It also engages people in their own environment, using the equipment they will actually use.
- Un-Moderated Remote Testing. The user is in his own space, but also on his own time. All instructions are received in text, and the user takes the test and sends you a recording.
(EN: Krug mentions no drawbacks and provides no testimonials on these. Some of the limitations are obvious, and my sense is that they need more shakedown time to become useful. Wait for others to experiment and improve.)
Dealing with Objections
Krug offers some snappy answers to common objections to doing usability testing:
- "We don't have the time." - Usability takes less time than the incessant arguments over what people think are usable. It saves the time it takes to re-do things when peoples' opinions are wrong.
- "It's too expensive." An adequate test can be done for a few hundred dollars, and the cost of doing it beats the cost of rework
- "We don't have the expertise." A usability expert will get better results, but anyone can do an adequate job and get good-enough results.
- "We don't have a lab." You don't need one. All you need is a computer and two chairs. Cameras and screens for observation are also good, but optional.
- "We wouldn't know what to do with the results." When it comes to usability, the results tell you what to do, and the most serious problems are hard to miss.