3 - Prioritize Testing Opportunities
At the onset of an optimization program, the options for testing can be overwhelming - there are thousands of pages and each contains dozens of things you might tinker with, and you cannot test everything at once.
Even if you had unlimited resources, you would still face a number of limiting factors. The amount of traffic your site receives will determine how long a test must run to achieve statistical significance, the capacity of your testing software to track multiple variables is likely limited without degrading responsiveness, and some things you wish to test may be mutually exclusive.
As such, prioritizing tests is critical - and largely based on three criteria: potential, importance, and ease.
Use Data to Prioritize Tests
Sharp constrictions in your conversion funnel are a clear indication of areas that are in need of improvement - but some intuition is involved. Although the data may show that customers are dropping out in a specific place, the real problem might be at an earlier step in the process.
One example is shopping cart abandonment in ecommerce sites, which generally has nothing to do with the shopping cart page itself. In one instance, visitors had to add an item to the shopping cart to find out the shipping charges. The author suggests that moving the shipping information to the product detail pages would alleviate the problem. (EN: More likely it would not - if the customers felt the shipping costs to be too high, they would abandon anyway. The real fix to this problem is to negotiate better shipping rates or offer free shipping, not move the unpleasantness to a different page.)
Web site analytics data can also be useful. This identifies the path users take through your site, from the page that they land on (which isn't always your home page) through the one that they leave on. You can follow the path users take through your site to see where they are dropping out, and this generally indicates problem areas. The author goes into tedious detail about this.
There's a sidebar in which the author grouses about design standards - standards documents are generally used to ensure consistency, but the author feels they are too restrictive and prevent testing new ideas. (EN: This is a matter of practice - if the standards are used as an excuse not to try anything new, that's a problem. But in most instances standards are flexible, and if you can prove a different approach is better, either an exception can be made or the standards can evolve. It does require testing to be more methodical - i.e., if you're proposing to change the color of a button, you must test it on all pages and prove it works site-wide, not just in one flow, or the design of the site becomes an inconsistent patchwork. It's pretty easy to test site-wide changes if the site is well coded and your testing software is good.)
Prioritize Pages with High Potential for Improvement
One factor by which to prioritize test opportunities is their potential for improvement. For example, if you have a three-step flow in which 25% of people bail out before finishing step one, 10% bail before finishing step two, and 15% bail before finishing step three, then the first step is the biggest problem, with the most potential for improvement.
At this point "a certain amount of personal judgment and experience" is needed to generate hypotheses as to what the problem might be (and per an earlier point, where it actually is), and more qualitative methods like customer feedback, usability tests, and focus groups can help suggest where it might be.
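The arithmetic behind the three-step example can be sketched in a few lines of Python. This is only an illustration: the visitor counts are hypothetical, chosen to reproduce the 25%, 10%, and 15% figures, and the sketch assumes each percentage is measured against the visitors who entered that step.

```python
def dropoff_rates(counts):
    """Return the fraction of visitors lost at each step of a funnel.

    counts[i] is the number of visitors who entered step i + 1; the last
    entry is the number who finished the final step. Counts are assumed
    to be positive.
    """
    rates = []
    for entered, survived in zip(counts, counts[1:]):
        rates.append((entered - survived) / entered)
    return rates


# Hypothetical counts: entered step 1, step 2, step 3, then completed step 3.
funnel = [20000, 15000, 13500, 11475]
rates = dropoff_rates(funnel)

# The step with the largest drop-off is the first candidate for testing.
worst_step = max(range(len(rates)), key=rates.__getitem__) + 1

for step, rate in enumerate(rates, start=1):
    print(f"step {step}: {rate:.0%} drop-off")
print(f"largest drop-off: step {worst_step}")
```

Per the earlier caution, the step with the largest drop-off is where the symptom shows up, which is not necessarily where the cause lies.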
(EN: The author champions Google Analytics, and has to do a whole lot of explaining, on the verge of writing a software manual, as to the clumsy way he works around its limitations. I'm skipping a lot of this nonsense - as it's completely unnecessary if you have the right tool for the job.)
He does note that many conversion funnels begin with persuading the customer to do something, then end with giving the ability to do it, and the two parts are distinctly different in the kinds of problems they pose. Persuasive problems generally pertain to a credible and relevant message whereas transactional problems generally pertain to usability issues.
(EN: Another issue with the author's model is that it is transactional. It does not care what the user does once the transaction is complete, until they come back to perform another transaction. This is a serious problem that leads firms to ignore or neglect customers between transactions, which is a critical time for customer retention and a strong determinant as to whether the customer returns or chooses a separate merchant the next time. I don't expect this author to have a broader vision than a single transaction, which is a serious limitation.)
(EN: Next, the author becomes blinded to the granular details, as the Google Analytics tool only delivers data to the page, and does not measure fields within a page or the user's scrolling and mouse movement activity - which leaves him to stare dumbly at the page and guess what might have gone wrong. My sense is that this is not very helpful, either. Skipping forward to a place where he seems to have regained his footing.)
Quantitative data provides a general overview of the behavior of many people, and is very good at measuring things that can be reduced to numbers - which is to say it provides a sense of what is happening, but not why it is happening. To understand the reasons behind the numbers often requires qualitative research, which gathers non-numeric data.
(EN: It's worth noting that many firms skirt this by attempting to guess what the numbers mean without doing research. That seldom goes well, and when it does it is often coincidence rather than inspiration.)
The author describes usability testing as a process by which you invite individuals to complete a series of tasks on your site, speaking aloud to tell you their train of thought as they proceed through the pages. It can be very enlightening to hear what goes through people's minds as they interact with a site, and you will quickly notice how things you assumed were simple and obvious seem to perplex users.
On-site surveys are another option. Various tools enable you to survey users, even those who leave the site without taking any action. A qualitative survey asks open-ended questions that enable the respondent to provide information for you to consider. The three key questions to ask are the purpose of the visit, whether they were able to complete their task, and what they found helpful or hindering along the way. These can help pinpoint specific issues to address, as well as things you should be reluctant to tinker with.
The author also adds his own questions to the mix: whether they would recommend the products to others, what they consider to be the product's main benefits, and what kinds of information the site does not provide. Such questions can help identify when satisfaction with a site visit is because of the merchandise rather than the site's design. The author cautions against misusing these surveys for market research - marketing surveys seek to gain broader and deeper insight into the customer's mind, whereas here you merely wish to know about their experience of the site.
It's also suggested that these surveys must be very short or people, particularly non-buyers, will not be willing to invest the time. (EN: One approach to getting a breadth of information is to ask different questions of different visitors. The number of responses matters less to qualitative research - so getting twenty responses to each of twenty questions is more valuable than getting four hundred responses to the same question. It's not going to be statistically analyzed.)
Another approach to getting feedback is to leverage email surveys. (EN: The author takes a very sloppy stab at this, and even crosses a few lines into very bad practices, so I'm dropping his suggestions.)
The author notes that "There is significant danger of overconfidence in qualitative information." It is anecdotal in nature, and even when a number of people seem to be telling the same stories, it is still a small group of people. This is sometimes exacerbated by researchers who wish to aggrandize the results, indicating that 75% of customers believe X when they have interviewed four people.
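To put a number on how little "75% of four people" says, one option is a confidence interval for the proportion. The sketch below uses the Wilson score interval, which is my choice of method and not something the author prescribes, and it treats the four interviewees as a random sample (itself a generous assumption for qualitative research).

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a proportion observed in a small sample."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Three of four interviewees agreed: the "75% of customers" claim.
low, high = wilson_interval(3, 4)
print(f"plausible range: {low:.0%} to {high:.0%}")
```

With three of four respondents, the interval runs from roughly 30% to 95%, which is to say the interviews are compatible with almost any conclusion.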
There is also the propensity to interpret the findings of qualitative research to suit preconceptions. Consider when a participant indicates that something is "all right" - such a remark can be reported as a high level of enthusiasm or being virtually indifferent.
(EN: Seems to me the author totally missed one of the best forms of qualitative information: actually listening to what your customers say on their own. One of the weaknesses of market research is that it prompts people to speak, and causes them to make up something to say. Unsolicited remarks such as Web site feedback and comments on blogs and social media are a more genuine form of expression, where you can find the answers to questions you never thought to ask.)
Prioritize Important Pages
Another way to prioritize testing is to seek to improve the pages that are the "most important," though the author waffles on the criteria that should lead you to regard them as important.
One thing that might be considered to make a page important is the level of attention it receives: a page that receives 10,000 views a day certainly seems more important than one that receives only 1,000 views a day. (EN: Be careful of this assumption. A page viewed infrequently by visitors who then go on to purchase from you is more important than one viewed very frequently by visitors who do not purchase at all.)
A certain level of traffic is necessary for a page to be testable at all within a reasonable amount of time. As a rough guess, the author suggests that if you have 1,000 visitors a day to a given page, then an A:B split test will take about two weeks to return statistically significant results. If it's only 100 visitors a day, the test might fail to produce reliable results in a year.
(EN: The estimate seems about right in my experience, with some refinement: you should look to the volume of traffic not just to the page, but to the page at the end of the line. That is, it's 1,000 hits to the page that loads when a form is submitted, not to the page containing the form. Also, the length of time depends on the number of variations. If it's only two versions in an A:B test, 1,000 visitors a day can produce results in only a few days; but if it's eight variations, three weeks to a month is a more reasonable expectation. Finally, if you run your test on a smaller portion of the traffic, it will take longer to complete - if the test version is shown to 10% of the audience rather than 50%, you will need more time to get results.)
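A back-of-the-envelope way to see how traffic, the number of variations, and the traffic split interact is the standard sample-size formula for comparing two conversion rates. This is a sketch under stated assumptions, not the author's method: the 5% baseline conversion rate, the 20% relative lift, the significance level, and the power are all hypothetical inputs, and the function names are mine. It also ignores any correction for comparing many variations at once.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variation(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed in each variation (two-sided z-test)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_finish(per_variation, daily_visitors, smallest_share):
    """Days until the variation receiving the least traffic has enough visitors.

    smallest_share is the fraction of daily traffic sent to that variation,
    e.g. 0.5 for an even A:B split or 0.125 for an even eight-way split.
    """
    return ceil(per_variation / (daily_visitors * smallest_share))


# Hypothetical inputs: 5% baseline conversion, hoping to detect a 20% relative lift.
n = visitors_per_variation(baseline_rate=0.05, relative_lift=0.20)

scenarios = [
    ("A:B, 1,000/day, 50/50 split", 1000, 0.5),
    ("A:B, 100/day, 50/50 split", 100, 0.5),
    ("eight variations, 1,000/day", 1000, 0.125),
    ("A:B, 1,000/day, 10% to test version", 1000, 0.10),
]
for label, daily, share in scenarios:
    print(f"{label}: about {days_to_finish(n, daily, share)} days")
```

Under these particular assumptions the even A:B split at 1,000 visitors a day lands in the range of two to three weeks, close to the author's figure, but the result is very sensitive to the baseline rate and the size of the lift you hope to detect: a larger expected lift shortens the test considerably, and a smaller one lengthens it.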
Another factor to be considered is which pages are the most common entrance and exit points. An entry page has a high degree of influence over whether the user will continue to interact with the site or bounce out immediately. An exit page indicates where the user left the site, and if it is anything besides the last page of the transaction, it is likely an opportunity to keep visitors from leaving without performing the desired action.
The cost of driving traffic to a page may also factor into its importance. If you are paying for an advertisement that drives visitors to a landing page, it is very important to make sure that page is optimized, as opposed to a page where visitors enter your site from essentially free sources of traffic. However, you should do a careful analysis to determine the cost of traffic - search engine traffic seems like it's essentially free, but if you're paying someone for SEO or buying placement, there are costs associated with this traffic.
The author (finally) acknowledges the Pareto Principle - in that a raw head-count is not always the best approach because a small percentage of your traffic generates the majority of your business (EN: Pareto suggests 80/20, but analyses of customer loyalty suggest it is often more along the lines of 90/10).
Prioritize Easy Test Pages
The author's final consideration is to prioritize pages that will be easy to test. Again, the concept of "easy" is very loose and there are various criteria to be considered. Primarily, it pertains to the time, resources, and costs of executing a test.
- In general, an "easy" test involves a single page, or a few pages in a flow, as opposed to a test that will impact many (or all) of the pages across an entire site.
- A test is also easy if it is a change to content or appearance rather than functionality. If there must be changes on the back-end systems to accommodate a test, the programming takes time and money.
- Pages that are non-dynamic (all users see the same layout all the time) are easier to test than dynamic ones (where the content shown varies by case or even by individual user).
- Single-channel tests (web site only) are easier than tests that impact multiple channels (a transaction started on the Web site may be completed on a mobile device or by telephone, and behavior must be tracked across channels).
- A test that monitors the progress of one goal is easier than a test that monitors multiple goals.
- A test with a loosely defined goal (customer buys any item) is easier than a test with a tightly defined one (customer buys a specific item).
- Pages that cause stakeholders a great deal of anxiety are harder to test because the proposal will need more socialization and sign-offs.
Tests should not be prioritized solely on the basis of their ease: harder, more sensitive, and more technically complex tests may often produce much more important outcomes - but testing programs often have slack times between tests, and having a "quickie" that can be dropped into a space where the next major test is still being prepared enables you to make good use of otherwise lost time.
The author returns to the notion of politics within an organization, as there are very seldom instances in which a tester has carte blanche to fiddle with the website, particularly pages that are generating revenue or supporting organizational goals.
- There is fear that the company may lose revenue, even a little bit, by testing something that may not work, so they may refuse to authorize a test or seek to pull the plug prematurely if the initial numbers don't show improvement.
- There may be considerable investment of ego or credibility in the way things are. Someone who fought to have things done their way will be embarrassed when their opinions are discredited by a test.
- There may be questions of capacity. If the shipping department or customer support is barely able to handle its current load, even a 10% increase could be disastrous.
(EN: The author doesn't indicate how to overcome such resistance, as the internal politics of an organization is highly idiosyncratic and very delicate - so it's ultimately a "caveat" without an action plan.)
Prioritize with a Weighting Table
The author proposes a rather simple way of taking the three factors into consideration: scoring potential, importance, and ease on a ten-point scale and then adding the numbers together.
(EN: The approach here seems a bit facile, and it's likely no simplistic formula can reliably be used. However, when the value of optimization is proven out and multiple departments are competing for limited resources, likely some arbitrary means of determining which tests get priority is necessary.)
The author also suggests that prioritization be a rolling process. Each time a new test is proposed, rate it and re-sort the list of upcoming tests to ensure that highly valuable tests are not made to wait in line behind less valuable ones that were proposed earlier.
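The weighting table and the rolling re-sort are simple enough to sketch in code. The page names and scores below are hypothetical, and the unweighted sum is just the scheme described above; nothing here is meant as the author's actual worksheet.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A proposed test, scored 1-10 on each of the three criteria."""
    name: str
    potential: int
    importance: int
    ease: int

    @property
    def score(self):
        # Plain sum of the three ratings, as described in the text.
        return self.potential + self.importance + self.ease


def prioritize(candidates):
    """Return candidates sorted from highest to lowest total score."""
    return sorted(candidates, key=lambda c: c.score, reverse=True)


# Hypothetical backlog of proposed tests.
backlog = [
    Candidate("checkout step 1", potential=9, importance=8, ease=4),
    Candidate("home page", potential=6, importance=9, ease=5),
    Candidate("about us page", potential=3, importance=2, ease=9),
]

# Rolling process: a newly proposed test is rated and the list is re-sorted,
# so it can jump ahead of tests that were proposed earlier.
backlog.append(Candidate("paid-search landing page", potential=8, importance=9, ease=7))

for c in prioritize(backlog):
    print(f"{c.score:>2}  {c.name}")
```

Because the sort is redone on every new proposal, the late-arriving landing page test moves to the top of the queue instead of waiting behind lower-scoring tests.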