Bad User Research and the Rejection of UCD
Down with user-centered design (UCD). That’s a call I’m hearing these days. Usability testing and iterative development are not a productive way to arrive at good design, says Donna Maurer at OzCHI 2006 in Sydney. A panel at SXSW in Austin this year suggested user research is at best a waste of resources. Users cannot make effective design decisions. UCD can be blamed for over-featured products and stagnating UI designs. John Maeda, the new maestro of simplicity, chides Jakob Nielsen for emphasizing usability testing, saying, “[Use] the Steve Jobs method…. You don’t use focus groups. You just do it right.” Even usability über-guru Don Norman, one of the early proponents of user-centered design (or human-centered design; I don’t think there’s a difference) now warns it leads to incoherent designs or designs suitable only for beginners.
There seems to be a growing countermovement to usability in the form of the Great Designer Theory. You don’t need a design process. You don’t need testing to develop a UI. You hire someone with the right design Magic and blindly obey whatever he or she says. If you have the right designer, he or she will simply Know the right way. The Great Designer doesn’t need any facts from research. That’s what Maeda is talking about, and it’s apparently embraced by the likes of interaction designer Robert Hoekman, Jr. Norman appears to have joined them and, like Maeda, cites the Apple Way: “Apple replaced its well known, well-respected human interface design team with a single, authoritative (dictatorial) leader.”
My first thought was that these people don’t understand UCD or the heavily overlapping discipline of usability engineering. I certainly have experienced people who think I do things like opinion surveys or focus groups to design and evaluate a product. They’re often the same ones who think usability engineering is about prettying up the screens at the end of development. And the Great Designer Theory may at first seem to be true. For example, you can go relatively far as a web site designer following basic design principles and human factors heuristics without involving any users, but only because so much of the web’s usability is abysmal. What does it say about our designs that we need heuristics like “make text legible” and “be sure links go somewhere”? These are not problems that take a usability test to discover. They should be so evident they don’t make it to the first prototype.
As for the Apple Way, I get the impression that because no one has heard of Apple doing user research on OS X or the iPod, they assume that none was done. This is a highly suspect assumption, especially considering the lengths to which Apple has gone to test other things. Alternatively, these commentators may have heard “Apple doesn’t do user research” from top people in Apple (read: Steve Jobs), but I am skeptical of any characterization of Apple’s design process from a source looking for a marketing spin. I’d be particularly skeptical of any characterization directly or indirectly from Steve Jobs that places Steve Jobs at the center. The man is a master of self-promotion, if you catch my drift. In any case, it’s in Apple’s interest to convince us they have the Magic that cannot be reduced to a design process that anyone else can follow. Distortions of their inner workings are not without precedent.
But then I had a much more disturbing thought. What if it’s not due to a misunderstanding of UCD or usability engineering in principle? What if it’s a reaction to UCD and usability engineering as they’re practiced? Maybe there are usability or user experience professionals (or whatever they call themselves) who aren’t doing it right. Maybe there is a dangerous population of practitioners who themselves don’t understand UCD or usability engineering. These misguided souls are misusing usability tools, wasting clients’ money producing mediocre designs, and ultimately discrediting usability engineering as a discipline and UCD as a method. I’ve run across bad practices on occasion, and any profession has a few losers, but what if it’s more than a few isolated incidents? Indeed, what if it’s becoming the central tendency of the field? Don Norman seems to think so.
Et Tu, Don Norman?
No, Norman really hasn’t turned against UCD, but apparently enough have misinterpreted his essay that he’s had to post a clarification. Part of the confusion is due to Norman making a big thing out of what is really a mere shift in attitude or emphasis. Another part is due to Norman addressing what he sees as two problems with a single solution. The first problem is about the limits of his idea of UCD, and the second problem is about usability engineering as it is currently practiced. Let’s look at the first problem first.
UCD and Activity Centered Design
The limit of UCD, according to Norman, is that it doesn’t focus on activities. Activities are not tasks, says Norman:
There is a subtle difference [between tasks and activities]. I use the terms in a hierarchical fashion. At the highest levels are activities, which are comprised of tasks, which themselves are comprised of actions, and actions are made up of operations. (UI Garden, 12/15/05)
In other words, activities are tasks, at least tasks as defined by Ben Shneiderman. The idea of organizing tasks into hierarchies is at least as old as the third edition of Shneiderman and Plaisant’s textbook. Shneiderman calls all the tasks in the hierarchy “tasks.” Thus, activities are merely the tasks at the top of a Shneiderman task hierarchy, although for Shneiderman the top is “intentions,” rather than super-task activities.
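To make the hierarchy concrete, here is a quick sketch in the music-listening domain that comes up later in this article. Every label in it is my own invention for illustration; neither Norman nor Shneiderman gives this example.

```python
# A hypothetical decomposition in Norman's four terms:
# activity -> tasks -> actions -> operations.
# All labels are invented for illustration.
hierarchy = {
    "listen to music on the go": {                  # activity
        "build a playlist": {                       # task
            "add a song": ["scroll to the song",    # action, then operations
                           "press select"],
        },
        "play the playlist": {
            "start playback": ["press play"],
            "adjust the volume": ["rotate the wheel"],
        },
    },
}

# In Shneiderman's usage, every level below is simply a "task";
# Norman gives each level its own name.
for activity, tasks in hierarchy.items():
    print("activity:", activity)
    for task, actions in tasks.items():
        print("  task:", task)
        for action, operations in actions.items():
            print("    action:", action, "| operations:", ", ".join(operations))
```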
I’m happy that Norman is recognizing tasks. It’s true that tasks were given little coverage in Norman’s Psychology of Everyday Things (POET), which outlined his version of UCD. And I certainly want to encourage Norman and others to explore new theoretical directions in order to advance usability engineering. I mean, consider the impact Norman made by drawing inspiration from Gibsonian cognitive psychology in writing POET; he might be doing the same again with inspiration from Activity Theory.
But before getting all excited by tasks and branding a whole “new” way of doing design, I wish he’d thought to check with his pal Jakob Nielsen. Under a heading of “Know the user,” Nielsen has written:
A task analysis is extremely important as early input into the system design. The users’ overall goals should be studied, as well as how they currently approach the task, what their information needs are, and how they deal with exceptional circumstances or emergencies…. We should not just analyze the way users currently do the task, but also the underlying functional reason for the task: What is it that really needs to be done? What are merely surface procedures that can, and perhaps should, be changed? (p. 14)
That was in 1992 (Nielsen, J., “The usability engineering life cycle,” Computer, 25(3), March 1992, pp. 12-22). Designing for tasks is nothing new but rather part of the canon of usability engineering. User research is supposed to include the tasks.
Maybe Norman is right and we should move a step up in the task hierarchy and look at the bigger picture. Frankly, I think it’s quite an insight to recognize that you need to take a broader perspective if you intend to bring about a truly revolutionary product. As a usability engineer, you could focus on making it easier to select a song to play from a portable CD player, but when you’re finished you still have just another portable CD player.
On the other hand, if you want to revolutionize the entire way people listen to music, then you need to look at the entire music-listening activity. Do that, and when it comes time to actually design the product, you’ll find you’re doing much more than making a new user interface. You need to make a new system. There will be multiple components, and each one that comes in contact with a user will benefit from UCD.
As described in Stephen Levy’s book The Perfect Thing, the iPod was a success not only because it had a shuttle or because it looked nice. It succeeded because of Apple’s effective and usable combination of a number of technical advances: high-capacity miniature hard drives, the MP3 codec, FireWire, and broadband internet. It succeeded because of established racks of iTunes servers with usable web pages to provide the music. It succeeded because of usable iTunes software on home computers to store, organize, and quickly transfer music to the iPod. It succeeded because an aggressive marketing campaign convinced consumers they wouldn’t look like dorks walking around with white earphones in their ears. It succeeded because it succeeded, which encouraged third-party vendors like Bose to build docking stations, taking the user experience from personal to social. That’s what it takes to make a revolution. An effective user interface is a critical but small part of the overall effort.
By definition, a revolutionary product is one that changes the way we live. So Norman is right that if you want to predictably create a revolution, you need to design how people will behave differently because of the new technology. I suppose you can call such behavioral changes “adaptation,” but Norman then concludes that UCD can be harmful to such revolutions because it has the goal of reducing human adaptation to the machine. Non sequitur. In the case of a revolutionary product, users do something new that they like to do but couldn’t do (at least not easily) until they had the right technology. There’s nothing in UCD that will prevent that. Indeed, by making technical advances easily usable, UCD improves the chances of it happening. The sort of adaptation that UCD fights is users doing something they don’t like to do, or can’t do very easily, but have to because of product design. That’s an entirely different kind of adaptation.
In any case, it’s not always your job to make a revolution. Sometimes you just have to improve an ol’ CD player. It’s good advice to focus on higher levels of task abstraction on the occasions when a revolution is in the making, but that’s an adjustment to classical usability engineering, not an entirely different design approach. This is something Norman has made clear himself, saying, “I consider activity [centered design] to be a refinement of [UCD].”
User Research Gone Astray
Whatever the merits of Activity Centered Design in improving UCD, it’s clear that Norman thinks we’re doing it wrong. He sees there is:
- Too much attention on individual pages (or windows) without considering the flow.
- Uncritical acceptance of feature and design suggestions from users.
- Research wasted on irrelevant details about the users.
- Not enough research on tasks (both at the activity level and in the minute step-by-step details).
And Norman isn’t the only one. Larry Constantine apparently is also seeing users making design decisions. The anti-research panel at SXSW this year was evidently based on the premise that usability professionals are acquiescing to user design requests in the name of UCD. The Cooper Newsletter warns that some user researchers are getting carried away with unnecessary details about their users. Dan Saffer at Adaptive Path thinks much research is a waste and pushes out good design.
This is scary stuff. It’s as if we’ve spent so many years designing static web pages, where users pretty much just sit and stare passively at the screen, that we’ve forgotten that real applications are vastly more interactive, and that there’s more to a UI than the page layout and the menu. It’s as if a bunch of usability professionals heard the phrase “user research” and thought it meant “learn all you can about the user personally,” the result being that tasks get the short end of the stick. If this is what’s happening, then we have a big problem.
There is a solution. It’s called usability engineering. As the old quote from Nielsen above indicates, user research has never really meant researching only the user. There are two other things you need specified to design an effective user interface: the task and the environment.

In my experience with personal computer interfaces, the task is the most important because the environment and user are fairly “standard.” We take for granted that our users are educated and fairly affluent adults of our own general culture who are visually capable, with reasonably good typing, literacy, and mousing skills (and no, just because your web site passes some minimal accessibility standards does not mean you’re designing for the visually impaired; there’s a difference between accessibility and usability). On the other hand, if your users are children from cultures all around the world, then you had better research them a whole lot more. In most UI design work, it can be assumed the user’s environment involves being comfortably seated in front of a large color screen with proper lighting and HVAC, and with little competition for attention and no more than the usual social and time pressures. On the other hand, if your environment includes a depressurized aircraft encountering a severely turbulent storm at 25,000 feet with one engine surging and vibrating, then you had better focus on the environment more (and did I mention the cockpit is starting to fill with smoke?).
We call it “user research,” but it is more than learning about the user as an individual. It’s just that the user is often the best source of information about the task and the environment. If you want to understand what the users want to do, where, and why, then find out from them directly. Don’t infer task and environmental attributes from user attributes. For example, don’t figure that because you have many rural users, they’re more likely to have dial-up than broadband. Find out from your users whether they have dial-up or broadband (and if you can’t find out from them, then by all means find out from somewhere else). That’s what Nielsen meant by “know the user”: he included in it knowing the task and the environment.
When done properly, user research produces information about the user, the environment, and the task, and often the majority of the information collected is about the latter. If user research for a project does not produce environmental and task information, then it’s bad research. I fear that’s exactly what most research is these days.
Research versus Design? Research is Design?
So, logically, because there’s bad research being conducted, we should give up on research, right? Do less research and more design is the message from Donna Maurer. We need design-centered design. Save us, O Great Designer.
That’s the wrong solution, but more research (even good research) and less design is a bad idea too. An apparent inclination to pit researchers and designers against each other fails to recognize that both are essential. Both have a role and neither is a replacement for the other. The entire idea that research and design are alternatives suggests another fundamental error in the practice of user research today, and that is letting the user do the designing.
Proper user research, whether it’s up-front research early in the development cycle, or usability testing after prototypes are constructed, is never about letting the user design the product. There are sometimes very good reasons to include subject matter experts on a development team, and once in a while one user out of many may have a useful design insight that is worth capturing. However, that doesn’t mean the design work should be delegated to users, with the usability professional acting as a simple conduit.
Users are experts on the task, the work environment, and themselves, but they generally do not have the technical knowledge or training to be designers. They don’t know the reasonable alternatives or how to evaluate them. When they do make design suggestions, the suggestions are often impractical or unhelpful if actually implemented with current technology. More often, the ideas are unimaginative and too close to current conventions. My former boss tells the story of years ago when one of the first computer-controlled electric locomotives was being developed. They brought in a crack team of top train engineers to design the cab. Clean-sheet design! Electronic displays! Fly-by-wire controls! Anything you want! And when the train engineers were done, they had the best steam locomotive cab ever made.
Certainly a researcher may ask users what features they want in a product, or what things they like or don’t like about a prototype, but that’s only the starting point for a full inquiry. Why do they want such-and-such a feature? What purpose does it serve? Now you’re learning about the goals of the task. Why don’t they like the prototype? Do the information categories conflict with their own taxonomies? Now you’re learning about the user’s mental model of the task. User research isn’t for getting designs. User research is for measuring or predicting human performance for a given design.
I don’t know how it came to pass that users would be designing products, if that’s what’s happening. Maybe some usability professionals were confused by the UCD adage to “involve the user in the development process,” and didn’t understand how to properly involve the user. Maybe it started with early web pages, where the technical and design issues were minimally complex, and, in fact, a user actually had a better chance of designing a usable web page than some of the random people who were at that time making sites without knowledge of either the task or design. At least a user knows some of the necessary information, which is more than, say, someone from marketing who wants to take a crack at this new web thingy.
Research and Design > Research + Design
Good designers are crucial. Research can point out the problem, but it won’t tell you the solution. All the research in the world means nothing if you don’t have the people with the background and imagination to come up with potential solutions. Then it’s up to research again to test potential solutions and return the data to the designer as a new problem to solve.
Also, while I always feel best when the superiority of a final design is verified by human performance data, I wouldn’t necessarily say that such research-supported results are essential for all design objectives. A design typically has multiple goals, and it may not always be possible or practical to empirically test its effectiveness on all of them. Or even if all goals are tested, they may return conflicting results: Design A is easier to learn, but Design B is faster to execute once it is learned. With imagination and luck, a designer achieves superiority on all goals with a single design, but performance tradeoffs among competing designs are common and unavoidable. There is always a need for human judgment to decide such tradeoffs, and that is a judgment to be made by the designer.
But the designer should be making an informed judgment. You need good designers, but you also need good research. Thomas Edison estimated that genius is 1% inspiration and 99% perspiration. Thinking up imaginative ideas is hard, but even harder, in raw worker-hours, is acquiring the necessary factual background to know which ideas to think up. Harder still is the selecting, shaping, and discarding of ideas until you have something that actually works in the real world. That only comes about through research and testing.
It was brilliant that the Wright brothers figured out that the key to steering an airplane was roll control, but that creative insight came only after heavy research and hands-on study of aerodynamics. December 17, 1903 was not the first time they tried wing warping. They spent four years testing it in gliders, the first ones wisely unmanned. Only after multiple iterations did they get it right. User interface development works the same way, only on a smaller scale. You do your research, then you get the ideas, and then you test (on real users), evaluate, refine, and discard until you have something that performs.
Here are some examples of how it’s supposed to work:
Problem 1: How do you implement Feature X?
Wrong: Mock up a screen, explain it to some users, show them the control for Feature X, and ask: Is this a good label? What would you call it? Is this a good place to put it? Do you like blue or green better?
Right: Interview users on Goal A, which will be accomplished through Feature X. Note what words they use to describe it. Look through their materials to see what codes they use. Diagram the task to see where use of Feature X fits in the sequence of sub-tasks and how it relates to the users’ goals. Check usage logs to estimate the feature’s frequency of use (a sketch of this check follows this example). Take all this data and then make a mockup or prototype that accounts for the work frequency and flow. Do a proper usability test where users handle the prototype and you ask, in some fashion, “To accomplish Goal A, what would you do?” Watch and note the following: Do they find the control for Feature X? Do they seem to understand what it means? Do they use it correctly? Do they use it when they should? Interview them about any rough points to understand the reasons. Redesign, lather, rinse, repeat.
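For the usage-log step, the bookkeeping is simple enough to sketch in a few lines. Everything specific here is hypothetical: the log format, the file name, and the action names are invented for illustration, since no particular system is specified.

```python
import re
from collections import Counter

# Hypothetical log format, one user action per line, e.g.:
#   2006-11-02 09:14:31 user=42 action=feature_x
ACTION = re.compile(r"action=(\w+)")

counts = Counter()
with open("usage.log") as log:  # invented file name
    for line in log:
        match = ACTION.search(line)
        if match:
            counts[match.group(1)] += 1

total = max(sum(counts.values()), 1)  # avoid dividing by zero on an empty log
share = 100.0 * counts["feature_x"] / total
print(f"feature_x: {counts['feature_x']} uses ({share:.1f}% of all actions)")
```

A frequency estimate like this is exactly the kind of data that tells you how much screen real estate, and how few clicks, Feature X deserves in the mockup relative to its neighbors.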
Problem 2: What graphic code should be used for the values of Attribute Y?
Wrong: Ask users what the code should be. They answer “Red, yellow, and green. We’ve always used red, yellow, and green.” Implement it as #FF0000, #FFFF00, and #00FF00.
Right: Look at how Attribute Y is used in the existing system. Note that it’s currently coded #FF0000, #FFFF00, and #00FF00. Learn that these attribute values have a ranked meaning, with red being more severe than yellow, which is more severe than green. Note that the color coding appears to be highly effective for users because they have to scan large graphic displays to pick out color-coded items, namely either the red items or both the red and yellow items, depending on the specifics of the task in that instance. Recognize that, because of other design considerations for the new system, the colored items are to be shown on a white background, so it is important that red and yellow stand out more than green. See that the users have meetings where they bring printed screen captures of the graphic displays to distribute and discuss in groups. Notice that the printer is black and white, and that this yields gray scales ranked red – green – yellow, counter to the order of the attribute values.
Observe that this inversion of yellow and green in the gray scales causes some confusion in the meetings. Recall that about 7 to 8% of male users have some form of color blindness. Accept that, for too many reasons to cover here, much of the task for the new system has to be, or should be, the same as in the old system: in this instance, a large graphic display is still the best way to present all the information, and users will still have to go to meetings with black-and-white printouts under their arms.
By calculating L*u*v* values, determine colors that are highly distinct from each other and for which red and yellow contrast more against white than green does. Add a blue tint to the green to aid perception for partially color-blind users. Verify that the colors have gray scales (whether from printing or due to color blindness) ranked consistently with the attribute values (see the sketch after this example). Ultimately, implement the values as shades #E00000, #FF8400, and #C0FFE0.
Consider including redundant shape codes. Test a mockup on users using both screen shots and black-and-white printouts: Do they quickly learn to see your “green” as green and your “yellow” as yellow? Can they quickly sort out the reds and yellows? Does the ranking of the shape codes at least not conflict with the users’ expectations? Evaluate, redesign, lather, rinse, repeat.
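The gray-scale verification mentioned above lends itself to a quick automated check. One caveat: the process described calls for proper L*u*v* calculations, while the sketch below substitutes the cruder Rec. 601 luma approximation just to show the shape of the check, so treat the weights as my assumption rather than the method described.

```python
def luma(hex_color):
    """Approximate perceived brightness (0-255) using Rec. 601 weights.
    A crude stand-in for a proper L*u*v* lightness computation."""
    r = int(hex_color[1:3], 16)
    g = int(hex_color[3:5], 16)
    b = int(hex_color[5:7], 16)
    return 0.299 * r + 0.587 * g + 0.114 * b

# Palettes listed from most to least severe: red, yellow, green.
palettes = {
    "old": ["#FF0000", "#FFFF00", "#00FF00"],
    "new": ["#E00000", "#FF8400", "#C0FFE0"],
}

for name, colors in palettes.items():
    lumas = [luma(c) for c in colors]
    # Severity should map to darkness on paper: luma strictly increasing
    # from red to green means the black-and-white ranking matches severity.
    ok = lumas == sorted(lumas)
    print(name, [round(v) for v in lumas], "consistent" if ok else "inverted")
```

Run it and the old palette comes out inverted (yellow prints lighter than green), while the new one comes out consistent, which matches the confusion observed with the meeting printouts.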
Problem 3: User says a window showing a table of data objects absolutely must have the ability to sort the objects on Attribute Z, just like the legacy application does.
Wrong: Add the sorting feature.
Right: Ask them why. Have them walk you through the steps of using the sort feature in their task. See notepaper with numerical notes. Note that they use the paper and the sorting feature to count the number of data objects with each value of Attribute Z, to see if the counts are approximately equal. See how they add and remove objects and change various other related attributes, then re-sort the list to re-assess the number of objects with each value of Attribute Z. Do not implement the sorting feature (at least not for this purpose). Instead, implement a “summary pane” that gives a running readout of the number of objects with each value of Z (a sketch of such a pane follows). Consider highlighting Z values with outlier numbers of objects. Prototype, test, lather, rinse, repeat.
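Here is a minimal sketch of the bookkeeping behind such a summary pane. The class name, method names, and the outlier rule are all invented for illustration; nothing here comes from any actual application.

```python
from collections import Counter

class SummaryPane:
    """Running readout of object counts per value of Attribute Z."""

    def __init__(self):
        self.counts = Counter()

    def object_added(self, z_value):
        self.counts[z_value] += 1

    def object_removed(self, z_value):
        self.counts[z_value] -= 1
        if self.counts[z_value] <= 0:
            del self.counts[z_value]

    def readout(self):
        """Counts per Z value, ready for display."""
        return dict(self.counts)

    def outliers(self, tolerance=0.5):
        """Z values whose count strays from the mean by more than the
        given fraction; candidates for highlighting in the pane."""
        if not self.counts:
            return []
        mean = sum(self.counts.values()) / len(self.counts)
        return [z for z, n in self.counts.items()
                if abs(n - mean) > tolerance * mean]

# Usage: drive the pane from the same events that modify the table.
pane = SummaryPane()
for z in ["high", "high", "high", "medium", "low"]:
    pane.object_added(z)
print(pane.readout())   # {'high': 3, 'medium': 1, 'low': 1}
print(pane.outliers())  # ['high'] with the default tolerance
```

The point isn’t the code; it’s that observing the task (counting by hand on notepaper) reveals a need that the requested sort feature served only indirectly.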
That’s how research and design are supposed to work together.
Misusability
Do you work with misusability engineers? Are they doing bad research and delegating design to the users? Take this test:
- Do they use focus groups and take their raw output as the feature list for the next release?
- Do they say things like, “Twenty percent of our users are 55 or older, so we have to design for people who are technophobic”?
- In looking for a design vision, do they go to users and ask questions like, “Would you like something simple, like Google?”
- Is their reason for every design decision, “That’s what the users want,” or “That’s what the users are used to”?
- Do they tell you the archetypal user has a Corgi named Ajax, but neglect to mention the user is likely to use the product while dragging a roll-away suitcase through an airport (i.e., the product should be fully operable with one hand)?
- Do they want physical representations of the users to “sit in” on design meetings, but can’t tell you clearly and succinctly why each type of user would want to use the product?
- Do they use forced-choice surveys of users to pick among competing designs?
- Is their idea of a “usability test” to show PowerPoint slides of the new UI and ask users, “so how do you like it?”
- And when the loud-mouth in the back of the room says, “Use bright blue on deep red for that text,” they reply, “Sure”?
- Are they doing only the above but claiming to do user-centered design?
For every item above you answer yes to, add one point of misusability. Subtract the sum of the points from the logarithm of the number of times they perform good research and design as described earlier. Divide by pi, then throw the number in the wastebasket, and let’s take an honest look at ourselves. Does our research and testing really provide the right information to inform design decisions? Does it help us measure or predict human performance for the designs we are or will be considering? Are we really doing user-centered design?
Summary Checklist
Problem: Usability professionals waste research on learning irrelevant user attributes and delegate design decisions to the users.
Potential Solution: Basic old-school usability engineering and UCD.
- Research users to learn about the users, the use environment, and, most importantly, the task.
- Conduct usability tests where users interact with the prototype to accomplish a given task, while you measure their performance and document their problems.
- Use research results to inform design decisions, not replace design decisions.