This is the third in the series of Learning from Lusers.
Sméagol… found that none of his family could see him when he was wearing the ring…. He used it to find out secrets, and put his knowledge to crooked and malicious uses…. The ring had given him power according to his stature. It is not to be wondered at that he became very unpopular and was shunned…. The [Ring] was eating up his mind, of course, and the torment had become almost unbearable… He hated [the Ring] and loved it…. He had no will left in the matter (J. R. R. Tolkien, Fellowship of the Ring).
Gollum was a luser. He aspired to own a piece of powerful technology but it ended up owning him. Unable to make the One Ring follow his will, he instead had to submit to the Ring’s will. To be fair, the true effects of the One Ring were not apparent in the UI. Oh, sure, there was on-line help, but it was in the obscure Black Speech of Mordor and was only accessible by counter-intuitively throwing the ring into a fire (do not try that with your iPhone). Even then, the help was limited to the About box and didn’t specifically say, “Use of this product will lead to enslavement, criminal behavior, banishment, withdrawal, mental illness, dancing into volcanoes, and a culinary preference for goblin babies.” The powers of the One Ring could not be anticipated from the ring itself, by either the analogy heuristic or the proximity heuristic. The One Ring was magical, which means it had arbitrary and inscrutable powers.
Any sufficiently advanced technology is indistinguishable from magic (Arthur C. Clarke).
Advanced technology is magic, capable of doing things that are extraordinary for the time and place in which they are introduced. Being extraordinary is what makes advanced technology “advanced.” However, by being extra-ordinary, by doing the unexpected, advanced technology becomes magic: apparently arbitrary and inscrutable to the users. This makes advanced technology hard to learn.
Adding to the experience of magic is automation, the essence of computers, where a device does things without direct intervention by the user. By taking work from the user, automation reduces the user’s awareness and control of a device’s inner deterministic mechanism. But while such a reduction in awareness reduces the user’s workload, it also reduces the user’s understanding of what is happening, further interfering with the user’s ability to learn the technology. Software evaluates the environment and executes variable actions. Like the One Ring, it seems to have a will of its own.
Manager complains that his department’s printer isn’t working…. As I watch, he selects the printer in his office. I ask him why he’s selecting that if he wants it to print on the department printer. His answer: “The printer in my office is out of ink, so it should print on the other printer.” He had a hard time understanding that his PC isn’t smart enough to automatically go find the printer that has ink. (Shark Tank)
Well, why isn’t the PC smart enough? Users are given the idea that computers are “thinking machines.” How much thought does it take to route a print job to another printer when the first one isn’t working?
Compounding the magic of automation in the case of electronic technology is its telekinetic ability, a capacity to act immediately over vast distances with no mechanical connection. Signals can be sent around the world to and from the user’s home. Turn on your computer in a café, and lo, all the world appears before you as if in a crystal ball. This telekinetic ability is a property we otherwise associate with magic spells and enchantments. It also makes technology hard to learn, because things happen too fast, too far away, and too small to see, even if any of it were visible to the user.
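The fallback the manager expected really is not much logic. Here is a minimal sketch in Python, assuming a hypothetical Printer record with has_ink and online flags; real print systems report status in their own ways, and none of these names come from an actual API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Printer:
    """Hypothetical printer record; the fields are assumptions for illustration."""
    name: str
    has_ink: bool
    online: bool = True


def choose_printer(preferred: Printer,
                   fallbacks: Sequence[Printer]) -> Optional[Printer]:
    """Return the preferred printer if it can print, else the first usable fallback."""
    for printer in (preferred, *fallbacks):
        if printer.online and printer.has_ink:
            return printer
    return None  # nothing usable; the user needs to be told


office = Printer("office", has_ink=False)
department = Printer("department", has_ink=True)

chosen = choose_printer(office, [department])
print(chosen.name if chosen else "no printer available")
```

The catch is that rerouting silently would itself be magic: the user would then have to learn why a job sent to one printer came out of another.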
[IT technician] gets another frantic call from the admin: no Internet [so he] calls the ISP. It’s more spam complaints…. One of the ISP’s senior techs… tells fish that he’s rechecked the spam complaints, and they’ve all come from one e-mail address: the personal e-mail account of an employee at the office. The system is set up so that most, including this gentleman, have their e-mail forwarded to their home accounts from the company server. This individual, instead of deleting the e-mails he no longer wanted, was reporting them as spam. (Shark Tank)
Chances are the user above didn’t realize he was reporting the email as spam. More likely, he was simply hitting a button to classify the email as spam as a convenient way to get it out of his inbox without deleting it. And perhaps as far as the user was concerned, the email was spam, mostly broadcasts he had little use for that the company sends to a large distribution list (stuff like the cafeteria’s menu for the coming week).
The problem was the user didn’t know the magic: that moving an email letter to the spam folder automatically sends a message out to one’s ISP. In designing apps, we try to make software intuitive by exploiting the analogy and proximity heuristics, in this instance, by making email appear like physical mail. But there’s only so far this can take us.
In the above case, the analogy heuristic failed to predict the effects of categorizing email as spam: moving email to other folders doesn’t send any messages. Certainly moving a physical letter to a folder can’t do that. Reporting something requires some kind of communication, but the user hadn’t written any email complaining about spam. The proximity heuristic also fails. The spam folder is in an email client on the user’s personal computer, probably spatially and temporally not proximal to the ISP. I suspect there was no representation of the ISP at all, no visible connection to the inbox or the spam folder.
Magic and the Power-Learnability Trade-off
Magic is hard to learn, something Xerox PARC researcher Randall Smith discovered 20 years ago as he was developing an early direct-manipulation user interface. Whenever the behavior of the system deviates from the analogy it’s patterned after, learning takes one or two orders of magnitude more time.
But Smith also discovered that sometimes you have to have magic, because magic means power. Sometimes you have to break the analogy to give the users the functionality the analogy cannot provide. Sometimes you will want to automate or have things act quickly over great distances in order to save the user the work of closing space and time.
(For more on Smith’s research, see Smith, R. B. (1987) Experiences with the Alternate Reality Kit: An example of the tension between literalism and magic. Proceedings of CHI+GI 1987 Conference on Human Factors in Computing Systems. New York: ACM, 61-67.)
Cooper and Reimann in their book About Face go further: you must always have magic. If your app isn’t at least a little magical, it will actually be worse than the technology it intends to replace. Imagine if we made email even closer to its physical mail analogy. Imagine that to send a letter, rather than simply clicking the Send button, the user had to drag out an “envelope,” drag and drop the letter in the envelope, seal the envelope and stamp it, drag it up to a “mailbox,” don’t forget to flip up the flag, then wait a day or two for the letter to arrive as servers deliberately slow down the transmission to be metaphorically consistent with snail mail.
What would be the point of using email? Sure, it might be more intuitive to someone who has never used computers before, but it’s merely a simulation of physical mail, requiring all the work but with low fidelity representations of the original environment (e.g., a single mouse pointer has replaced two hands and ten fingers). This makes any digital duplication of physical work necessarily harder than the physical work it replaces, especially when it comes time to lick those stamps.
If duplicating old technology with new makes work harder, says Cooper, the only way to make new technology better is to provide new capabilities the old technology doesn’t match. But that means breaking the analogy with old technology. It means having magic.
Given a certain trade-off between learnability and power, the first step in designing the UI for a product is to consider how much magic to have. The answer depends on the situation. More magic for products with good support and training. Less for those without. More magic for products used frequently, where the user is more likely to have a chance to learn about the magic, and the effort for learning will pay off. Less for a product used infrequently. More magic if the cost of error can be minimized. Less if the user had better use it right the first time or lives might be lost.
Once the level of magic is decided on, you can design the UI to get the maximal power out of that level of magic. To some degree, there is always a power-learnability trade-off. However, it is not always a zero-sum trade-off, and with careful design one can get superior learnability for a given level of power. A key step to such careful designing is understanding how users respond to magic.
The Magic Heuristic
Given that at least some magic is always necessary, where does this leave the users? It doesn’t take them long to figure out that advanced technology is magic, that it regularly responds to inputs by doing arbitrary things on its own over remarkable physical distances and time intervals. This leads users to form a third heuristic to use to understand computers, in addition to the analogy heuristic and proximity heuristic. It’s the magic heuristic, which simply states that advanced technology is arbitrary and inscrutable: advanced technology like computer software does things just because it does. It does not obey the rules that other things obey. It doesn’t have to make sense. Computers don’t make sense. Things work that way just because they work that way.
Support [personnel] takes a personal call from an elderly friend who can’t get her e-mail working correctly. A few step-by-step instructions later, [the support personnel] receives a test message, but he tells elderly user that he sees the problem: All the words in the test message are running together, making it almost impossible to read. User: “Oh, my sister said to leave all the spaces out.” (Shark Tank)
Well, you can guess where that came from. Sister gives the user her email address, “Polly Poweruser 42 at hot mail dot com,” and emphasizes she should “leave the spaces out, otherwise it won’t work,” and somewhere that rule for the email address was confused with the body of the letter. Why would anyone think that you can’t put spaces in an email? Well, why wouldn’t they? It makes about as much sense as any other arbitrary limits of email, like if you close the window before you send it, it won’t be there when you open it up again.
I call the magic heuristic a “heuristic,” but it’s a non-heuristic heuristic, a Zen heuristic, if you will, because while users believe it, it is useless for guiding interaction with computers. It’s just something the user has to accept. Sometimes, it’s something we as designers might even exploit.
I realised that 90% of the calls [I received for tech support] related to network errors…. For each error, the guys in IT Central would issue a patch. The patches, surprisingly, usually worked. Unfortuately [sic] for me, they had to be manually installed on each workstation. This meant that every time the network configuration was “adjusted”, and a patch issued, I would be guaranteed to receive around a hundred calls, starting the moment I walked in the door. Then I had a brainwave…. I snuck into the user settings location on the disk and installed a new icon on everyone’s desktop one night. This icon linked back to a batchfile on the server that would run the latest patch or three. I titled the icon “The Magic Button.” For the next two weeks, whenever I took a call or visited a user’s desk, I would make sure to refer to The Magic Button, and ask if they had tried it. In 90% of cases, it would fix whatever problem they were having. Suddenly, I had a lot of free time on my hands. (Tech Tales – It’s Magic!)
The capacity of designers to exploit the magic heuristic as described above is much more limited than the capacity to exploit the analogy and proximity heuristics. Explicitly invoking the concept of magic is not going to help the user select an action. On the contrary, describing something as magical is a barrier to action, not a guide, indicating, “There be great powers here that defy understanding. Do exactly as instructed, but venture no further.” For example, don’t go messing with the magic.bat. If you follow the link above and read the story to its end, you’ll see how the magic heuristic is used to prompt the users to ask for help from a “wizard,” rather than to try something on their own.
Learning the Spells
If users believe computers are arbitrary, how can they learn to use them? Simple. Rote memorization. Do this to get that. Do that to get these. Why? No “why.” It just works that way. Have a problem? Double-click the Magic Button. Commit this magic spell to memory, or failing that, write it down, preferably in Elven.
It’s the requirement to memorize things by rote that makes magic harder to learn than things aided by the proximity and analogy heuristics. We, as experienced users who long ago memorized the idioms of GUIs, don’t realize how much of it is arbitrary.
If you introduce someone to the concept of “right-clicking”, they will be confused for the remainder of the [tech support] call and ask you to specify “right-click” or “left-click” whenever you ask them to click on something. (Tech Tales)
When receiving a string of instructions over the phone, what does right-click versus left-click mean to a user? Why do right click then left click? Isn’t it just some part of the incantation, like wave your wand right then left? Even if the user figures out the general effects of right versus left clicking (and calling it “right-click” and “left-click” doesn’t help), there’s still an element of arbitrariness that’s hard to remember. Why is right click for getting a menu and left click for selection, rather than the other way around? There is no particularly good answer, so just memorize it: left for this, right for that.
The potential confusion that can come from a two-button mouse was a prime reason the Apple Macintosh had a one-button mouse, but the potential power of a second button soon seduced designers into providing some magic at the expense of learnability. And a great power it is: I wouldn’t want an app without context menus; they’re so convenient and fast.
Rules Made and Broken
There are other examples of learnability being traded for magical power. Accelerators are a boon to efficiency, but why is Ctrl-Z Undo? Why does the icon of a floppy mean Save instead of Retrieve? Toolbar icons are so notoriously hard to interpret and remember that one study found that users identify toolbar items more by memorizing the location than by their iconic labels (see also Engel, F.L., Andriessen, J.J. & Schmitz, H.J.R. (1983) What, Where and Whence: Means for Improving Electronic Data Access, International Journal of Man-Machine Studies, 18, pp. 145-160). How is it possible that a folder can contain more folders, which can contain more folders, far more than could ever fit in a physical folder? How can a folder contain an inbox or printers or users, except that it has to be a special kind of folder? Then there’re shortcuts or aliases. Huh? And what’s up with double-clicking?
[The user complains,] “I have some problems with my printer! …It allways [sic] prints twice, no matter what I do!… I take the mouse like this, and I open the W like this (manages a fairly OK double-click on a MS Word 2000 shortcut on the desktop), and then, I write a letter, and I print it, like this (drags the mouse up to the printer icon and doubleclicks)” (Tech Tales)
Double-clicking to do the default action for a data object is another arbitrary little secret to be passed down from wizard to apprentice. It could just as easily have been Ctrl-click, or (if they had asked me) clicking down both mouse buttons at the same time. However, there was at least structure to the idiom: double click a data object to get the most common action, usually to open it. Once the user learned how to double-click, at least it was easy to apply from then on.
This is one helpful method of handling magic. If the users have to memorize something, give them one thing that works all over the place. A user interface with only a few rules for interaction will be much easier to learn than a user interface where the steps for each task must be memorized in isolation. Double-clicking is an example of such a rule.
Well, that’s the way it used to be. The problem is distinguishing a data object from other UI elements. At one time data objects had distinct appearances from command controls like buttons: they had an icon. Indeed, in the early GUIs, such as Motif and IBM’s Common User Access, “icon” was essentially synonymous with the representation of a data object such as a window or file. If you saw something with a little picture, it was a data object and it took a double-click to get an action executed.
However, along came toolbars, and because the original toolbars were used to control visual attributes (such as bold text), they used pictographic labels, or “icons.” Suddenly, the user was faced with two things that look the same (data objects and toolbar controls), have the same name (“icons”), but have different reactions to user inputs, one taking a double click to get an action, and the other taking a single click. For a while, users might have had the chance to use the position in the window as a clue to the type of the icon. If the icon was up by the menu, it took one click, while if it was down in the work area, it took a double click. Anytime you add a conditional to a rule (“if an icon is up by the menu…”), you make it harder to learn or remember. As you keep adding conditionals, eventually the scope of the rule is so small that the user is back to memorizing the steps for a specific task, and learning becomes very difficult. In the case of icons, having only one conditional might have offered some hope that all users could learn the rule.
However, then came web pages and their links. Now there were UI elements in the work area, including icons and other imagery, that reacted to a single click. There was also the occasional app that used icons (sometimes only icons) as the labels for its command buttons in the work area. So here’s the rule: single click if it’s up by the menu bar (or could be any margin), or it’s in a web page, (or Help, or anything else like a web page), or if it has a beveled border, or some kind of border, depending on your theme, but double click if it is an icon in the work area, but not a web page, or if it’s on the desktop, unless you use Active Desktop, or, uh, where was I?
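To see how far the rule has drifted, it helps to write both versions down as code. The following Python sketch is only an illustration of the rule as described above; the function and parameter names are hypothetical and do not describe any real toolkit.

```python
def clicks_original(is_icon: bool) -> int:
    """The early rule: anything with an icon is a data object and takes a double click."""
    return 2 if is_icon else 1


def clicks_today(is_icon: bool,
                 near_menu_bar: bool = False,
                 in_web_page: bool = False,
                 has_button_border: bool = False,
                 on_active_desktop: bool = False) -> int:
    """The accumulated rule: every added conditional is one more thing to memorize."""
    if not is_icon:
        return 1
    if near_menu_bar:       # toolbar icons
        return 1
    if in_web_page:         # links, Help, anything web-like
        return 1
    if has_button_border:   # icon-labeled command buttons
        return 1
    if on_active_desktop:   # desktop icons that behave like links
        return 1
    return 2                # a plain icon in the work area or on the desktop


print(clicks_original(is_icon=True))
print(clicks_today(is_icon=True, in_web_page=True))
```

The first function has one input the user can see. The second depends on four more conditions, most of which are not reliably visible on screen, which is why users give up on the rule and memorize cases instead.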
Modern GUIs have evolved to eradicate the entire meaning behind double-click versus single click, and the visual distinction between data object and command. The cues are too subtle and the rules have become too complicated. Many users are not going to effectively discover the rule on their own, and teaching them is too difficult. Users are left to memorize not a single rule, but the case-by-case individual steps for each task. Or, as our example above demonstrates, just double click everything. I know one user that habitually double-clicks links in web pages. Ask such users, why do you double click everything in a computer? Why isn’t a single click enough to say, “do this?” They say, “I don’t know. Sometimes single click doesn’t work. It’s a computer. It just is.”
Magic means memorization, but the memorization can be made much easier if the user can just remember a simple rule. The best rule involves an association between a clear stimulus and a response that can be applied throughout the computer, like there used to be for icons and double-clicking. Teach a user the rote steps for a task, and you’ve helped them for a day. Teach them a rule, and you’ve got them off your back for a week or more.
Consistency
Having simple but powerful rules is the first step. Next, you have to follow the rules, and that means consistency. However, consistency is something we don’t consistently have in our UIs. In addition to the memorization that is inevitable with magic, we also force unnecessary memorization due to inconsistent interfaces. It’s magic without any magic, extra work with no benefit of greater power.
The New command replaces the current document with a new document, except if it’s a Multiple Document Interface, or if it’s a web browser. In MS Office, clicking the Close button on the title bar of Word closes only the current document, but clicking the Close button in the title bar of Excel exits the process, closing all associated documents. These are subtleties a user is unlikely to figure out. Instead it all appears arbitrary.
On the web, anything goes. Links can be any color (including black as I saw recently), and have any attributes (underlined or not). Combo boxes may or may not require clicking a Go button. With AJAX, controls react in any old way. A link may act like a tab control or vice versa, to point to a common example. The rule is Click Anywhere, Maybe It Does Something. The confusion of the web bleeds over into the desktop experience; for many users the difference between the web and the desktop is arbitrary too.
Technical Knowledge
Randall Smith makes a distinction between real magic, which adds to functionality, and what he calls “externals,” things the users have to learn that don’t provide any benefit to them. Inconsistency is an external, but mostly Smith is talking about technical limitations that impact the UI.
It turned out that he had saved the document before he had started typing it and, when finished, simply switched the computer off (Rinkworks).
Save is an example of an external. For users the requirement to save is just something you have to know about computers. That you save at the end of creating a document rather than the beginning is an arbitrary detail. I mean, it makes sense if you understand that, for technical performance reasons, a computer has two memories, a fast ethereal working memory and a slow permanent memory, but how is the average user supposed to know that? The computer is just a nondescript beige box (or is it the TV thingy?). Users don’t have a clue about the technical details of a computer, and in an ideal design, they wouldn’t have to. They wouldn’t know any more about the engineering of a computer than the average driver needs to know about the engineering of his tires. I mean, you know the difference between bias ply and radial tires and why that’s significant, don’t you?
I had to publish a document on the intranet to explain things that everyone with an ounce of common sense should already know, such as “If you hear a beeping coming from the box under your desk that your computer is plugged into, it is not a fire alarm. Do not evacuate the building.” And the ever-popular “Please do not plug in anything to your UPS besides your computer and monitor.” Why? The girl with the laser printer, electric stapler, desk lamp and radio wanted to know why she didn’t have time to save her documents. (Shark Tank)
Well, at least when the UPSs went off the users didn’t call the fire department, which courageously hosed down all the equipment. But what the heck are you supposed to think when a nondescript and persistent alarm goes off in every office on the floor? Why shouldn’t the computer network also be used for the fire alarm network? I mean, aren’t they controlling warships and space stations with Windows now? As for the UPS dying prematurely under the load of a desk lamp, radio, et al., how is a user supposed to know how much energy a UPS has, or, for that matter, how much her electric stapler draws? Frankly, I find the short lifetime of UPSs odd. My personal UPS under my desk is bigger than a car battery. I’d think it could have 1000 watt-hours, enough to power the 350W CPU (at full draw) and the 150W monitor for two hours. There was a design decision made to limit the energy, probably to minimize cost of production, but to the user, such technical limits appear arbitrary.
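The back-of-the-envelope estimate above can be checked with a few lines of Python. This is a rough sketch only: the 1000 watt-hour capacity is the guess made above, the 600 W figure for the extra devices is an assumed number for illustration, and the calculation ignores inverter losses and battery aging, which shorten real runtimes.

```python
from typing import Sequence


def runtime_hours(capacity_wh: float, loads_w: Sequence[float]) -> float:
    """Idealized runtime: stored energy divided by total power draw."""
    total_w = sum(loads_w)
    if total_w <= 0:
        raise ValueError("total load must be positive")
    return capacity_wh / total_w


CAPACITY_WH = 1000.0   # the guess from the text, not a measured value
COMPUTER_W = 350.0     # CPU at full draw, per the text
MONITOR_W = 150.0      # per the text
EXTRAS_W = 600.0       # assumed combined draw of printer, lamp, radio, stapler

print(runtime_hours(CAPACITY_WH, [COMPUTER_W, MONITOR_W]))
print(runtime_hours(CAPACITY_WH, [COMPUTER_W, MONITOR_W, EXTRAS_W]))
```

Under these assumptions the computer and monitor alone get two hours, while the extra devices cut that to under one. None of this arithmetic is visible to the person plugging in a desk lamp.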
We could instruct users on the capabilities of technology, and indeed we have to try; there really isn’t any alternative. However, it has to be acknowledged that users are only going to listen to instruction for so long. Carefully crafted manuals typically go unread as users plunge in to use the app or device, rather than take the time to learn about it. Most users are not intrinsically interested in the inner workings of technology. They have other things they need to do that have higher priority, such as the work they need to get done with the technology. This means any instruction needs to be concise, specific, and provided at the time and place when it is needed.
User’s personal mouse won’t work with her laptop, and she demands that [IT technician] fix it right away. “She said, ‘Look, the laptop is not getting the information from the mouse,’ ” says [technician]. “I asked where the receiver to the wireless mouse was. She looked at me like I was an idiot. ‘Isn’t this a wireless laptop? Here’s my wireless mouse. They should go together.’ Again, I asked where the what-a-ma-thing with the black cable was. ‘I got rid of the extra stuff,’ says user. ‘It wasn’t needed.’” (Shark Tank)
Well, at least the user read something. The box for the laptop said “Wireless!” The box for the mouse said “Wireless!” The user put two and two together, but unfortunately, the arbitrarily correct answer today was five. But it shows that users do read instructions and other documentation. They just will only read if they think they have to in order to accomplish a task, and will only read that much. Instruction and documentation need to be tailored to this reality.
Fat manuals on a CD are not the way to go. Instead, we need to rely more on providing snippets of instruction of one form or another to guide the user with the task they are currently confronting. There’s more to instruction than formal documentation in a manual or Help. There are the labels for controls, the interfaces we choose, the message boxes and other feedback we display, the packaging for the product, even the chassis for a piece of hardware.
Thinking back to the user that reported his own company’s email as spam, one solution would have been to incorporate the instruction into the label of the control. If the spam folder were labeled “Send to Spam Police at [ISP],” the user may have had a better chance of anticipating the effects.
In the case of the UPS for a personal computer, a generic persistent loud beeping under the desk is the wrong way to do it (admittedly, it was an outdated UPS). Desktop computers are not servers that usually require urgent attention when power is lost. Desktops are probably off for most of the time, yet the UPS sounds a persistent alarm anyway as if, day or night, something needs to be done right now. Using a beep requires that the key instruction to the user be learned and remembered from long ago (what to do when the UPS beeps, and for that matter, what it sounds like).
The preferred design is to provide a dialog box (perhaps emanating a single beep, in case the user is turned away) that says, “Power loss. Save work and shut down immediately to avoid losing work.” This puts the instruction right where and when the user needs it. It also, you may note, makes use of the proximity heuristic, associating the alert spatially with the user’s open documents on the computer, not some mysterious black box under the desk.
As for what may be plugged in to the UPS, instruction on that should be provided on a legend by the outlets of the UPS, there to see when the user is engaged in the task of plugging things in: “Plug in only computer and monitor for reliable performance,” maybe including a pictogram of the CPU and monitor so the user more likely understands what is meant.
Instructions have their weaknesses. For example, users routinely ignore dialog boxes (for reasons to be covered in the next Learning from Lusers), as they do written labels on consumer products (probably because they’re assumed to be some absurd warning or disclaimer put there by lawyers, e.g., “Do not bathe with your UPS. Do not drop UPS on pets or small children.”). However, given the arbitrariness of technological capability, it’s all we’ve got.
Pace of Change
The problem: This key user is having problems with the “read aloud” function. For half a month, the user has been trying every imaginable adjustment to get the sound working…. [Says the IT technician,] “A call to the support team for the Web app revealed that the topics created by the Web applications group were programmed with the accompanying voice-overs…. Topics created and added by end users generally were created without a voice-over. Care to guess which topics my user was trying to get sound out of?” (Shark Tank)
Okay, I admit when I read the above story, I didn’t understand at first why user-created content would not be read aloud. I assumed they were using some sort of screen reader software to effect read-aloud. With the current sophistication of synthetic speech, I would think that would be a satisfactory and economically attractive alternative to making someone read the text into a .wav file. But what do I know? Maybe that makes sense now, but it didn’t however long ago the story is set.
As we saw in previous Learning from Lusers, of course you can’t just drag and drop files to a CD-RW to burn a CD, and of course you need physical access to a computer to patch software. Of course you need to plug a network adapter into your laptop to access a wireless network. What, you thought you could just turn on a PC and get on line in some coffee shop? All these “of courses” were true just a few years ago. Today, it’s obvious a wireless mouse needs a receiver plugged into the computer, but what about tomorrow? Will wireless mice be so popular that we’ll have a standardized mouse authentication protocol and computers will be delivered with a built-in receiver? Hard-earned knowledge gained when the user first got a computer is frequently obsolete as fast as the computer is. We used to tell users, “Don’t open attachments from email senders you don’t know,” but then the malware spreaders responded by forging From addresses, so that advice is obsolete. Now we say, “Don’t open suspicious attachments,” a non-specific rule (the stimulus is vague) that’s less likely to be successfully learned.
The rapid pace of change in digital technology is another reason to rely on “just in time” instruction. Whatever was said last year for that generation of technology may not apply to this year’s generation. Providing such just-in-time instruction with an awareness of what used to be true provides user re-education when it is needed.
Sometimes users have never learned the new capabilities of the latest technology. Sometimes they’re ahead of the technology.
She had clicked the Windows “help” button a week ago and wanted to know why it was taking so long for an engineer to turn up. (Tech Tales)
Now that would be a great feature. Maybe not tech support showing up in person, but it’s certainly feasible for the user to ask for assistance and have the app contact tech support, perhaps with some of the details of the problem context, so that the tech supporter responds with a phone call or chat box. Sure would beat waiting on hold for an hour or two.
And maybe save a user from the fate of Gollum.
Summary Checklist
Problem: Balance learnability with power as appropriate for the users and product usage.
Potential Solutions:
- Have rules relating the UI appearance to UI element behavior.
- Simple Rules: One to one coding between appearance and behavior. Avoid conditionals.
- Powerful Rules: One rule that applies to many situations and tasks.
- Consistency. Follow the rules.
- Teach the rules.
- Describe rules with illustrations and text.
- Teach only a little at a time.
- Just in time: Provide instruction at the time and place of the task.
- Write rules with awareness of obsolete rules.
- Design hardware and software to minimize “externals,” arbitrary limits and requirements that provide no functional benefit.