Building the Perfect User Interface (Part 3)

In part 1 of this article, I made a quick and handy definition of user interface: Given technology as a black box, user interface is how you tell the black box what you want it to do. In part 2, I listed some things wrong with the current state of user interface, using Google as a prime example.

So we clearly haven’t yet mastered the science of user interface here in the 21st century. But what is it we’re striving towards? What’s the perfect user interface? In, say, a thousand years, when we have unlimited computing power and unlimited energy (like the characters of my novels Infoquake and MultiReal), what kinds of user interface will we be using?

Apple iMac Let’s take the question one necessary step further: do we really need user interface at all? Or are we evolving toward the point where intelligent tools automatically understand what we’re trying to do? In a thousand years, will the concept of giving commands be obsolete?

Software developers are taking the first tentative steps in that direction now. Apple’s Steve Jobs has always taken that “benevolent dictator” approach: we’ll decide what you, the user, need to handle, and the machine will just automatically handle the rest. Take disk defragmentation, a software task that only the wonkiest of technowonks has any interest in controlling. There isn’t any standard disk defragmenter for Macs, but that’s not because Mac hard disks never need defragmenting. OS X simply does it for you behind the scenes, as this article on the Apple website makes clear.

Microsoft is moving in this direction too. One of the advantages that Windows users have historically held over Mac users is the fact that it’s generally easier to get under the hood and tweak the gears that make the system work. But that’s going away. Not only because OS X has brought command-line tweaking to the Mac, but because Vista is taking away a lot of tweakability from Windows. Disk defragmentation under Vista is a simple on-off proposition; flip it on, and the OS will handle it as needed. Likewise, throughout the operating system, interfaces that were once cluttered with hierarchical menus and interactive dialog boxes are giving way to much smaller lists of context-sensitive tasks. (For more of my thoughts on this, see old blog posts Don’t Worry, Vista Will Handle It and Look Ma… No Program Menus!)

It’s the same long-term trajectory of user interface we’ve seen in automobiles. Look at the user interface for the Model T (pictured, below; original photo, with explanations and more detail, here). Most modern automobiles have reduced this to a standard set of four controls — the gas, the brake, the steering wheel, and the gear shift. It’s not that the car doesn’t still need all those functions, but now the car handles everything itself. It’s not exposed to the end user. If you believe the so-called experts, we’ll all be zipping around in self-driving robot cars within a generation.

Ford Model T ControlsFollow this trend several hundred years, and where does it lead? I talked previously about elevators that automatically know which floor you’re going to via RFID chips in your apartment keys. Why couldn’t that work elsewhere? Maybe you’ll pull into the Starbucks parking lot and find your usual soy milk decaf latte waiting when you get up to the counter. Maybe the refrigerator will automatically order more eggs from the store when you take the last two out. Maybe the polling station will know that you’re a member of the Christian Coalition and have a ballot all queued up with Mike Huckabee’s name checked when you get up to the voting booth.

There’s something very unsettling about these scenarios, and it’s not just the potential privacy hazards. Humans want to be in control of our environment; we instinctively resist environments that control us. Not only that, but we quickly grow bored with environments that coddle us. Humans are designed for dynamism, dissatisfaction, and change; despite the stereotype of modern man as couch potato, as a species we don’t handle stasis well.

So we like to be in control of our surroundings. But how much of this control is just feel-good illusion? When you order a hamburger at Burger King, sure, they’ll make it your way — as long as “your way” only involves their nine predefined toppings. And when you ask for lettuce, you can’t control how much, or whether they use shredded iceberg or delicately layered romaine, or whether it comes from West Virginia or Peru or Ecuador. Burger King’s real slogan should be “Have It Your Way, As Long As Your Way Falls Within the Narrow Parameters of Our Way.”

Read more

Building the Perfect User Interface (Part 2)

(Read Building the Perfect User Interface, Part 1.)

In my first ramble about user interface, I used the toaster as an example of something that is erroneously thought to have a perfect user interface. Perhaps a more apropos example for most techies is the Internet search engine.

Think of any piece of information you’d like to know. Who was the king of France in 1425? What’s the address and occupation of your best friend from junior high school? How many barrels of oil does Venezuela produce every day? Chances are, that piece of information is sitting on one of the trillions of web pages cached in Google’s databases, and it’s accessible from your web browser right this instant.

Google Is a Giant Robot illustrationYou just have to figure out how to get to it — and Google’s job is to bring it to you in as few steps as possible. It’s all a question of interface, and that’s why user interface has been Google’s main preoccupation since day one.

It might seem the model of simplicity to click in a box, type for a search term, and click a button to get your results. But the Google model of searching is still an imperfect process at best. You may not realize it, but there are still a number of Rubegoldbergian obstacles between you and the information you’re trying to get to. For instance:

  1. You need to have an actual machine that can access the Internet, whether it’s a computer or a cell phone or a DVR.
  2. That machine has to be powered and correctly configured, and it relies on hundreds of other machines — routers, satellites, firewalls, network hubs — to be powered and correctly configured too.
  3. You need to know how to log in to one of these machines, fire up a piece of software like a web browser, and find the Google website.
  4. The object of your search has to be easily expressed in words. You can’t put an image or a color or a bar of music into the search box.
  5. Those words have to be in a language that Google currently recognizes and catalogs (and your machine has to be capable of rendering words in that language).
  6. You have to know how to spell those words with some degree of accuracy — which isn’t a problem when searching for “the king of France in 1425,” but can be a real problem if you’re looking for “Kweisi Mfume’s curriculum vitae.”
  7. You need to be able to type at a reasonable speed, which puts you at a disadvantage if you’re one-handed or using imperfect dictation software.
  8. Google has to be able to interpret what category of subject you’re looking for, in order to discern whether you’re trying to find apples, Apple computers, Apple Records, or Fiona Apple.

Some of these barriers between you and your information might seem laughable. But it all seems so easy for you because you’re probably reading this from the ideal environment for Google, i.e. sitting indoors at a desk staring at a computer that you’ve already spent hours and hundreds, if not thousands, of dollars to set up. If you’re running down the street trying to figure out which bus route to take, the barriers to using Google become much steeper. Or if you’re driving in your car, or if you’re a Chinese peasant without access to 3G wireless, or if you’re lounging in the pool, and so on.

Even in the best-case scenario, after you jump through all those hoops, you usually have to scan through at least a page of results from the Google search engine to find the one that contains the information you’re looking for. Google does no interpretation, summarization, or analysis on the data it throws back to you. Some search engines do some preliminary classification of results, or they try to anyway, but it’s generally quite rudimentary. Chances are you’ll need to spend at least a few seconds to a few minutes combing through pages to find one that’s suitable, and then you’ll need to search through that suitable page to find the information you want.

I don’t mean to minimize the achievement of the Google search engine. The fact that I can determine within minutes that a) the king of France in 1425 was Charles VII, b) my best friend from junior high school is currently heading the division of a high-definition audio company in Latin America, and c) in 2004, Venezuela produced 2.4 million barrels of oil a day — this is all pretty frickin’ amazing. But that doesn’t mean we shouldn’t note the search engine’s shortcomings. That doesn’t mean we shouldn’t point out that there are still a zillion ways to improve it. There’s still a huge mountain to climb before we can call Google an example of perfect user interface.

But don’t worry, because Google’s on the case.

Read more

Google’s Instant Translation

In case they weren’t working on enough already, Reuters reports that Google is working on the ability to instantly translate documents. BabelFish on steroids, if you will.

Google logoHow would you build such a thing? You might think that Google’s programmers would try to break down languages into sophisticated formulae depicting sentence structure and grammatical rules and that kind of thing. Wrong. If I’m reading the article correctly, Google’s essentially trying to do the whole thing through pattern recognition.

This means that when you feed this sentence into the Google translator —

Nel mezzo del cammin di nostra vita
mi ritrovai per una selva oscura
ché la diritta via era smarrita.

— the program doesn’t particularly care that you’re trying to translate the opening stanzas of Dante’s Inferno. It doesn’t care about the Renaissance or Biblical allusions or Italian grammar. All it cares about is the fact that 76.4% of all Italian documents with this sentence translate it into English the way Henry Wadsworth Longfellow did:

Midway upon the journey of our life
I found myself within a forest dark,
For the straightforward pathway had been lost.

In other words, Google is treating language translation as one big black box. Let the computer figure out the complex algorithms that magically turn Italian into English; all Google cares about is the result. (Of course, you understand that I’m drastically simplifying things here.)

This is more or less the same way that the Google search engine works, and it’s proven to be a major breakthrough. When you type “Britney Spears” into the Google search box, the computer logic behind the scenes has no idea what or who a “Britney Spears” is. All it cares about is that people who are looking for that particular term are very likely looking for the website of the pop star and not some obscure brand of English toothpick. And the more searching and clicking web surfers do, the more aggregate data Google has to fine-tune its searching algorithms. The end result? Well, Google works. It’s a zillion times more effective than any other search engine, and in an astoundingly large percentage of searches you find what you’re looking for.

So how accurate is the Google translator? The Reuters article tactfully states that “the quality is not perfect” and “it is an improvement on previous efforts at machine translation,” which is a nice way of saying it kind of sucks at the moment.

But the good thing about pattern recognition is that it improves dramatically the more patterns you feed it. And, hey hey, wouldn’t ya know it — Google’s right in the middle of scanning the complete collections of a number of libraries around the world. Certainly there must be hundreds of thousands of source documents and their miscellaneous translations in the Google databases now that are just ripe for analysis.

But not only is such a system likely to improve with time — it could theoretically adapt to changes in the language too. For instance, the Google algorithm might notice that words which were once translated as “colored people” and “Negroes” are now being translated as “blacks” and “African-Americans” instead.

Read more

Why Is Gmail So Irritating?

I switched over to Google’s Gmail about a year and a half ago from Yahoo! Mail, mostly because I wanted a change. I’m on Gmail about half of the time now, while the other half of the time I use Microsoft Outlook 2003.

I like Google. I have great faith in their ability to bring new technology to the masses in an intuitive, highly functional package. Google Maps quickly supplanted MapQuest as my street directory of choice when it came out. And I’ve got high hopes for Writely, an online word processing application that Google bought earlier this year and promptly rechristened Google Docs & Spreadsheets.

So why is Gmail so irritating?

Gmail logoGmail should be a slam-dunk for Google. After all, I can build a simple POP3 application on a ColdFusion web server in a couple of hours, and that includes time for me to consult the Macromedia documentation to fix my mangled CFML syntax. I’m not saying that that’s all there is to it, of course. (If you want to see a ColdFusion-based application gone horribly awry, look at all the flaws in MySpace.) But I don’t have some of the world’s best developers and billions of dollars in cash lying around either.

Here are my major problems with Gmail:

  • Gmail breaks the browser Back button. To me, this is an absolute cardinal sin. Yes, I understand how difficult it is to make a functioning web application that obeys the Back button in a stateless environment like the web. But certainly Google can do better. I back up into blank, non-functioning pages at least two or three times a day, usually when following links from the Gmail module on my Google home page. And when Google isn’t breaking the Back button, they’re opening up new and unwanted tabs in my browser.
  • Gmail breaks the Reload/Refresh button. Try opening an e-mail message, and then hitting your browser’s reload/refresh button. You get taken back to the list of e-mails. I get hung up on this several times a day too.
  • The interface is very, very slow. I lose patience very easily with the “Loading” messages that pop up at the top of the screen — there are actually two different messages, one that appears in the top right and one that appears in the top left — and they’re up there a lot.
  • No folders. Google assumes that we don’t care for the convention of filing our e-mail into different folders. Therefore Gmail does away with this metaphor altogether in favor of its own Label system, which I can’t seem to get used to. Couldn’t they at least give you the option of using folders, even if it’s not set by default?

Read more