Yoknapatawpha Crossing

Archive for the ‘Uncategorized’ Category

Hyphenation and justification on the web

There were a couple of recent (reckoned in academic time) articles on best practices for web typography which seem, to me, to miss an important point. And so here we are, as I fulfill the ancient role of Offended Nerd responding to Someone Who Is Wrong on the Internet.

This article deals specifically with two questions regarding hyphenation and justification on the web:

  1. Can you?
  2. Should you?

The Web Typography Gurus have answers to these questions: Yes! and, Of course! But these reflect a simplistic and biased view of the situation. The answers should be: Only with serious tradeoffs, and, Maybe.

Background

Here is the part where I try to provide a common ground for discussion. If you already know anything about typography, you should skip to the next section. The only points I make here that I haven’t seen emphasized elsewhere are

  1. The decision to justify text is purely aesthetic and does not confer material benefit to the reader, and
  2. Justification and hyphenation are separate processes which relate and feed back to each other when used together, but are not irretrievably intertwined.

If you look at whatever book you have nearest to hand, it will probably have justified text. That is, if you look at one whole page at a time, without reading the words on the page, you will see that the text fills a rectangle. The left edges of every line are flush, and the right edges of every line are flush. This effect is produced by adjusting the spacing between words in a line, and sometimes even between the individual letters. When a line has lots of characters, the typesetter squashes the spaces a bit, and when there are fewer characters they stretch the spaces out. This squash-and-stretch process is known as justification; text presented in this way is justified. No matter what.
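
To make the squash-and-stretch idea concrete, here is a minimal sketch of my own, assuming a monospaced font where the only available adjustment is adding whole space characters between words. The function name `justify_line` is hypothetical. A real typesetter stretches and shrinks spaces by fractional amounts; this toy version can only stretch.

```python
def justify_line(words, width):
    """Return the words joined into a line of exactly `width` characters.

    Assumes a monospaced font: the only adjustment is the number of space
    characters placed in each gap between words.
    """
    if len(words) <= 1:
        # Nothing to stretch between; just pad on the right.
        return "".join(words).ljust(width)

    gaps = len(words) - 1
    total_spaces = width - sum(len(word) for word in words)
    if total_spaces < gaps:
        raise ValueError("words do not fit in the requested width")

    # Every gap gets `base` spaces; the leftmost `extra` gaps get one more.
    base, extra = divmod(total_spaces, gaps)
    pieces = []
    for i, word in enumerate(words[:-1]):
        pieces.append(word)
        pieces.append(" " * (base + (1 if i < extra else 0)))
    pieces.append(words[-1])
    return "".join(pieces)


if __name__ == "__main__":
    print(repr(justify_line(["The", "quick", "brown", "fox"], 23)))
```

Putting the leftover spaces in the leftmost gaps is an arbitrary choice made for brevity; a careful typesetter would spread the adjustment so that no single gap stands out.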

Text does not have to be justified. You have probably been abused by Word at some point in your life, and so you know that there are also buttons for left-align (where everything lines up on the left of the page, but the right edge of the text is ragged), right-align (the opposite), and center. If you were writing in another language, you might top-align, staircase-align, or something else equally un-American and subversive. In Western writing, justification is an extremely common and traditional way to present text. Gutenberg justified his Bible, although the practice of justification predates him (as you may verify for yourself by finding pictures of basically any illuminated manuscript from before Gutenberg’s time).

The decision to justify text is entirely aesthetic and subjective (more below). I imagine many, or even most, book designers choose to justify text based largely on tradition. Even in the absence of tradition, a designer might choose to justify text in a book for very many good aesthetic reasons, one of which is that it introduces some tension into a page of text. Forcing text to fill out a regular large-scale frame creates a nice image on the page, and also necessitates the adjustment of spacing within each line, breaking up what might otherwise be a monotonous repetition of precisely the same spacing word after word.

Justification can lead to absurd adjustments of inter-word spacing, to the extent that the text becomes more difficult to read (and this is an objective and quantifiable effect; again more below). To combat this, typographers also use hyphenation, the practice of breaking one word across two lines. When hyphenation and justification are used together, there is a virtuous feedback between the two things. But it is very important to remember that they are two distinct processes. You can justify without hyphens, and you can hyphenate without justification, and you can use them both at the same time but not together if you want to make something that really looks bad. The best typesetting systems (which include talented and experienced people as well as certain computer programs) allow these two processes to feed back to each other, so that you may hyphenate a word to achieve better line spacing five lines later, or move a word up one line to avoid a hyphen three lines previously. Donald Knuth, the designer of a typesetting system known as TeX, has written extensively and entertainingly about these issues.

As with justification, the use of a hyphen-like character to fill space did not originate with Gutenberg. Hyphens, fleurons, extended letters, and other decorative devices that helped fill out space were in widespread use by scribes for many years prior to Gutenberg.

The interesting bit

Others suggest that justification is the best, the most professional, or the most readable/legible way to present text. This is not true.

To see that hyphenated-and-justified text is not necessarily “the best,” you only have to ask the person peddling this proposition what being the best means. Then you will hear that being the best means that it’s the best-looking, the most professional, or the most legible way to present text. As for being the best-looking, this may be true in the judgement of some. But this is transparently a subjective claim, and reasonable people have great differences of opinion. Leading us to the next point.

To see that H-and-J is not necessarily the most professional, you only have to look at how professionals choose to present their work. Off the top of my head, I don’t know of any examples of mass-market paperbacks that have been presented using anything other than H-and-J. But I can come up with many examples of books produced by discerning professionals who deeply value the way their words are presented to the world that eschew H-and-J, namely: The Visual Display of Quantitative Information, Graphic Design Referenced, Designing with Type. In addition, Grid Systems uses ragged-right for marginal notes; indeed Knuth discusses at some length the futility of trying to set justified text in a narrow block. And, because I know you’ve been waiting for me to bring down the hammer, Robert Bringhurst’s The Elements of Typographic Style (pp. 27-28) recommends ragged right when using sans-serif or monospaced fonts, or just whenever the situation demands it.

Boom! We are done. There is no retaliation in the face of Bringhurst.

The point is that although you may hear that hyphenated-and-justified text is what the professionals always use, the claim is belied by what you see when you watch the professionals at work.

As for the third claim, that H-and-J makes for the most readable or legible text by some sort of empirical measure, this is also not necessarily true, and to see this you only have to acquaint yourself with some facts. Specifically, Zachrisson finds no appreciable difference in the reading speed or comprehension of subjects when given ragged-right (evenly-spaced) versus hyphenated-and-justified text. In fact, his experiments showed that the least-proficient readers had an easier time with ragged-right text, and he refers to a master’s thesis by S.P. Powers which found that subjects read ragged-right text faster than H-and-J text. Granted, I am quoting results here from just one book; although I had trouble tracking down other sources of specific information on the web, I imagine that a lot more research has been done in this field, that the questions involved are very complicated, etc. But the fact remains: a blanket statement like “hyphenated-and-justified text is more legible” is unconvincing without strong empirical evidence.

So what’s the answer to the aesthetic question, “Should I hyphenate and justify text?” The web typography gurus I’ve been reading want you to believe the answer is “Yes, always yes,” and I’ve endeavored to convince you otherwise, that the answer is “Maybe. Different situations demand different solutions.” The facts of the matter indicate that this decision is purely subjective, that neither style of typesetting is better or worse than the other in any sort of measurable way. And that’s the end of the important part of this article.

Specific comments

But I’m still talking! Subservient to the aesthetic question comes the technical question, “Is it possible for me to hyphenate and justify text in the medium I’m using?” And, in this magical future, if you are publishing on the web the answer is now yes. But the solutions aren’t (yet) worth the price you pay.

Specifically, you can tell a web browser to justify just by flipping a CSS switch, and Fink points out the Hyphenator.js library as a solution for getting good hyphenation. “Good” apparently means “hyphenation the same way that TeX does it,” and that is an unfortunately misleading way of framing the whole issue, because, as I pointed out at the beginning, hyphenation and justification are two different things, and Hyphenator.js tackles only one side of the equation. Giving a web browser five hundred million additional potential line breaks does no good if the browser doesn’t use the options it had in the first place intelligently, and that is where the problem lies. Web browsers today generally do not have good justification algorithms; at the time of this writing it was very easy to see this by going to the Hyphenator.js example page and just checking it out for yourself in Firefox, Safari, Chrome, etc. They all did sort of funny things with the word spacings; some browsers didn’t even space words evenly across the line. Without a good justification engine, it doesn’t matter how many potential line breaks you provide to a web browser; it will still produce unappealing text.

The way to address this issue is not with better hyphenation libraries; it’s with better justification algorithms. And in fact, there is already an implementation of the TeX line-breaking algorithm in JavaScript, and an example of this algorithm used in combination with Hyphenator.js. It’s clear immediately that this produces very nice results, but this program works by using the HTML5 canvas element to precisely position and draw each word, which means that copy/paste from any page rendered with Typeset is broken, and that more generally this approach Breaks the Web. The fact that Hyphenator.js and Typeset exist is great, and they’re certainly important steps in bringing better justified text to the web. But using them involves serious tradeoffs: copy/paste, text search, and a meaningful DOM are, you know, sort of important.
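
To show what a better line-breaking pass buys you, here is a small sketch of my own. It is not the TeX algorithm and not the code in Typeset: it is a simplified stand-in that assumes monospaced text, ignores hyphenation entirely, and scores a paragraph by the sum of squared leftover space on every line except the last. The names `greedy_lines` and `total_fit_lines` are hypothetical. The greedy version fills each line as full as possible and never looks back; the other considers the paragraph as a whole.

```python
def greedy_lines(words, width):
    """Fill each line as full as possible, one line at a time."""
    lines, current, length = [], [], -1
    for word in words:
        if current and length + 1 + len(word) > width:
            lines.append(current)
            current, length = [], -1
        current.append(word)
        length += len(word) + 1
    if current:
        lines.append(current)
    return lines


def total_fit_lines(words, width):
    """Choose breaks minimizing squared slack summed over all but the last line."""
    n = len(words)
    inf = float("inf")
    best = [inf] * (n + 1)   # best[i]: lowest cost of setting words[i:]
    nxt = [n] * (n + 1)      # nxt[i]: index of the first word on the next line
    best[n] = 0.0
    for i in range(n - 1, -1, -1):
        length = -1
        for j in range(i, n):
            length += len(words[j]) + 1
            if length > width:
                break
            slack = width - length
            # The last line of a paragraph is allowed to run short for free.
            cost = 0.0 if j == n - 1 else float(slack ** 2)
            if cost + best[j + 1] < best[i]:
                best[i] = cost + best[j + 1]
                nxt[i] = j + 1
    if best[0] == inf:
        raise ValueError("some word is longer than the line width")
    lines, i = [], 0
    while i < n:
        lines.append(words[i:nxt[i]])
        i = nxt[i]
    return lines


if __name__ == "__main__":
    sample = "aaa bb cc ddddd".split()
    print(greedy_lines(sample, 6))
    print(total_fit_lines(sample, 6))
```

On the sample above, the greedy pass leaves "cc" stranded on a line by itself, while the whole-paragraph pass gives up a perfectly full first line to get a more even result. Adding hyphenation points would simply give either pass more places to break; it would not change which pass makes better use of them.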

Eventually these issues will pass. One day web browsers will provide industry-leading H-and-J algorithms, high-quality mathematical typesetting, and ponies. But it hasn’t happened yet, and you should not make your users suffer now in anticipation of the future.

Finally, let me close by saying that referencing Wikipedia as a primary source will only lead to embarrassment.

References

Postscript

This whole topic is … ramified. Although I’ve tried to give a balanced argument based on the facts as I understand them, I’ve also turned up a whole lot of other material that I haven’t even had time to go through yet. In case you’re interested, here’s the list so far:

Written by Daniel Grady

March 27, 2011 at 16:32

Posted in Uncategorized

Let me tell you a story

About how awesome life is right now. In theory, this is my easy, laid-back, lots-of-free-time semester, but for a variety of reasons, that hasn’t quite materialized yet. Instead, the past couple of weeks have been more like, “wake up at 9, go go go go go, sleep, repeat.” Making a PowerPoint presentation for this weekend’s charity auction, training no fewer than four people to fill two of the positions I currently occupy, and remembering how to do differential equations are, it turns out, significant time drains.

So why is life awesome? I have all but officially received a summer research internship at ORNL, and a dude from Northwestern just called me to let me know that I’m probably going to be accepted with financial aid. Boo-ya.

Written by Daniel Grady

February 7, 2006 at 11:54

Inspect, Sir or Madam, my sleeve…

Things have gotten to a point where I am truly and sincerely irked at many of the obligations I have. When I first began working as an A/V tech for the college, it was interesting. I got to fiddle with a bunch of crazy equipment, my boss didn’t treat me like an idiot and gave me a lot of responsibility, and I got to drive the awesome old red van around campus. The fact that I did, and do, get paid peanuts for the work is rather beside the point; the job was appealing because it was flexible and on campus. I do still value it for these reasons, but in the past year the job has become rote and annoying, and I have developed a habit of shunting it to the side whenever possible. You can only set up speakers and microphones so many times before it loses its charm.

I’ve experienced the same problem (on a much shorter time scale) at the Flat Hat. This job was initially appealing because it required (I believed) a minimal time investment and consisted entirely of mindless work, for which I would be well-paid. This is true, but as I became better acquainted with the position, it became clear that doing a good job would require a slightly greater time investment, and would consist of work that was indescribably boring but was involved enough to demand attention. I stopped enjoying both of these jobs some time ago, but I do still enjoy receiving a paycheck, and so have continued to suffer them. But they require time, and time is the thing that I’m least inclined to give them.

The real problem with devoting time to these jobs is not the jobs themselves, but what they’re keeping me from. It has reached a point where I am starting to get some (very general, non-specific, vague) ideas of what I want to spend my time on, and these ideas do not include recording a cappella performances of mediocre college groups or tallying the advertising accounts at the school newspaper.

When I say “what I want to spend my time on,” I’m not talking about a “Gee, I think I’d like to sit down and read this book for a while” kind of feeling. What I mean is that right now, at the beginning of this semester, I am fired up about my classes and about learning this new material. I’m cracking open my textbooks and thinking, “God damn, this looks really interesting. I want to really sit down this semester and get a good handle on this.” I’m working through a problem and becoming extremely involved in writing a routine to build a B-spline approximation to a function. And I know that it’s horribly, horribly dorky, and I don’t care. Previously in my “college career,” classes and math were things that I did because I was good at them and because there was a nagging feeling in the back of my head that it would all pay off one day; they weren’t things that I was inspired to do in and of themselves. But for the past couple of semesters, math has become something that truly grips my interest; it has become something that I very much want to devote my time to. Population diffusion models? Lay that shit out. Hermitian matrices? Give me a heaping spoonful. It is to the point where I am, bless my little head, excited by all this.

The fact that I am excited is itself exciting, and also frustrating. Exciting for the obvious reasons: I’ve found some things that I am deeply interested in, and that I (would) happily invest my time with. It’s exciting because the longer I study this subject, the more certain I become that I’ve found a niche that I can be happily productive in, and that applying to continue studying this subject in graduate school was the right choice. It’s frustrating because the world at large is essentially ignorant of my newfound direction: random part-time jobs continue to impose themselves, professors continue to assign work, and I don’t have the kind of time that I want for devoting to geekier pursuits.

This is the source of the dim view I currently have of my jobs and other responsibilities. What were initially pleasant and profitable distractions have become merely distractions that I could do without. These are temporary problems, however, and knowing that I’ll be shedding many of these obligations as the semester progresses makes it easier to put up with them now. In fact, I really have so little to complain about it’s laughable. In spite of everything, I fully expect this semester to be a wonderful end to a wonderful four years.

Written by Daniel Grady

January 31, 2006 at 10:14

Two Things

The first is that I just recently finished reading a book called The Amazing Adventures of Kavalier & Clay, by Michael Chabon. It was really good. It’s about New York City and comic books and the early part of the century. Chabon is engaging like John Irving, has a command of English as vast as Conrad’s, and tells a story that is slightly fanciful but steeped in reality. Actually the reason I like it so much is that Chabon perfectly captures the spirit of the age he’s depicting. Reading Kavalier & Clay feels very similar to reading Fitzgerald, if Fitzgerald had come at his subject from a very different direction. But that’s just what I think; what the hell do I know, I’m a math major. Soon to be a math graduate. If I don’t drop out of college before the end of the semester. Anyhow.

The second thing is Google Earth. The other day I was fiddling around on Google Maps, and was very impressed by the technology they’ve got powering that. You can drag maps around smoothly, you can zoom in way far, you can not only get satellite imagery of most places, but actually overlay Google’s road map on top of that satellite image, and they’ve got all this linked with their database, so searching for things close to the spot on earth you’re looking at is easy. Very impressive. Then I found Google Earth.

Google Earth has one caveat: unlike many of Google’s technologies, it is a program you must download, install, and run, rather than load in a web browser. Should you choose to do this, however, you will lose yourself for hours in what has to be one of the most impressive displays of web integration available today.

Google Earth is a program which models the entire globe in real time 3D. It uses satellite imagery to paint the surface, so you can start by looking at the entire Earth framed in your window, and zoom in to a bird’s eye view of your house. Depending on the age of the imagery, you might see your car parked in the driveway. Of course, that’s an awful lot of satellite imagery to store on your hard drive. How does the program pull this off, exactly? In point of fact, whenever you run Google Earth, the program connects to Google’s servers, and they stream you satellite imagery to power the program. That alone is a ridiculous technical accomplishment, but on top of this unprecedented model of the world, they have added their map technology so that you can see not just roads, but businesses, parks, points of interest, and everything else on this true-to-life picture of the earth in addition to viewing flybys of any trips you might be planning; they have built three dimensional models of the metropolitan areas of major cities; and they have linked all of this to an online community that allows users to add their own markers to the globe for all to see. And they are giving all this away for free.

If the fact that this blog is running on Blogger didn’t immediately tip you off, I am a Google fanboy. It’s hard not to be a fanboy for a company with Google’s philosophy, though. They look at the internet, and they find something that people are using. They say to themselves, “We are going to take this idea that works pretty well, and we are going to make it work beautifully. We are going to make it do everything that common sense says it should do. We are going to base it on cutting-edge technology, we are going to make it elegant and straightforward, we are going to make it so stultifyingly simple that anyone who knows how to click a mouse can immediately use it, and we are going to give it away for free.” It’s hard to argue with free.

Written by Daniel Grady

November 10, 2005 at 18:07

By the way

One site that I’ve been linking to quite frequently is Wikipedia, and I’d just like to point out what an amazing place it is. If you haven’t checked it out yet, you ought to. The idea is that it’s an encyclopedia, but absolutely anyone can get on there and edit an article, or create a new article. In spite of the sincere efforts of some people determined to wreck the experience, the whole thing works incredibly well. Just go here and start reading, it’s crazy what you run across.

Also, a day or two ago I mentioned Technorati, which is a pretty neat site. The idea there is to keep track of blogs and rate which are the most influential by counting the number of other pages that link to any given blog. It’s the same idea as citation analysis to determine the influence of a scientific paper, or Google’s PageRank thing, but for blogs. A site like Gizmodo is way up on the list because all kinds of people read it and then link back to it, whereas my poor site is at the bottom of the pile. Will you be my friend?
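
The counting idea is simple enough to sketch. This is only my guess at the general shape, not Technorati's actual method: the function name `rank_by_inbound_links` and the blog names in the example are made up, and the sketch just counts how many distinct other sites link to each site.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Rank sites by how many distinct other sites link to them.

    `links` is an iterable of (source, target) pairs. Self-links and
    repeated links from the same source are ignored.
    """
    inbound = Counter()
    seen = set()
    for source, target in links:
        if source == target or (source, target) in seen:
            continue
        seen.add((source, target))
        inbound[target] += 1
    # Highest inbound count first.
    return inbound.most_common()


if __name__ == "__main__":
    # Hypothetical link data, for illustration only.
    example = [
        ("blog-a", "gadget-site"),
        ("blog-b", "gadget-site"),
        ("blog-c", "gadget-site"),
        ("blog-a", "gadget-site"),   # duplicate, ignored
        ("blog-b", "blog-a"),
        ("blog-a", "blog-a"),        # self-link, ignored
    ]
    print(rank_by_inbound_links(example))
```

PageRank goes a step further by weighting each link according to the rank of the page it comes from; this sketch treats every link as equal.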

Written by Daniel Grady

May 7, 2005 at 02:38

110,000 V Taser Cannon

I’ma put one of these bad boys on the roof of my dorm. That’ll keep those damn construction workers from starting up their noisy machinery at 8am.

Linked from Gizmodo

Written by Daniel Grady

May 3, 2005 at 16:42

Things that should not change

Healthy eating is a good thing that we need to see more in America, and seeing healthier messages in the media is undoubtedly going to contribute to a change for the better. But there are certain lines that just should not be crossed.

Cookie Monster is named ‘Cookie Monster’ because he likes cookies. Not cucumbers, carrots, or chick peas. Cookies. It’s in his bloody name. When Cookie Monster starts telling you that cookies are no good, well… It just makes you want to kill all those damn terrorists who are screwing up our beautiful homeland even more.

Written by Daniel Grady

April 9, 2005 at 17:45
