typography of the apostrophe (was Re: Significant whitespace (was Re: Blogging sucks))
A. Pagaltzis
2005-10-19 00:11:01 UTC
[ Note: if you're replying to this message, please mind the To:
and Cc: -- this is crossposted to the Hates-Software and
Markdown lists. ]
If that can handle things like: "Foto's en agenda's"
properly, then it's smarter than Word. That source of hate
turns those apostrophes into single-quotes, as if "s en
agenda" is a paraphrase.
With
Foto's en agenda's
it produces
Foto’s en agenda’s
so it passes that test. In fact, with
'Foto's en agenda's'
I get
‘Foto’s en agenda’s’
so yeah, it’s smarter than Word.
This is a huge HUGE hate, stemming from the fact that I'm (a)
involved with two alumni boards and (b) extremely pedantic.
With the awesomeness that is Word's smartquote "feature," a
Steve B. Alum `85
Alison M. Bee '91
Jimmy Cerebus `82
Maude Dechere '63
etc.
I'm not that angry that it doesn't know that those are class
years and not numbers that might happen to be in quotes,
because I can see it being non-trivial to take all cases into
account.
Yeah. SmartyPants documents it as a known (and unfixable) bug
that *all* of these will turn into opening curly quotes. (Which
is because it does not try to pair up quotes, unlike Word, but
rather uses the context to guess what kind of quote it is, which
is why it almost always guesses right, unlike Word. With this
case being the inevitable exception.)
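The context-based guessing described above can be sketched roughly as follows. This is not SmartyPants's actual code (which is written in Perl and covers many more cases); the function name `curl_single_quotes` and its three rules are assumptions made purely for illustration. It shows why the class-year case goes wrong: a quote that follows whitespace looks exactly like an opening quote.

```python
import re

LSQUO, RSQUO = "\u2018", "\u2019"  # left and right single curly quotes


def curl_single_quotes(text: str) -> str:
    """Guess each straight single quote from its immediate context (illustrative only)."""
    # Rule 1: a quote directly after a letter or digit is an apostrophe or closing quote.
    text = re.sub(r"(?<=\w)'", RSQUO, text)
    # Rule 2: a quote at the start of the text or after whitespace, followed by
    # a non-space character, is assumed to be an opening quote.
    text = re.sub(r"(?<!\S)'(?=\S)", LSQUO, text)
    # Rule 3: anything left over is treated as a closing quote.
    return text.replace("'", RSQUO)


print(curl_single_quotes("'Foto's en agenda's'"))
print(curl_single_quotes("Maude Dechere '63"))  # the quote before 63 is curled as an opening quote
```

With only local context to go on, nothing distinguishes `'63` from the start of a quoted phrase, which is the exception mentioned above.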
However, in standard U.S. typography, the single open quote is
so rarely used (only quotes within quotes) that that particular
"smartness" should be off by default. Even at the beginning of
a word, it's still more likely to be an apostrophe signifying
elision. `Tis a hateful, hateful thing, I tells ya.
Sounds like a fair point.
On web sites where I have the capability I swap ' with ’
on the fly because it's pretty and looks better with a
proportional font. If I were to ever need a single open quote,
I would input ‘ by hand.
Now that I think about it, one could consider extending the
hate to proportional typefaces even having a straight quote
mark glyph. If only browsers had been built around latex.
Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
Brian Forte
2005-10-19 01:18:49 UTC
Post by A. Pagaltzis
However, in standard U.S. typography, the single open quote is so
rarely used (only quotes within quotes) that that particular
"smartness" should be off by default. Even at the beginning of
a word, it's still more likely to be an apostrophe signifying
elision. `Tis a hateful, hateful thing, I tells ya.
Sounds like a fair point.
Except in standard Commonwealth typography the single open quote is
common as muck.

In the US, most style guides stipulate the double-quote (") character
for direct speech and the single-quote character for indirect speech
(ie a speaking character quoting another person to a third-party). In
the UK and Australia, the reverse is true.

I'd be a touch peeved if Markdown's algorithms were changed to
convenience US typographic habit at the expense of mine (and many
others).

Just a quick note from the Antipodes.

Regards,

Brian Forte.
--
Brian Forte, <mailto:***@betweenborders.com>
Writer, editor, scripter, dangerous mind.
A. Pagaltzis
2005-10-19 02:56:23 UTC
Post by Brian Forte
In the US, most style guides stipulate the double-quote (")
character for direct speech and the single-quote character for
indirect speech (ie a speaking character quoting another person
to a third-party). In the UK and Australia, the reverse is
true.
Ah; well, SmartyPants (not Markdown) already deals very badly
(ie not at all) with non-English typographic conventions like the
German or French styles.

I *think* John said that was one of the things he wanted to
support at one point; the mechanism which will inevitably be
necessary to support that could well be used to make a US vs
UK/Aus distinction as well.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
John Gruber
2005-10-19 03:20:33 UTC
Post by Brian Forte
Post by A. Pagaltzis
However, in standard U.S. typography, the single open quote is so
rarely used (only quotes within quotes) that that particular
"smartness" should be off by default. Even at the beginning of
a word, it's still more likely to be an apostrophe signifying
elision. `Tis a hateful, hateful thing, I tells ya.
Sounds like a fair point.
Except in standard Commonwealth typography the single open quote is
common as muck.
It struck me as well that this fellow seemed to recognize that this
would only apply to "standard U.S. typography", but thought the
software should change as such anyway.

And I'd even argue with *that* point. I think there are more cases
where single open quotes are used in the U.S. than there are cases
where an apostrophe is used at the start of a word to indicate
elision.

Some of the common cases like 'tis and 'em could be handled by a
hard-coded list of special cases (which could be localized for other
languages). SmartyPants doesn't do this, but it could. Joe Clark
told me that Matt Mullenweg's quote smartener (built into
WordPress?) has such a list of English terms.

The decades thing is another issue. It's probably the case that
something that matches:

'\d\d\b

e.g.:

'86
'73

is more likely to be an elided decade than the start of a
single-quoted string. I.e. I suspect SmartyPants would curl fewer
apostrophes the wrong way if it started looking for this pattern and
assuming they were years. You'd then get mistakes with something
like this:

'99 Bottles of Beer on the Wall'

The advantage to the current implementation, however, is that
SmartyPants always does the right thing if you avoid this style of
year abbreviation. Just write out the 4-digit year and SmartyPants
will do the right thing.
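As a rough illustration of the trade-off, here is a minimal Python sketch of the two-digit-year heuristic. The function name `curl_elided_years` is hypothetical and the added lookbehind is an assumption; only the `'\d\d\b` pattern comes from the message above.

```python
import re

RSQUO = "\u2019"  # right single curly quote, i.e. the typographic apostrophe

# A straight quote that does not follow a word character and is followed by
# exactly two digits and a word boundary, as in '86 or '73.
ELIDED_YEAR = re.compile(r"(?<!\w)'(?=\d\d\b)")


def curl_elided_years(text: str) -> str:
    """Turn the quote in front of a two-digit year into an apostrophe."""
    return ELIDED_YEAR.sub(RSQUO, text)


print(curl_elided_years("Alison M. Bee '91"))
print(curl_elided_years("'99 Bottles of Beer on the Wall'"))  # the false positive
```

The second call shows the mistake predicted above: the leading quote of the song title is curled as an apostrophe. Note that `'80s` is not touched by this pattern, because there is no word boundary between the digits and the `s`.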

The people who use this two-year style for years the most are,
typically, in my experience, in higher education. Alumni magazines
use this style to say what year so-and-so graduated, because
otherwise spelling out the 4-digit year would take up a measurable
amount of space and might seem unnecessary.

SmartyPants is already capable of doing the right thing with

the '80s

because the digit-digit-s is a dead giveaway as to the writer's
intent.
Post by Brian Forte
I'd be a touch peeved if Markdown's algorithms were changed to
convenience US typographic habit at the expense of mine (and many
others).
I wouldn't worry if I were you.

-J.G.
Aaron Swartz
2005-10-19 03:30:14 UTC
Post by John Gruber
because the digit-digit-s is a dead giveaway as to the writer's
intent.
What about that disco classic '70s Bottles of Beer On The Wall'?
Michel Fortin
2005-10-19 13:34:17 UTC
Post by John Gruber
Some of the common cases like 'tis and 'em could be handled by a
hard-coded list of special cases (which could be localized for other
languages). SmartyPants doesn't do this, but it could. Joe Clark
told me that Matt Mullenweg's quote smartener (built into
WordPress?) has such a list of English terms.
Indeed, this is from the inside of wptexturize function in WordPress:

// This is a hack, look at this more later. It works pretty well though.

$cockney = array("'tain't","'twere","'twas","'tis","'twill",
"'til","'bout","'nuff","'round","'cause");

$cockneyreplace = array("&#8217;tain&#8217;t","&#8217;twere",
"&#8217;twas","&#8217;tis","&#8217;twill","&#8217;til",
"&#8217;bout","&#8217;nuff","&#8217;round","&#8217;cause");

$curl = str_replace($cockney, $cockneyreplace, $curl);
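For comparison, the same special-case idea can be sketched in Python. This is not a port of `wptexturize`; the names `ELISIONS` and `curl_elisions` are hypothetical, and the word-boundary and case-insensitive matching are additions that the plain `str_replace` call above does not have.

```python
import re

RSQUO = "\u2019"  # typographic apostrophe

# The word list from the WordPress snippet above; it could be swapped per language.
ELISIONS = ["'tain't", "'twere", "'twas", "'tis", "'twill",
            "'til", "'bout", "'nuff", "'round", "'cause"]

# Match a listed word only when it stands alone: no word character before
# the leading quote, and a word boundary after the last letter.
_PATTERN = re.compile(
    r"(?<!\w)(?:" + "|".join(re.escape(word) for word in ELISIONS) + r")\b",
    re.IGNORECASE,
)


def curl_elisions(text: str) -> str:
    """Replace every straight quote inside a listed elision with an apostrophe."""
    return _PATTERN.sub(lambda m: m.group(0).replace("'", RSQUO), text)


print(curl_elisions("'Tis a hateful, hateful thing"))
print(curl_elisions("'tain't so"))
```

Because of the boundary check, a quoted word such as `'tissue'` is left for the general quote logic instead of being caught by the `'tis` entry.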
Post by A. Pagaltzis
Ah; well, SmartyPants (not Markdown) already deals very badly
(ie not at all) with non-English typographic conventions like the
German or French styles.
Be sure to read this [Wikipedia article about quotation marks][1]. I
would never have thought there were so many quoting styles around the
world. Do we want SmartyPants to handle all that? And in what way?

[1]: http://en.wikipedia.org/wiki/Quotation_mark

- - -

I'm not sure what you mean when you say it adapts badly to French
typography. In French we quote text « this way » and sometimes we use
English-style double quotes. Are you suggesting SmartyPants should
replace "double quotes" with « angle quotes » inside French text? The
French version of Microsoft Word does this and I always hated it.
It's a pain when you want to mix languages. Not only that, but English-
style double quotes are often used for inner quotations in French,
pretty hard to do with Word.

I can say that on my Canadian multilingual Apple keyboard it's quite
easy to type angle quotes `«»` using opt-z and opt-x, and I can type
double quotes `"` with shift-period. I don't know much about other
keyboard layouts however.

If SmartyPants was to do something good for the French language, I
think it could replace <<this approximation of french quotes>> with
«something better». (I saw that in the comments on my weblog just
yesterday. People write this when they don't know how to type the
right characters.)

- - -

What could be done for French (and, to some extent, for other
languages) is a smart way to replace spaces with unbreakable spaces
where appropriate. French typography conventions usually include an
unbreakable space before the colon and on the inside of French quotes:

Jean a dit : « Il fait beau aujourd'hui ! »

Some people also put a space before question marks, exclamation
marks (as illustrated above), and semicolons. The space is also
used as a thousands separator in numerals:

1 436,12 - 1 000 = 436,12

The problem is that many people don't know what an unbreakable
space is and write normal, breakable spaces at each of these
places. I see that everywhere on French websites and it gives ugly
text wrapping.

I made a PHP script for the French part of my website that deals with
that. By default it doesn't add any space, it only *replaces* normal
spaces with unbreakable spaces where appropriate. This way it
shouldn't cause any harm to English or (hopefully) other languages.
You can also set it to impose a space or no space before and after
such marks.

The script can also replace spaces at other predefined places, let's
say "p. 12", "10 Kb", etc. Obviously, this list can be useful in any
language, and should be localizable.
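Since the script itself is unpublished, the following is only a guess at the general approach, written in Python rather than PHP. The function name `fix_french_spaces` and the exact rules are assumptions. Like the behaviour described above, it never adds a space; it only swaps an existing normal space for U+00A0 in the listed positions.

```python
import re

NBSP = "\u00a0"  # no-break space


def fix_french_spaces(text: str) -> str:
    """Replace existing normal spaces with no-break spaces in a few French contexts."""
    # Space before a colon, semicolon, exclamation mark or question mark.
    text = re.sub(r" (?=[:;!?])", NBSP, text)
    # Space just inside French quotes: after « and before ».
    text = re.sub(r"(?<=\u00ab) ", NBSP, text)
    text = re.sub(r" (?=\u00bb)", NBSP, text)
    # Space used as a thousands separator: digit, space, exactly three digits.
    text = re.sub(r"(?<=\d) (?=\d{3}\b)", NBSP, text)
    return text


print(repr(fix_french_spaces("Jean a dit : « Il fait beau aujourd'hui ! »")))
print(repr(fix_french_spaces("1 436,12 - 1 000 = 436,12")))
```

English text written without a space before punctuation passes through unchanged. The thousands rule is the riskiest one, since two unrelated numbers separated by a space would also match.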

I haven't published my script yet, but since we are at it, I'm open
to the idea it could be merged with SmartyPants, if there is some
interest and John thinks it is appropriate. Otherwise I'll just
release it as a separate project, as planned initially.


Michel Fortin
***@michelf.com
http://www.michelf.com/
Choan C. Gálvez
2005-10-19 14:14:41 UTC
Post by Michel Fortin
[...]
I can say that on my Canadian multilingual Apple keyboard it's quite
easy to type angle quotes `«»` using opt-z and opt-x, and I can type
double quotes `"` with shift-period. I don't know much about other
keyboard layouts however.
For Windows users: you can personalize your keyboard layout using
[Microsoft Keyboard Layout Creator][1] (gratis). I've done it and now I
can type «», “”, etc. quite happily.

(There's an [article at my weblog][2] (in Spanish) describing the process.)

[1]: http://www.microsoft.com/downloads/details.aspx?FamilyId=FB7B3DCD-D4C1-4943-9C74-D8DF57EF19D7&displaylang=en
[2]: http://dizque.lacalabaza.net/sotanos/2005/06/personalizar-el-teclado/
Post by Michel Fortin
If SmartyPants was to do something good for the French language, I
think it could replace <<this approximation of french quotes>> with
«something better». (I saw that in the comments on my weblog just
yesterday. People write this when they don't know how to type the right
characters.)
It'd be nice. Spanish uses these guillemet quotes too, although
everybody seems to ignore that.

Choan
***@alice.0z0ne.com
http://dizque.lacalabaza.net/
A. Pagaltzis
2005-10-19 22:11:00 UTC
Post by Michel Fortin
In French we quote text « this way » and sometimes we use
English-style double quotes. Are you suggesting SmartyPants
should replace “double quotes” with « angle quotes » inside
French text? The French version of Microsoft Word does this
and I always hated it. It’s a pain when you want to mix
languages. Not only that, but English-style double quotes are
often used for inner quotations in French, pretty hard to do
with Word.
Ah, hmm.

In German, there are two interchangeable rules: they’re either

„bottom and left doublequotes“

which is problematic, because in many fonts the left doublequote
is angled in a way that it only looks right as an English opening
double quote. See Verdana for an example.

Or, the other option:

»angle marks, but opposite to the French style«

(and no spaces either), and if you want to nest quotations, you
use

›single angle marks for the inner ones‹

This is preferred on the web, due to the aforementioned problem
with doublequotes in many fonts. (Whereas in print, using
doublequotes is customary, since the typesetter has complete
control over which font is used.)

In any case, English-style top-doublequotes are never used, so
in German mode, SmartyPants *should* translate straight
doublequotes to angle marks for German text, as opposed to French
mode.
Post by Michel Fortin
If SmartyPants was to do something good for the French
language, I think it could replace <<this approximation of
french quotes>> with «something better». (I saw that in the
comments on my weblog just yesterday. People write this when
they don’t know how to type the right characters.)
And likewise for >>this ASCII transliteration<<. That might be a
good idea regardless of mode.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
Michel Fortin
2005-10-20 23:58:07 UTC
Post by A. Pagaltzis
In German, there are two interchangeable rules: they’re either
„bottom and left doublequotes“
»angle marks, but opposite to the French style«
›single angle marks for the inner ones‹
The bugzilla reference from Damian Cugley gave me an idea for
international quotes. Instead of trying to be "smart" about it and
replacing standard English quotes with their localized equivalents, why not
define different shortcuts for different quotes? For example, German
quotes could be written like this:

,,bottom and left doublequotes''

This would only require SmartyPants to recognize the `,,` construct
as a lower double quote. As suggested previously, angle marks could
be written like this:

>>angle marks, but opposite to the French style<<
>angle marks, but opposite to the French style<

Single angle marks may be a problem though, but I'll continue my idea...

The same thing could be done for French:

<<angle marks>>
<< angle marks with a space inside >>

and be turned into this:

«angle marks»
« angle marks with a space inside »

From the table on the Quotation mark Wikipedia entry, I see that
some languages (Estonian, Icelandic) use the same double quotes as
German, but the closing quote is curled in the opposite direction.
Quotes in those languages could be written this way:

,,bottom and left doublequotes``

and turned into this:

„bottom and left doublequotes“

With these rules, you could write in "plain ascii" most of the quotes
of the [Quotation mark table][1].

[1]: http://en.wikipedia.org/wiki/Quotation_mark#Table
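A minimal sketch of such a shortcut table, in Python. Nothing like this exists in SmartyPants as discussed here; the name `expand_quote_shortcuts` is hypothetical, and the mapping of the doubled backtick and doubled apostrophe follows the TeX convention, which is an assumption.

```python
# ASCII shortcut -> typographic quote. Order matters only if one shortcut
# is a prefix of another, which is not the case here.
SHORTCUTS = [
    (",,", "\u201e"),  # double low-9 quote, the German-style opening quote
    ("<<", "\u00ab"),  # left-pointing double angle quote
    (">>", "\u00bb"),  # right-pointing double angle quote
    ("``", "\u201c"),  # left double curly quote (assumed, TeX convention)
    ("''", "\u201d"),  # right double curly quote (assumed, TeX convention)
]


def expand_quote_shortcuts(text: str) -> str:
    """Replace each ASCII shortcut with the quote character it stands for."""
    for ascii_form, quote_char in SHORTCUTS:
        text = text.replace(ascii_form, quote_char)
    return text


print(expand_quote_shortcuts(",,bottom and left doublequotes``"))
print(expand_quote_shortcuts("<< angle marks with a space inside >>"))
```

A blind replacement like this would also rewrite `<<` and `>>` in code samples or `,,` in data, so a real implementation would need to skip code spans and similar contexts.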

A notable exception is ”...” where both quotes are curled the same
way as the closing quote in English (used by Swedish, Finnish, and
Dutch (alternative quoting style) [^1]). The other unanswered problem
is that simple angle quotes could be mistaken and replace a *true*
greater than or less than, so it may be better to forget about them.


[^1]:
Disclaimer: This is all according to the table, which may contain
errors. (It doesn't seem completely right to me for French so...)


Michel Fortin
***@michelf.com
http://www.michelf.com/
A. Pagaltzis
2005-10-22 00:11:20 UTC
Post by Michel Fortin
The bugzilla reference from Damian Cugley gave me an idea for
international quotes. Instead of trying to be "smart" about it
and replacing standard English quotes with their localized equivalents,
why not define different shortcuts for different quotes?
,,bottom and left doublequotes''
It’s not a bad idea, and supporting it in *addition* to more
specific support would be great.

But with regard to German, relying on this kind of
transliteration has a problem: no one actually writes like that.
Keyboards have no way to enter bottom doublequotes at all, which
is no wonder considering the common non-Unicode charsets do not
even contain such a character. So the end result is that everyone
writes German with straight doublequotes for quotations.

But since English-style quotes are incorrect in German, there is
no reason for properly typeset German to contain them; it would
make sense for anything that typographically smartens text to
assume that in German text, straight quotes are supposed to become
angle marks.

So I think it would be nice to be able to tell SmartyPants which
language a document is in, and have it adjust some defaults
accordingly, when it makes sense to do so. Having explicit ASCII
transliterations for the various quote marks is still useful, so
that when you’re mixing languages in a single document, you could
use the particular kind of quotes you want explicitly.
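One way to picture per-document defaults is a lookup table keyed by language, sketched here in Python. SmartyPants has no such option in this discussion; the `lang` parameter, the `QUOTE_STYLES` table, and the function name `curl_double_quotes` are all hypothetical, and the open/close guess is deliberately crude.

```python
import re

# Opening and closing marks per document language. Only two entries are
# shown; "de" uses the »...« style described earlier in the thread.
QUOTE_STYLES = {
    "en": ("\u201c", "\u201d"),  # English-style double curly quotes
    "de": ("\u00bb", "\u00ab"),  # German-style angle marks
}


def curl_double_quotes(text: str, lang: str = "en") -> str:
    """Convert straight double quotes using one fixed style for the whole document."""
    opening, closing = QUOTE_STYLES.get(lang, QUOTE_STYLES["en"])
    # A quote at the start of the text, or after whitespace or an opening
    # bracket, and followed by a non-space character, is treated as opening.
    text = re.sub(r'(?<![^\s(\[{])"(?=\S)', opening, text)
    # Every remaining straight double quote is treated as closing.
    return text.replace('"', closing)


print(curl_double_quotes('Er sagte: "Guten Tag"', lang="de"))
print(curl_double_quotes('He said "hello"'))
```

The language is chosen once per call and never changes mid-text, and unknown languages fall back to the English style.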

So far, I don’t see this as trying to be particularly clever;
there would be one set of defaults per document. What would
certainly be folly is for SmartyPants to try to support
mixed-language documents by changing defaults during processing.
That is, as far as I can tell, why the Mozilla folk are having so
much sorrow over bug #16206.