Discussion:
typography of the apostrophe (was Re: Significant whitespace (was Re: Blogging sucks))
A. Pagaltzis
2005-10-19 00:11:01 UTC
[ Note: if you're replying to this message, please mind the To:
and Cc: -- this is crossposted to the Hates-Software and
Markdown lists. ]
If that can handle things like: "Foto's en agenda's"
properly, then it's smarter than Word. That source of hate
turns those apostrophes into single-quotes, as if "s en
agenda" is a paraphrase.
With
Foto's en agenda's
it produces
Foto’s en agenda’s
so it passes that test. In fact, with
'Foto's en agenda's'
I get
‘Foto’s en agenda’s’
so yeah, it’s smarter than Word.
This is a huge HUGE hate, stemming from the fact that I'm (a)
involved with two alumni boards and (b) extremely pedantic.
With the awesomeness that is Word's smartquote "feature," a
Steve B. Alum `85
Alison M. Bee '91
Jimmy Cerebus `82
Maude Dechere '63
etc.
I'm not that angry that it doesn't know that those are class
years and not numbers that might happen to be in quotes,
because I can see it being non-trivial to take all cases into
account.
Yeah. SmartyPants documents it as a known (and unfixable) bug
that *all* of these will turn into opening curly quotes. (Which
is because it does not try to pair up quotes, unlike Word, but
rather uses the context to guess what kind of quote it is, which
is why it almost always guesses right, unlike Word. With this
case being the inevitable exception.)
However, in standard U.S. typography, the single open quote is
so rarely used (only quotes within quotes) that that particular
"smartness" should be off by default. Even at the beginning of
a word, it's still more likely to be an apostrophe signifying
elision. `Tis a hateful, hateful thing, I tells ya.
Sounds like a fair point.
On web sites where I have the capability I swap ' with ’
on the fly because it's pretty and looks better with a
proportional font. If I were to ever need a single open quote,
I would input ‘ by hand.
Now that I think about it, one could consider extending the
hate to proportional typefaces even having a straight quote
mark glyph. If only browsers had been built around LaTeX.
Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
Brian Forte
2005-10-19 01:18:49 UTC
Post by A. Pagaltzis
However, in standard U.S. typography, the single open quote is so
rarely used (only quotes within quotes) that that particular
"smartness" should be off by default. Even at the beginning of
a word, it's still more likely to be an apostrophe signifying
elision. `Tis a hateful, hateful thing, I tells ya.
Sounds like a fair point.
Except in standard Commonwealth typography the single open quote is
common as muck.

In the US, most style guides stipulate the double-quote (") character
for direct speech and the single-quote character for indirect speech
(ie a speaking character quoting another person to a third-party). In
the UK and Australia, the reverse is true.

I'd be a touch peeved if Markdown's algorithms were changed to
convenience US typographic habit at the expense of mine (and many
others).

Just a quick note from the Antipodes.

Regards,

Brian Forte.
--
Brian Forte, <mailto:***@betweenborders.com>
Writer, editor, scripter, dangerous mind.
A. Pagaltzis
2005-10-19 02:56:23 UTC
Post by Brian Forte
In the US, most style guides stipulate the double-quote (")
character for direct speech and the single-quote character for
indirect speech (ie a speaking character quoting another person
to a third-party). In the UK and Australia, the reverse is
true.
Ah; well, SmartyPants (not Markdown) already deals very badly
(ie not at all) with non-English typographic conventions like the
German or French styles.

I *think* John said that was one of the things he wanted to
support at one point; the mechanism which will inevitably be
necessary to support that could well be used to make a US vs
UK/Aus distinction as well.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
John Gruber
2005-10-19 03:20:33 UTC
Post by Brian Forte
Post by A. Pagaltzis
However, in standard U.S. typography, the single open quote is so
rarely used (only quotes within quotes) that that particular
"smartness" should be off by default. Even at the beginning of
a word, it's still more likely to be an apostrophe signifying
elision. `Tis a hateful, hateful thing, I tells ya.
Sounds like a fair point.
Except in standard Commonwealth typography the single open quote is
common as muck.
It struck me as well that this fellow seemed to recognize that this
would only apply to "standard U.S. typography", but thought the
software should change as such anyway.

And I'd even argue with *that* point. I think there are more cases
where single open quotes are used in the U.S. than there are cases
where an apostrophe is used at the start of a word to indicate
elision.

Some of the common cases like 'tis and 'em could be handled by a
hard-coded list of special cases (which could be localized for other
languages). SmartyPants doesn't do this, but it could. Joe Clark
told me that Matt Mullenweg's quote smartener (built into
WordPress?) has such a list of English terms.

The decades thing is another issue. It's probably the case that
something that matches:

'\d\d\b

e.g.:

'86
'73

is more likely to be an elided decade than the start of a
single-quoted string. I.e. I suspect SmartyPants would curl fewer
apostrophes the wrong way if it started looking for this pattern and
assuming they were years. You'd then get mistakes with something
like this:

'99 Bottles of Beer on the Wall'

The advantage to the current implementation, however, is that
SmartyPants always does the right thing if you avoid this style of
year abbreviation. Just write out the 4-digit year and SmartyPants
will do the right thing.

The people who use this two-year style for years the most are,
typically, in my experience, in higher education. Alumni magazines
use this style to say what year so-and-so graduated, because
otherwise spelling out the 4-digit year would take up a measurable
amount of space and might seem unnecessary.

SmartyPants is already capable of doing the right thing with

the '80s

because the digit-digit-s is a dead giveaway as to the writer's
intent.
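
The two-digit-year heuristic discussed above can be sketched in a few lines. This is only an illustration in Python, not SmartyPants code; the name `curl_elided_years` and the exact pattern are assumptions, built on the `'\d\d\b` idea plus the digit-digit-s case.

```python
import re

# Hypothetical helper, not part of SmartyPants: treat an apostrophe that is
# followed by exactly two digits (optionally followed by "s") as an elided
# year and replace it with a right single quotation mark (U+2019).
ELIDED_YEAR = re.compile(r"'(?=\d\d(?:s\b|\b))")

def curl_elided_years(text: str) -> str:
    return ELIDED_YEAR.sub("\u2019", text)

print(curl_elided_years("Maude Dechere '63"))
print(curl_elided_years("the '80s"))
# The known failure mode: the opening quote of a single-quoted title that
# happens to start with two digits is also treated as an apostrophe.
print(curl_elided_years("'99 Bottles of Beer on the Wall'"))
```

The last call shows the trade-off described above: the rule fixes class years at the cost of mis-curling a single-quoted string that starts with a two-digit number.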
Post by Brian Forte
I'd be a touch peeved if Markdown's algorithms were changed to
convenience US typographic habit at the expense of mine (and many
others).
I wouldn't worry if I were you.

-J.G.
Aaron Swartz
2005-10-19 03:30:14 UTC
Post by John Gruber
because the digit-digit-s is a dead giveaway as to the writer's
intent.
What about that disco classic '70s Bottles of Beer On The Wall'?
Michel Fortin
2005-10-19 13:34:17 UTC
Post by John Gruber
Some of the common cases like 'tis and 'em could be handled by a
hard-coded list of special cases (which could be localized for other
languages). SmartyPants doesn't do this, but it could. Joe Clark
told me that Matt Mullenweg's quote smartener (built into
WordPress?) has such a list of English terms.
Indeed, this is from the inside of wptexturize function in WordPress:

// This is a hack, look at this more later. It works pretty well though.

// Words that start with an elision apostrophe rather than an opening quote.
$cockney = array("'tain't","'twere","'twas","'tis","'twill",
"'til","'bout","'nuff","'round","'cause");

// The same words with each apostrophe as a right single quote entity.
$cockneyreplace = array("&#8217;tain&#8217;t","&#8217;twere",
"&#8217;twas","&#8217;tis","&#8217;twill","&#8217;til",
"&#8217;bout","&#8217;nuff","&#8217;round","&#8217;cause");

$curl = str_replace($cockney, $cockneyreplace, $curl);
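
For comparison, the same special-case idea as a minimal Python sketch. The word list is copied from the snippet above; the function name `curl_elisions` is made up for illustration and this is not WordPress's implementation.

```python
# Words that begin with an elision apostrophe; the list mirrors the
# WordPress snippet quoted above.
ELISIONS = ["'tain't", "'twere", "'twas", "'tis", "'twill",
            "'til", "'bout", "'nuff", "'round", "'cause"]

def curl_elisions(text: str) -> str:
    # Plain, case-sensitive substring replacement applied in list order.
    for word in ELISIONS:
        text = text.replace(word, word.replace("'", "\u2019"))
    return text

print(curl_elisions("'tis but a scratch"))
print(curl_elisions("'tain't so"))
```

Because this is plain substring matching, capitalized forms such as `'Tis` are left alone, and a single-quoted phrase that merely starts with one of these letter sequences would be curled the wrong way. A localized list per language would have the same limits.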
Post by John Gruber
Ah; well, SmartyPants (not Markdown) already deals very badly
(ie not at all) with non-English typographic conventions like the
German or French styles.
Be sure to read this [Wikipedia article about quotation marks][1]. I
would never have thought there were so many quoting styles around the
world. Do we want SmartyPants to handle all that? And in what way?

[1]: http://en.wikipedia.org/wiki/Quotation_mark

- - -

I'm not sure what you mean when you say it adapts badly to French
typography. In French we quote text « this way » and sometimes we use
English-style double quotes. Are you suggesting SmartyPants should
replace "double quotes" with « angle quotes » inside French text? The
French version of Microsoft Word does this and I always hated it.
It's a pain when you want to mix languages. Not only that, but
English-style double quotes are often used for inner quotations in
French, which is pretty hard to do with Word.

I can say that on my Canadian multilingual Apple keyboard it's quite
easy to type angle quotes `«»` using opt-z and opt-x, and I can type
double quotes `"` with shift-period. I don't know much about other
keyboard layouts however.

If SmartyPants was to do something good for the French language, I
think it could replace <<this approximation of french quotes>> with
«something better». (I saw that in the comments on my weblog just
yesterday. People write this when they don't know how to type the
right characters.)

- - -

What could be done for French (and, to some extent, for other
languages) is a smart way to replace spaces with unbreakable spaces
where appropriate. French typography conventions usually include an
unbreakable space before the colon and on the inside of French quotes:

Jean a dit : « Il fait beau aujourd'hui ! »

Some people also put a space before a question mark, an exclamation
mark (as illustrated above), and a semicolon. The space is also
used as a thousands separator in numerals:

1 436,12 - 1 000 = 436,12

The problem is that many people don't know what an unbreakable
space is and write normal, breakable spaces at each of these
places. I see that everywhere on French websites and it gives ugly
text wrapping.

I made a PHP script for the French part of my website that deals with
that. By default it doesn't add any space, it only *replaces* normal
spaces with unbreakable spaces where appropriate. This way it
shouldn't cause any harm to English or (hopefully) other languages.
You can also set it to impose a space or no space before and after
such marks.

The script can also replace spaces at other predefined places, let's
say "p. 12", "10 Kb", etc. Obviously, this list can be useful in any
language, and should be localizable.
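
A rough sketch of the replace-only behaviour described above, in Python. It is not Michel's unpublished PHP script; the function name `french_nbsp` and the specific rules are assumptions based on the examples in this message.

```python
import re

NBSP = "\u00a0"  # no-break space

def french_nbsp(text: str) -> str:
    # Only existing spaces are replaced; no space is ever added.
    # Space before : ; ! ? and before a closing guillemet.
    text = re.sub(r" (?=[:;!?\u00bb])", NBSP, text)
    # Space after an opening guillemet.
    text = re.sub(r"(?<=\u00ab) ", NBSP, text)
    # Space used as a thousands separator between digit groups.
    text = re.sub(r"(?<=\d) (?=\d{3}(?!\d))", NBSP, text)
    # One predefined abbreviation as an example of a localizable list: "p. 12".
    text = re.sub(r"\b(p\.) (?=\d)", r"\1" + NBSP, text)
    return text

print(french_nbsp("Jean a dit : « Il fait beau aujourd'hui ! »"))
print(french_nbsp("1 436,12 - 1 000 = 436,12"))
```

English text with no space before its punctuation passes through unchanged. The thousands rule is the weakest part of the sketch: any digit followed by a space and a separate three-digit number would also be glued together.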

I haven't published my script yet, but since we are at it, I'm open
to the idea that it could be merged with SmartyPants, if there is some
interest and John thinks it is appropriate. Otherwise I'll just
release it as a separate project, as planned initially.


Michel Fortin
***@michelf.com
http://www.michelf.com/
Choan C. Gálvez
2005-10-19 14:14:41 UTC
Post by Michel Fortin
[...]
I can say that on my Canadian multilingual Apple keyboard it's quite
easy to type angle quotes `«»` using opt-z and opt-x, and I can type
double quotes `"` with shift-period. I don't know much about other
keyboard layouts however.
For Windows users: you can personalize your keyboard layout using
[Microsoft Keyboard Layout Creator][1] (gratis). I've done it and now I
can type «», “”, etc. quite happily.

(There's an [article at my weblog][2] (in Spanish) describing the process.)

[1]: http://www.microsoft.com/downloads/details.aspx?FamilyId=FB7B3DCD-D4C1-4943-9C74-D8DF57EF19D7&displaylang=en
[2]: http://dizque.lacalabaza.net/sotanos/2005/06/personalizar-el-teclado/
Post by Michel Fortin
If SmartyPants was to do something good for the French language, I
think it could replace <<this approximation of french quotes>> with
«something better». (I saw that in the comments on my weblog just
yesterday. People write this when they don't know how to type the right
characters.)
It'd be nice. Spanish uses these guillemet quotes too, although
everybody seems to ignore that.

Choan
***@alice.0z0ne.com
http://dizque.lacalabaza.net/
A. Pagaltzis
2005-10-19 22:11:00 UTC
Post by Michel Fortin
In French we quote text « this way » and sometimes we use
English-style double quotes. Are you suggesting SmartyPants
should replace “double quotes” with « angle quotes » inside
French text? The French version of Microsoft Word does this
and I always hated it. It’s a pain when you want to mix
languages. Not only that, but English-style double quotes are
often used for inner quotations in French, pretty hard to do
with Word.
Ah, hmm.

In German, there are two interchangeable rules: they’re either

„bottom and left doublequotes“

which is problematic, because in many fonts the left doublequote
is angled in a way that it only looks right as an English opening
double quote. See Verdana for an example.

Or, the other option:

»angle marks, but opposite to the French style«

(and no spaces either), and if you want to nest quotations, you
use

›single angle marks for the inner ones‹

This is preferred on the web, due to the aforementioned problem
with doublequotes in many fonts. (Whereas in print, using
doublequotes is customary, since the typesetter has complete
control over which font is used.)

In any case, English-style top-doublequotes are never used, so
in German mode, SmartyPants *should* translate straight
doublequotes to angle marks for German text, as opposed to French
mode.
Post by Michel Fortin
If SmartyPants was to do something good for the French
language, I think it could replace <<this approximation of
french quotes>> with «something better». (I saw that in the
comments on my weblog just yesterday. People write this when
they don’t know how to type the right characters.)
And likewise for >>this ASCII transliteration<<. That might be a
good idea regardless of mode.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
Michel Fortin
2005-10-20 23:58:07 UTC
Post by A. Pagaltzis
In German, there are two interchangeable rules: they’re either
„bottom and left doublequotes“
»angle marks, but opposite to the French style«
›single angle marks for the inner ones‹
The bugzilla reference from Damian Cugley gave me an idea for
international quotes. Instead of trying to be "smart" about it and
replace standard English by their localized equivalents, why not
define different shortcuts for different quotes. For example, German
quotes could be written like this:

,,bottom and left doublequotes''

This would only require SmartyPants to recognize the `,,` construct
as a lower double quote. As suggested previously, angle marks could
be written like this:

>>angle marks, but opposite to the French style<<
>angle marks, but opposite to the French style<
Single angle marks may be a problem though, but I'll continue my idea...

The same thing could be done for French:

<<angle marks>>
<< angle marks with a space inside >>

and be turned into this:

«angle marks»
« angle marks with a space inside »

From the table on the Quotation mark Wikipedia entry, I see that
some languages (Estonian, Icelandic) use the same double quotes as
German, but the closing quote is curled in the opposite direction.
Quotes in those languages could be written this way:

,,bottom and left doublequotes``

and turned into this:

„bottom and left doublequotes“

With these rules, you could write in "plain ascii" most of the quotes
of the [Quotation mark table][1].

[1]: http://en.wikipedia.org/wiki/Quotation_mark#Table

A notable exception is ”...” where both quotes are curled the same
way as the closing quote in English (used by Swedish, Finnish, and
Dutch (alternative quoting style) [^1]). The other unanswered problem
is that simple angle quotes could be mistaken and replace a *true*
greater than or less than, so it may be better to forget about them.


[^1]:
Disclaimer: This is all according to the table, which may contain
errors. (It doesn't seem completely right to me for French so...)


Michel Fortin
***@michelf.com
http://www.michelf.com/
A. Pagaltzis
2005-10-22 00:11:20 UTC
Post by Michel Fortin
The bugzilla reference from Damian Cugley gave me an idea for
international quotes. Instead of trying to be "smart" about it
and replace standard English by their localized equivalents,
why not define different shortcuts for different quotes.
,,bottom and left doublequotes''
It’s not a bad idea, and supporting it in *addition* to more
specific support would be great.

But with regard to German, relying on this kind of
transliteration has a problem: no one actually writes like that.
Keyboards have no way to enter bottom doublequotes at all, which
is no wonder considering the common non-Unicode charsets do not
even contain such a character. So the end result is that everyone
writes German with straight doublequotes for quotations.

But since English-style quotes are incorrect in German, there is
no reason for properly typeset German to contain them; it would
make sense for anything that typographically smartens text to
assume that in German text, straight quotes are supposed to become
angle marks.

So I think it would be nice to be able to tell SmartyPants which
language a document is in, and have it adjust some defaults
accordingly, when it makes sense to do so. Having explicit ASCII
transliterations for the various quote marks is still useful, so
that when you’re mixing languages in a single document, you could
use the particular kind of quotes you want explicitly.

So far, I don’t see this as trying to be particularly clever;
there would be one set of defaults per document. What would
certainly be folly is for SmartyPants to try to support
mixed-language documents by changing defaults during processing.
That is, as far as I can tell, why the Mozilla folk are having so
much sorrow over bug #16206.
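
The "one set of defaults per document" idea could look roughly like this. It is a Python sketch under stated assumptions: the preset names, the `PRESETS` table and `educate_double_quotes` are hypothetical, and the opening/closing guess is a much cruder context rule than the one SmartyPants uses.

```python
import re

# Hypothetical per-document presets: (opening, closing) glyphs that a
# straight double quote should turn into.
PRESETS = {
    "en": ("\u201c", "\u201d"),        # “ ”
    "de": ("\u00bb", "\u00ab"),        # » «  German-style angle marks
    "de-print": ("\u201e", "\u201c"),  # „ “  bottom and left doublequotes
}

def educate_double_quotes(text: str, lang: str = "en") -> str:
    opening, closing = PRESETS[lang]
    # Guess from context, not by pairing: a quote at the start of the text
    # or right after whitespace or an opening bracket is an opening quote.
    text = re.sub(r'(?:^|(?<=[\s(\[{]))"', opening, text)
    # Every remaining straight double quote is taken to be a closing quote.
    return text.replace('"', closing)

print(educate_double_quotes('Er sagte "Hallo" und ging.', "de"))
print(educate_double_quotes('"Hi," she said.'))
```

The language is chosen once per call and never changes mid-text, which matches the point above about not switching defaults during processing.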
Post by Michel Fortin
This would only require SmartyPants to recognize the `,,`
construct as a lower double quote. As suggested previously,
>>angle marks, but opposite to the French style<<
>angle marks, but opposite to the French style<
[…]
The other unanswered problem is that simple angle quotes could
be mistaken and replace a *true* greater than or less than, so
it may be better to forget about them.
Not only that, but at least the right-pointing single angle mark
<em >clashes </em >with HTML tag parsing. Granted, so long as the
angle marks point “inward,” German-style (opening angle mark
points right, closing points left), you can probably disambiguate
95-99% of the cases correctly.

But from where I sit it seems impossible to disambiguate single
angle marks that point “outward” from HTML tags. (F.ex., what
about the letter <p>?)

So I’d say a transliteration for single angle marks is right out of
the window.
Michel Fortin
2005-10-22 22:00:54 UTC
Post by A. Pagaltzis
Keyboards have no way to enter bottom doublequotes at all, which
is no wonder considering the common non-Unicode charsets do not
even contain such a character. So the end result is that everyone
writes German with straight doublequotes for quotations.
This saddens me; I find these German quotes nice.

(I've found out that on my keyboard I can type shift-option-^ to get
bottom double quotes. On the downside it seems I can't type an
opening single quote.)
Post by A. Pagaltzis
Post by Michel Fortin
The other unanswered problem is that simple angle quotes could
be mistaken and replace a *true* greater than or less than, so
it may be better to forget about them.
Not only that, but at least the right-pointing single angle mark
<em >clashes </em >with HTML tag parsing. Granted, so long as the
angle marks point “inward,” German-style (opening angle mark
points right, closing points left), you can probably disambiguate
95-99% of the cases correctly.
I specifically ignored the HTML tag problem because I don't think it
is a problem. SmartyPants tokenizes the tags prior to translating
anything. If it didn't, think about what would happen to tag
attributes :-) Anyway, my point is that SmartyPants doesn't search for
quotes inside tags and wouldn't mistake a tag for a quote.
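
The tokenize-first approach can be illustrated with a small Python sketch. It is a simplification rather than SmartyPants's actual tokenizer: `TAG` and `transform_text_only` are made-up names, and a real tokenizer would also need to handle comments, a `>` inside attribute values, and elements such as `<pre>` or `<code>`.

```python
import re

# Anything that looks like an HTML tag. The capturing group makes re.split
# keep the tags in its result.
TAG = re.compile(r"(<[^<>]*>)")

def transform_text_only(html: str, transform) -> str:
    parts = TAG.split(html)
    # With one capturing group, tags land at the odd indexes and the text
    # between them at the even indexes. Only the text is transformed.
    return "".join(p if i % 2 else transform(p) for i, p in enumerate(parts))

def curl(text: str) -> str:
    # Stand-in transformation: every straight double quote becomes ”.
    return text.replace('"', "\u201d")

print(transform_text_only('<a href="x">"hi"</a>', curl))
```

The quotes around the attribute value survive untouched because they sit inside a tag token, while the quotes in the link text are converted.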

Parsing of `<<` wouldn't be easy either. It would probably be
necessary to convert `&lt;&lt;` and `&gt;&gt;` to quotes, as a tool
like Markdown will automatically escape less-than characters with
entities if they are not a tag. And what about `<<b>>`? Markdown will
convert it to `&lt;<b>>` and you would have to figure out a way to
make a quote out of that. Or is it a tag?

I hadn't realized until now how much complexity angle quotes (single
or double) could bring.
Post by A. Pagaltzis
But from where I sit it seems impossible to disambiguate single
angle marks that point “outward” from HTML tags. (F.ex., what
about the letter <p>?)
This wouldn't work with `<p>`... it could work if you leave spaces as
common French typography suggests (`< p >`), but then how do you know
it isn't part of some mathematical stuff?

1 < p > 2 <=> p > 2
Post by A. Pagaltzis
So I’d say a transliteration for single angle marks is right out of
the window.
One side problem I can think of, and it applies to double angle marks as
much as single angle marks, is that `<<` and `>>` are somewhat common
operators in many programming languages. Just like the
emphasis_by_underline problem with paths and programming symbols in
Markdown, converting `<<` and `>>` could lead to display problems in
comment boxes where sample code is outside a `<code>` tag. This
wouldn't be catastrophic (and certainly a smaller concern than the
underline problem), but it may be a good idea to think of this in
advance.
Post by A. Pagaltzis
Actually you’re mixing up the styles. English would be
[...]
Right, my error. And all that is really missing in the current SmartyPants
is a special rule for `,,` as a lower double quote.
Post by A. Pagaltzis
Post by Michel Fortin
A notable exception is ”...” where both quotes are curled the
same way as the closing quote in English (used by Swedish,
Finnish, and Dutch (alternative quoting style) [^1]).
This could be written in transliteration as
''...''
which results in two right doublequotes.
... and happens to work perfectly already with current SmartyPants. I
somewhat thought SmartyPants tried to be "smart" in this case, but it
does not. Great!

* * *
Post by A. Pagaltzis
I don’t see the harm in having transliterations available. They
would be useful when you really want to do something that the
language settings would not allow. SmartyPants would just be
useless to germanophones if they *had* to use these.
I completely agree with this way of thinking.

To summarize, an upcoming "international" SmartyPants could do this:

| Input | Output                 |
| ----- | ---------------------- |
| "     | “smart” [customizable] |
| '     | ‘smart’ [customizable] |
| ``    | “                      |
| `     | ‘                      |
| ''    | ”                      |
| ,,    | „                      |
| <<    | «                      |
| >>    | »                      |

with some presets for different languages.

Making `<<` work correctly alongside tags could be a problem however.
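
The explicit transliterations in the table (everything except the two "smart" rows) amount to a fixed substitution list. A minimal Python sketch follows, assuming plain text with no HTML tags and no entity escaping; `TRANSLITERATIONS` and `transliterate` are illustrative names, not SmartyPants options.

```python
# Explicit ASCII transliterations from the table above. Order matters:
# the two-character forms must be handled before the single backtick.
TRANSLITERATIONS = [
    ("``", "\u201c"),  # “
    ("''", "\u201d"),  # ”
    (",,", "\u201e"),  # „
    ("<<", "\u00ab"),  # «
    (">>", "\u00bb"),  # »
    ("`", "\u2018"),   # ‘
]

def transliterate(text: str) -> str:
    for ascii_form, glyph in TRANSLITERATIONS:
        text = text.replace(ascii_form, glyph)
    return text

print(transliterate(",,bottom and left doublequotes``"))
print(transliterate("<<angle marks>>"))
```

Since the sketch knows nothing about tags or code, it would also rewrite shift operators such as `a << 2`, which is exactly the concern raised in this message.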


Michel Fortin
***@michelf.com
http://www.michelf.com/

Jelks Cabaniss
2005-10-20 04:09:55 UTC
Post by Aaron Swartz
Post by John Gruber
because the digit-digit-s is a dead giveaway as to the writer's
intent.
What about that disco classic '70s Bottles of Beer On The Wall'?
Disco? What about "rock 'n roll"?


/Jelks
Damian Cugley
2005-10-20 17:43:03 UTC
Post by Jelks Cabaniss
Disco? What about "rock 'n roll"?
Worse—it should have two apostrophes, as in 'rock 'n' roll' ('rock 'n'
roll'). Almost any quotation guessification system is going to have
trouble getting that right.

Trying to generate correct quotation marks mechanically is always
going to be a lot of trouble. It is a little bit crazy that computer
keyboards still have weird symbols like § ± ` ~ on them but not the
marks of quotation. I have seen keyboards with keys for &#x2018; and
&#x2019; (turned comma and apostrophe), but these belonged to a specialized
typesetting system. French keyboards do not match QWERTY keyboards, so
it seems crazy to me that they don't have dedicated « and » keys.

The convention used in TeX and Unix of letting ` stand in for an
opening quotation mark had a certain amount to be said for it, and if
the standards committees had spent less time castigating people for
conflating U+0060 and U+2018 and instead ratified a variant of ISO
8859-1 that substituted U+2018 and U+2019 in positions 0x60 and 0x27
then the world would be a better place. They could add œ and Œ
ligatures while they were at it, so that it would be possible to use
this charset to write French, a language strangely undersupported by
ISO.

Anyway, none of the above is germane to Smartypants. My
recommendation--for what it's worth--is that Smartypants should stick
to the fairly simple and comprehensible system it is at present. The
exceptions are rare enough that leaving authors to write &#x2018;n' on
occasion should be acceptable. Trying to add language-specific
cleverness is opening a can of worms; see the attempts to get HTML
4's rash promises of language-sensitive quotation marks in Mozilla for
an example <https://bugzilla.mozilla.org/show_bug.cgi?id=16206>

--
Damian Cugley, Alleged Literature
http://www.alleged.org
Michel Fortin
2005-10-20 19:25:03 UTC
Post by Damian Cugley
Anyway, none of the above is germane to Smartypants. My
recommendation--for what it's worth--is that Smartypants should
stick to the fairly simple and comprehensible system it is at
present. The exceptions are rare enough that leaving authors to
write &#x2018;n' on occasion should be acceptable. Trying to add
language-specific cleverness is opening a can of worms; see the
attempts to get HTML 4's rash promises of language-sensitive
quotation marks in Mozilla for an example
<https://bugzilla.mozilla.org/show_bug.cgi?id=16206>
I agree with what you say. I'll just add that I almost never write
entities nowadays; I directly type the characters that I want. With
Unicode on my website, I see little reason not to.

Your bugzilla reference gives me an idea: maybe Markdown could create
`<q>` tags on the fly when it encounters quotes... Ok, just kidding!


Michel Fortin
***@michelf.com
http://www.michelf.com/
Jelks Cabaniss
2005-10-20 19:24:13 UTC
Post by Damian Cugley
Post by Jelks Cabaniss
Disco? What about "rock 'n roll"?
Worse—it should have two apostrophes, as in 'rock 'n' roll' ('rock 'n'
roll').
Thank you. (How embarrassing!)
Post by Damian Cugley
Almost any quotation guessification system is going to have
trouble getting that right.
You could $cockney-fy "rock 'n' roll" along with "'tis" and the others.
(You have to wonder how Shakespeare would feel about his "Whether 'tis
nobler in the mind ..." being referred to as "cockney". Slings and arrows
indeed.)

Aaron's example, OTOH, would slip through everything.
Post by Damian Cugley
Trying to generate correct quotation marks mechanically is always
going to be a lot of trouble.
Having written a "Smart Quotes" clip for my text editor -- long before I'd
heard of Smartypants -- I agree. But I've found that mechanical generation
+ a "$cockney" override handles 99%+ of the bulk of texts I've converted.
Post by Damian Cugley
It is a little bit crazy that computer
keyboards still have weird symbols like § ± ` ~ on them but not the
marks of quotation.
Or em/en-dashes, or copyright/trademark/registered symbols, or [whatever
your local use cases]. (My U.S. keyboard does not have § or ± on it, BTW
and FWIW.) There are only so many keys. Besides the usual "chorded"
(ALT/Option) input methods and the tortuous "Insert Symbol" methods, most
OSes do have something along these lines available:

http://tinyurl.com/4fymo (Microsoft)


/Jelks
Jon Noring
2005-10-20 20:06:47 UTC
Post by Jelks Cabaniss
Worse -- it should have two apostrophes, as in 'rock 'n' roll'
Thank you. (How embarrassing!)
Both the single and double quote-like characters are used for a
variety of purposes, and not only for quotations and grammatical
apostrophes.

There's also the "prime" and "double prime" as used in mathematics
(differential, for example), the "foot and inch marks" (like 6'2"),
angle of arc (5 deg. 25' 33"), as a non-breaking modifier (like
"Ja'afar" from the 1001 Arabian Nights -- note, this is NOT an
apostrophe!)

Of course, the recommended approach is to use the appropriate
typographic characters from Unicode -- then all is solved. All the
above-mentioned uses of quote-like marks (including the typographic
left and right "curly" quotes to define a quotation) have their own
Unicode characters. There will be no ambiguities and no need to do
weird gyrations to differentiate the several uses of quote-like marks.
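
For reference, a short Python listing of some of the dedicated characters in question. Which code point fits which use is an assumption made here for illustration (in particular U+02BC for the "Ja'afar" case); the thread itself names no code points for these.

```python
import unicodedata

# Candidate code points for the uses listed above. The mapping from each
# use to a code point is an assumption, not something stated in the thread.
MARKS = {
    "opening single quote": "\u2018",
    "closing single quote / apostrophe": "\u2019",
    "opening double quote": "\u201c",
    "closing double quote": "\u201d",
    "prime (feet, arcminutes)": "\u2032",
    "double prime (inches, arcseconds)": "\u2033",
    "modifier letter apostrophe": "\u02bc",
}

# Print each code point with its official Unicode character name.
for use, char in MARKS.items():
    print(f"U+{ord(char):04X}  {unicodedata.name(char):<28}  {use}")
```

With these in the source text, a height could be written with U+2032 and U+2033 instead of the straight `'` and `"`, and no guessing is needed.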

Of course, if ASCII-conformance is required, where one can only use
the keyboard " and ' characters in a "master" document, then games
have to be played to differentiate the " and ' used as quotation marks
from the other uses of these characters.

Jon Noring
Jelks Cabaniss
2005-10-20 20:59:00 UTC
Post by Jon Noring
There's also the "prime" and "double prime" as used in mathematics
(differential, for example), the "foot and inch marks" (like 6'2"),
angle of arc (5 deg. 25' 33"), as a non-breaking modifier (like
"Ja'afar" from the 1001 Arabian Nights -- note, this is NOT an
apostrophe!)
Indeed. That echoes what Damian was saying -- "smart quoting" can only go
so far (precluding AI :). I've run up against the foot and inch marks issue
myself before.

But even with all these counter-examples, it's amazing how much common
everyday text is converted perfectly.
Post by Jon Noring
Of course, if ASCII-conformance is required, where one can only use
the keyboard " and ' characters in a "master" document, then games
have to be played to differentiate the " and ' used as quotation marks
from the other uses of these characters.
Or *selectively* smart-quote.


/Jelks
François Granger
2005-10-21 16:31:34 UTC
Post by Michel Fortin
Disclaimer: This is all according to the table, which may contain
errors. (It doesn't seem completely right to me for French so...)
The French is OK, except for one detail: the note says "1/4-em / non-break". This
is not exactly the rule. The space is mandatory, and it is a thin space (une
espace fine). The size of this space varies according to the font used but is
usually 1/3 of the "cadratin", which may be bigger than the em space.
--
I support Médicalistes: http://www.medicalistes.org/divers/adherer.php
John Gruber
2005-10-21 18:08:50 UTC
Post by Michel Fortin
I'm not sure what you mean when you say it adapts badly to French
typography. In French we quote text « this way » and sometimes we use
English-style double quotes. Are you suggesting SmartyPants should
replace "double quotes" with « angle quotes » inside French text?
I have gotten a significant number of requests for this, actually.
Post by Michel Fortin
The French version of Microsoft Word does this and I always hated it.
It's a pain when you want to mix languages. Not only that, but
English-style double quotes are often used for inner quotations in
French, pretty hard to do with Word.
Any such support in SmartyPants would be an option, not a default.
Post by Michel Fortin
If SmartyPants was to do something good for the French language, I
think it could replace <<this approximation of french quotes>> with
«something better».
That's another popular request from writers using languages that use
guillemets.
Post by Michel Fortin
I made a PHP script for the French part of my website that deals with
that. By default it doesn't add any space, it only *replaces* normal
spaces with unbreakable spaces where appropriate. This way it
shouldn't cause any harm to English or (hopefully) other languages.
You can also set it to impose a space or no space before and after
such marks.
The script can also replace spaces at other predefined places, let's
say "p. 12", "10 Kb", etc. Obviously, this list can be useful in any
language, and should be localizable.
I haven't published my script yet, but since we are at it, I'm open
to the idea it could be merged with SmartyPants, if there is some
interest and John thinks it is appropriate. Otherwise I'll just
release it as a separate project, as planned initially.
Could be a good addition. More or less fits as a SmartyPants feature
if you think of SmartyPants as a system for controlling all things
related to web typography and layout, not just punctuation.

-J.G.