Discussion:
MultiMarkdown and MathML - new feature and request for help
Fletcher T.Penney
2006-06-10 21:02:22 UTC
A lot of people have expressed interest in combining math features
with Markdown, but I am not aware of any real developments from these
requests.

I was looking around and toying with
[ASCIIMathPHP](http://www.jcphysics.com/ASCIIMath/) and integrated it
with MultiMarkdown and my xhtml2latex XSLT transforms.

You can include math as an inline formula by using a markup similar
to inline code, such as ``x^2 + y^2 = 1`` (note the double ``).

You can include a formula as a separate paragraph in the same way, or
with a leading tab like:

`x_(1,2) = (-b+-sqrt(b^2-4ac))/(2a)`

(Note the single use of ` when prefaced by a tab)

The leading tab is not required, but is allowed as I suspect most
people would like to be able to indent the formula to distinguish it
from regular text, and don't want it interpreted as a code block.


The processing occurs in several steps:

1) ASCIIMathPHP is run on the source markdown document converting the
formulas into MathML blocks.

2) MultiMarkdown is then run in the usual manner, with or without
SmartyPants

3) You can then optionally use XSLT to transform the XHTML into LaTeX
or whatever by using an updated version of my xhtml2latex, or your
own XSLT files. This includes the use of
[XSLT MathML Library](http://xsltml.sourceforge.net/) to convert the
MathML into LaTeX.

This allows you to generate XHTML with embedded MathML, or to
generate LaTeX source with the math properly displayed. More
importantly, it allows you to use ASCIIMath to enter your formulas
which is MUCH more human readable than either MathML or LaTeX.
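
To make step 1 concrete, here is a minimal sketch in Python rather than PHP. It only handles the inline double-backtick form, and `preprocess_math`, `to_mathml` and `placeholder_converter` are hypothetical names standing in for the real ASCIIMathPHP call, which is not shown here.

```python
import re

# Inline math as described above: a formula wrapped in double backticks.
INLINE_MATH = re.compile(r"``(.+?)``")


def preprocess_math(source, to_mathml):
    """Replace each ``formula`` span with whatever to_mathml returns.

    to_mathml is a hypothetical stand-in for the ASCIIMath converter.
    """
    return INLINE_MATH.sub(lambda m: to_mathml(m.group(1)), source)


def placeholder_converter(formula):
    # Placeholder only: a real converter would emit MathML markup here.
    return "[MathML for: " + formula + "]"


example = preprocess_math("A circle: ``x^2 + y^2 = 1``.", placeholder_converter)
```

The output of this step would then be handed to MultiMarkdown (step 2) unchanged.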



THE PROBLEM

I have run into a couple of snags, that I am sure would be quite
simple to fix if I knew more about XHTML and XSLT. To demonstrate, I
have included a sample plain text file (.txt), as well as two
versions of XHTML output - (.xhtml and .html) and a LaTeX file
(.tex). These were all generated automatically from the source plain
text file.


The .xhtml setup was designed using the layout from
http://www.mozilla.org/projects/mathml/authoring-example.xhtml

It renders properly in Firefox (with the exception of that ugliness
in square root signs, but that is not my doing...), but I cannot use
this to go through my xhtml2latex.xslt workflow. It appears that the
problem lies in the xmlns attribute:

<html xmlns="http://www.w3.org/1999/xhtml">

If I remove that attribute, the xsltproc stuff works fine.


The .html file is built using the usual DOCTYPE etc from
MultiMarkdown. It does not render properly in Firefox, but it DOES
go through my xhtml2latex.xslt files properly, to generate valid
LaTeX output (i.e. the .tex file).

Also, if I take the .xhtml file and rename it with a .html extension,
it no longer renders properly in Firefox.


Once you have the .tex file, it goes without a snag through pdflatex
to generate a pdf.



REQUEST FOR HELP!!!

I would appreciate any input available in how to smooth this process
a bit, specifically:

1) How can I create a valid document with a .html extension?
(requiring .xhtml is going to break a bunch of other stuff)

2) How do I fix my xhtml2latex stylesheets
(http://fletcher.freeshell.org/wiki/Markdown_and_XML) to work with a file
that has the xmlns attribute applied to the html node? (Or with
whatever comes out of an answer to #1 above)

3) Any suggestions on the markup syntax? I sort of arbitrarily
chose the use of an extra `. I am sure there is a better way of
doing this, and would love to hear input.

4) And less importantly, is there a perl version of ASCIIMath out
there somewhere? It would be great to be able to combine the code,
but this can be worked around. Definitely lower on the priority list.


Once I get all of this working more smoothly, I will release the
updated software for public use. For now, you can see how it works
with sample documents


Thanks in advance!

Fletcher


PS> Yes, I realize that the included mathematical equations are not
all correct. It's just a demo...
--
Fletcher T. Penney
***@alumni.duke.edu

I'll go through life either first class or third,
but never in second.
- Noel Coward
Michel Fortin
2006-06-10 23:18:04 UTC
Post by Fletcher T.Penney
1) How can I create a valid document with a .html extension?
(requiring .xhtml is going to break a bunch of other stuff)
MathML (or any other XML language for that matter) is not recognized
when the browser parses the file with the HTML parser. So for MathML
to work, you need ".xhtml" or an "application/xhtml+xml" MIME type.
And yes, unfortunately, that can break other stuff.
Post by Fletcher T.Penney
2) How do I fix my xhtml2latex stylesheets (http://
fletcher.freeshell.org/wiki/Markdown_and_XML) to work with a file
that has the xmlns attribute applied to the html node? (Or with
whatever comes out of an answer to #1 above)
Instead, try to answer this question: how to specify an XML namespace
from the stylesheet?
Post by Fletcher T.Penney
3) Any suggestions on the markup syntax? I sort of arbitrarily
chose the use of an extra `. I am sure there is a better way of
doing this, and would love to hear input.
Well, if you want to be sure Markdown doesn't change the content of
your expression, you could wrap it into a processing instruction tag:

<?ascii2math x_(1,2) = (-b+-sqrt(b^2-4ac))/(2a) ?>

This is less pretty, but has less chance to break too. With your
syntax, what happens if you have some `<` in a formula? With the one
above, only `?>` would be problematic, but you shouldn't see that too
often in maths.
Post by Fletcher T.Penney
4) And less importantly, is there a perl version of ASCIIMath out
there somewhere? It would be great to be able to combine the code,
but this can be worked around. Definitely lower on the priority list.
I'm not aware of any.


Michel Fortin
***@michelf.com
http://www.michelf.com/
Fletcher T. Penney
2006-06-10 23:53:18 UTC
Post by Michel Fortin
Post by Fletcher T.Penney
1) How can I create a valid document with a .html extension?
(requiring .xhtml is going to break a bunch of other stuff)
MathML (or any other XML language for that matter) is not
recognized when the browser parse the file with the HTML parser. So
for MathML to work, you need ".xhtml" or a "application/xhtml+xml"
MIME type. And yes, unfortunately, that can break other stuff.
Placing

<meta http-equiv="content-type" content="application/xhtml+xml" />

in the .html file doesn't work (I had already tried...) Is there
another way around this, or must the extension be .xhtml? Surely
something as simple as a filename extension is not this important on
the web in 2006...
Post by Michel Fortin
Post by Fletcher T.Penney
2) How do I fix my xhtml2latex stylesheets (http://
fletcher.freeshell.org/wiki/Markdown_and_XML) to work with a file
that has the xmlns attribute applied to the html node? (Or with
whatever comes out of an answer to #1 above)
Instead, try to answer this question: how to specify an XML
namespace from the stylesheet?
Ok, I'll bite... So how does one do this?
Post by Michel Fortin
Post by Fletcher T.Penney
3) Any suggestions on the markup syntax? I sort of arbitrarily
chose the use of an extra `. I am sure there is a better way of
doing this, and would love to hear input.
Well, if you want to be sure Markdown doesn't change the content of
<?ascii2math x_(1,2) = (-b+-sqrt(b^2-4ac))/(2a) ?>
This is less pretty, but has less chance to break too. With your
syntax, what happens if you have some `<` in a formula? With the
one above, only `?>` would be problematic, but you shouldn't see
that too often in maths.
Definitely less pretty. I have no desire to require something like
"<?ascii2math ... ?>". This is clearly not what Gruber intended
when he wrote, "without looking like it’s been marked up with tags or
formatting instructions." And this is exactly what I have tried to
avoid with MultiMarkdown-specific markup. I don't mind an
unobtrusive quotation mark or two, but want to avoid things that look
remotely like "programming"...

As for the "<" issue, I don't follow. There is no use of "<" or ">"
in my current ascii2math syntax, and those symbols work just fine.


Fletcher
--
Fletcher T. Penney
***@alumni.duke.edu

I have noted that persons with bad judgment are most insistent that
we do what they think best.
- Lionel Abel
Michel Fortin
2006-06-12 02:55:13 UTC
Post by Fletcher T. Penney
Post by Michel Fortin
Post by Fletcher T.Penney
1) How can I create a valid document with a .html extension?
(requiring .xhtml is going to break a bunch of other stuff)
MathML (or any other XML language for that matter) is not
recognized when the browser parse the file with the HTML parser.
So for MathML to work, you need ".xhtml" or a
"application/xhtml+xml" MIME type. And yes, unfortunately, that can
break other stuff.
Placing
<meta http-equiv="content-type" content="application/xhtml+xml" />
in the .html file doesn't work (I had already tried...) Is there
another way around this, or must the extension be .xhtml? Surely
something as simple as a filename extension is not this important
on the web in 2006...
When I talked about the MIME type, I was really talking about the
HTTP "Content-Type" header sent by the web server. Using a <meta>
element does not trigger the browser into using the XML parser, but
having a ".xhtml" will make most servers send the right content-type.

If your problem is really the file extension, a server could be
configured to send a ".html" file just as if it were
".xhtml" (sending the "application/xhtml+xml" content type).

If the problem is how the browser handles the page when parsed as
XML, I'm afraid there is no way around this that I know of. Either
you use the XML parser and MathML works in Firefox, or you
use the HTML parser and MathML does not work.

I'd point out that a discussion list about Markdown may not be the
best place to ask about problems browsers have with XML parsing and
MathML content; you could probably get better answers elsewhere.
Post by Fletcher T. Penney
Post by Michel Fortin
Instead, try to answer this question: how to specify an XML
namespace from the stylesheet?
Ok, I'll bite... So how does one do this?
I have no idea. I'm sure it's pretty simple, as XSL was built for XML
and namespaces are such an important concept in XML, but I've almost no
experience with XSL. I was only giving you something more precise for
your search.
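
The mismatch itself can be reproduced outside XSLT. A minimal sketch, assuming Python's standard `xml.etree.ElementTree` module (not part of the xhtml2latex workflow): once the `xmlns` attribute is present, every element belongs to the XHTML namespace, and a plain name such as `body` no longer matches. In XSLT 1.0 the analogous step is to declare a prefix for that namespace on the stylesheet element and use it in the match patterns; whether that alone fixes xhtml2latex is untested here.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

with_ns = ET.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>x</p></body></html>'
)
without_ns = ET.fromstring("<html><body><p>x</p></body></html>")

# Without the xmlns attribute, the plain element name matches.
plain_hit = without_ns.find("body")

# With it, the element is really {namespace}body, so the plain name finds nothing.
plain_miss = with_ns.find("body")

# Binding a prefix (here "html", an arbitrary choice) to the namespace
# and using it in the path restores the match.
qualified_hit = with_ns.find("html:body", namespaces={"html": XHTML_NS})
```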
Post by Fletcher T. Penney
Post by Michel Fortin
Post by Fletcher T.Penney
3) Any suggestions on the markup syntax? I sort of arbitrarily
chose the use of an extra `. I am sure there is a better way of
doing this, and would love to hear input.
Well, if you want to be sure Markdown doesn't change the content
of your expression, you could wrap it into a processing
<?ascii2math x_(1,2) = (-b+-sqrt(b^2-4ac))/(2a) ?>
This is less pretty, but has less chance to break too. With your
syntax, what happens if you have some `<` in a formula? With the
one above, only `?>` would be problematic, but you shouldn't see
that too often in maths.
Definitely less pretty. I have no desire to require something like
"<?ascii2math ... ?>". This is clearly not what Gruber intended
when he wrote, "without looking like it’s been marked up with tags
or formatting instructions." And this is exactly what I have tried
to avoid with MultiMarkdown-specific markup. I don't mind an
unobtrusive quotation mark or two, but want to avoid things that
look remotely like "programming"...
As for the "<" issue, I don't follow. There is no use of "<" or
">" in my current ascii2math syntax, and those symbols work just fine.
Well, I'm happy to know it works. Somehow I thought you were parsing
Markdown before math blocks (which would have caused that problem).
Rereading your first mail, I see that ASCII2Math runs before
Markdown, so that problem won't be there and my suggestion of using a
processing instruction tag becomes irrelevant.

As John pointed out, your syntax conflicts somewhat with Markdown
already, and I'd also bet that you can't put an example of your math
syntax inside a code block or a code span without having it
transformed with tags (because math is being handled before code
spans and code blocks). That will be hard to avoid unless you add a
new step inside Markdown for math processing. Depending on what you
intend to do with this, the issue may or may not have much importance
however.


Michel Fortin
***@michelf.com
http://www.michelf.com/
Dr. Drang
2006-06-11 02:49:35 UTC
Post by Fletcher T.Penney
A lot of people have expressed interest in combining math features
with Markdown, but I am not aware of any real developments from these
requests.
I was looking around and toying with [ASCIIMathPHP](http://
www.jcphysics.com/ASCIIMath/) and integrated it with MultiMarkdown
and my xhtml2latex XSLT transforms.
You can include math as an inline formula by using a markup similar
to inline code, such as ``x^2 + y^2 = 1`` (note the double ``).
You can include a formula as a separate paragraph in the same way, or
`x_(1,2) = (-b+-sqrt(b^2-4ac))/(2a)`
(Note the single use of ` when prefaced by a tab)
The leading tab is not required, but is allowed as I suspect most
people would like to be able to indent the formula to distinguish it
from regular text, and don't want it interpreted as a code block.
As Fletcher may recall, I had a math-enabled version of MultiMarkdown
going several months ago. It differed from his in that

* it used $$ and $$$ as the inline and display equation delimiters,
respectively;
* the equations had to be written in LaTeX form, not ASCIIMathML; and
* it output HTML that was to be interpreted by [jsMath][1], not ASCIIMathML.

The double and triple dollar signs were chosen because they reminded
me of the usual TeX delimiters, but left the single dollar sign with
its usual, non-TeX, meaning.

I chose jsMath over ASCIIMathML because it works on more browsers.
jsMath is essentially a JavaScript rewrite of the TeX equation
formatting engine. It works on browsers that don't understand MathML.
True, the ASCIIMathML syntax is more Markdown-like, but TeX is pretty
well established as an equation syntax (for those of you who are not
native speakers of English, that's a bit of Midwestern
understatement).

One of the other advantages of jsMath is that it's easy to write XSLT
for it. jsMath uses <span class="math">...</span> to delimit inline
equations and <div class="math">...</div> for display equations. These
are easy to convert into $ and $$ for LaTeX.

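    <!-- inline math: <span class="math">...</span> becomes $...$ -->
    <xsl:template match="span[@class='math']">
      <xsl:text>$</xsl:text>
      <!-- "." takes the whole text of the element, not just its first child node -->
      <xsl:value-of select="."/>
      <xsl:text>$</xsl:text>
    </xsl:template>

    <!-- display math: <div class="math">...</div> becomes $$...$$ -->
    <!-- $newline is a variable defined elsewhere in the stylesheet -->
    <xsl:template match="div[@class='math']">
      <xsl:text>$$</xsl:text>
      <xsl:value-of select="."/>
      <xsl:text>$$</xsl:text>
      <xsl:value-of select="$newline"/>
      <xsl:value-of select="$newline"/>
    </xsl:template>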

Recently, however, [I decided to give up on my fork of
MultiMarkdown][2] and go back to the standard version. Fletcher is too
damned prolific, and I didn't want to keep inserting my code into
newer and newer versions of MultiMarkdown. To this end, I started
delimiting my equations with \\( and \\) for inline and \\[ and \\]
for display. Regular MultiMarkdown (or Markdown itself) just turns the
double backslash into a single. jsMath has a preprocessor, the
tex2math plugin, that will take these LaTeX delimiters and convert
them to <span class="math"> and <div class="math"> before the usual
jsMath processing. This is what I use for equations in [my ongoing
solutions manual][3] for Den Hartog's *Mechanics* textbook.

For documents that are to be run through LaTeX for paper output, I
pass them through

1. a preprocessor that converts the \\(..\\) and \\[...\\] to <span
class="math">...</span> and <div class="math">...</div>;
2. MultiMarkdown
3. an XSLT processor using my stylesheet with the above stanzas for math.
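
Step 1 of that list can be sketched in a few lines. This is a minimal illustration in Python, not the preprocessor actually used for the *Mechanics* posts; `to_jsmath_markup` is a hypothetical name, and the sketch assumes the delimiters are never nested and that formula text needs no further escaping.

```python
import re

# Two literal backslashes followed by ( ... ) or [ ... ], as typed in the source.
INLINE = re.compile(r"\\\\\((.+?)\\\\\)", re.S)
DISPLAY = re.compile(r"\\\\\[(.+?)\\\\\]", re.S)


def to_jsmath_markup(text):
    """Wrap \\\\(..\\\\) in <span class="math"> and \\\\[..\\\\] in <div class="math">."""
    # Replacement functions avoid backslash-escape handling in the formula text.
    text = DISPLAY.sub(lambda m: '<div class="math">' + m.group(1) + "</div>", text)
    text = INLINE.sub(lambda m: '<span class="math">' + m.group(1) + "</span>", text)
    return text


sample = to_jsmath_markup(r"Euler: \\(e^{i\pi} = -1\\)")
```

The result is ordinary inline HTML, which is what the two XSLT templates above match on after MultiMarkdown has run.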

Having written this long message, how is it responsive to Fletcher's
post? First, I'd prefer LaTeX syntax and jsMath to ASCIIMathML. My
preference comes from the wider applicability of jsMath, but it turns
out that XSL is easier, too. And since you're using LaTeX for final
processing (if you're publishing to paper), it makes sense to use
LaTeX syntax for the equations.

As for delimiters, I'd prefer something a little more TeX-like than
backquotes, but I'm flexible. I had about 20 posts up at the
*Mechanics* site with the multiple dollar sign notation before I
decided to give up on my fork of MultiMarkdown; now I have about 40
posts up using the \\( and \\[ notation.
--
Dr. Drang

[1]: http://www.math.union.edu/~dpvc/jsMath/welcome.html
[2]: http://www.leancrew.com/all-this/2006/05/mechanics_blog_update.html
[3]: http://www.leancrew.com/mechanics
A. Pagaltzis
2006-06-13 05:00:26 UTC
Post by Dr. Drang
First, I'd prefer LaTeX syntax and jsMath to ASCIIMathML. My
preference comes from the wider applicability of jsMath, but it
turns out that XSL is easier, too. And since you're using LaTeX
for final processing (if you're publishing to paper), it makes
sense to use LaTeX syntax for the equations.
You may want to take a look at itex2MML, if you haven’t already.
Jacques Distler uses it in combination with Markdown on his
weblog.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
Johannes Grosse
2006-06-13 06:06:56 UTC
Hello to everybody,

I am glad to see that there seems to be some movement towards a
semi-official math framework.

Currently I use a simple perl script which reads LaTeX commands from
the markdown text and feeds them into latex to produce images. Of
course LaTeX does not have a particularly beautiful syntax, but it
works, is decently fast (less than 1 second for the example page below)
and the images are understood by any browser.
See <http://wwwth.mppmu.mpg.de/members/jgrosse/texdown/TeXdown-Readme.html>
for an example sheet. (Don't be frightened by the 0.2 version. ;)

Is there any way of telling Markdown (in an unobtrusive way) that
some pieces should be processed by other software? (I envision some
sort of light-weight, readable cweb, without rearrangement of pieces.)

Just a thought, which---I am aware---is probably way too ugly for a
lean Markdown, so please be gentle, regards
--
Johannes Grosse <http://wwwth.mppmu.mpg.de/members/jgrosse/>
Dr. Drang
2006-06-13 14:23:37 UTC
You may want to take a look at itex2MML, if you haven't already.
Jacques Distler uses it in combination with Markdown on his
weblog.
I was hesitant to get into itex when I first looked at it because it
is described as a "dialect" of LaTeX. Since I planned on using LaTeX
to produce paper output, I wanted real LaTeX. I've learned, however,
that the dialect is so close I may never run across the differences in
my work.

But for web output itex2MML has the same disadvantage as ASCIIMathML:
you need a MathML-enabled browser to view it. Firefox is fine
(although the Mac version stopped doing MathML for a few revisions);
IE works with a plugin, I guess; Safari is out; and I don't know about
Opera. Since a lot of my current visitors use Safari (me, too), it
wouldn't be right to shut them (or me) out. I really wish the WebKit
developers took MathML more seriously, but until they do I'm sticking
with jsMath.

Actually, I'm less concerned about having different backends for
producing math on web pages than I am about having different math
notations in the Markdown source documents. And I'm less concerned
about the equation delimiters--although I agree with JG that they
should not conflict with current Markdown syntax--than I am about
what's between the delimiters. Changing how you process documents is
simpler than changing the documents themselves; it is, at least, when
you have dozens or hundreds of documents.

Since the intent of MultiMarkdown is to produce XHTML that can be
easily transformed into LaTeX, and since LaTeX notation is commonly
used by people who need equations in their work, I think LaTeX should
be the notation used. Now, there may be a bright, MathML- and
CSS-drenched future in which I can print a web page and have it look
as good as a LaTeX document does now. When that day comes, it won't
matter what notation I used to get my equations. But until then, I
think LaTeX is our best bet.
--
Dr. Drang
Fletcher T. Penney
2006-06-13 22:10:12 UTC
Post by Dr. Drang
you need a MathML-enabled browser to view it. Firefox is fine
(although the Mac version stopped doing MathML for a few revisions);
IE works with a plugin, I guess; Safari is out; and I don't know about
Opera. Since a lot of my current visitors use Safari (me, too), it
wouldn't be right to shut them (or me) out. I really wish the WebKit
developers took MathML more seriously, but until they do I'm sticking
with jsMath.
I agree that the lack of standardized support by more browsers for
MathML is an issue, though I suppose this will improve with time. I
fully admit, however, that this is simply a guess on my part.
Post by Dr. Drang
Actually, I'm less concerned about having different backends for
producing math on web pages than I am about having different math
notations in the Markdown source documents. And I'm less concerned
about the equation delimiters--although I agree with JG that they
should not conflict with current Markdown syntax--than I am about
what's between the delimiters. Changing how you process documents is
simpler than changing the documents themselves; it is, at least, when
you have dozens or hundreds of documents.
Changing the delimiters is a non-issue. I didn't think about the
conflict with writing about ` characters in a code block. There will
be a new delimiter, I just need to figure out what. Input welcome.

I agree about a standardized notation, however. And for me, that
means something that is as close to human readable as possible, in
the spirit of Markdown.
Post by Dr. Drang
Since the intent of MultiMarkdown is to produce XHTML that can be
easily transformed into LaTeX,
Well, the intent is to have a common plain text syntax that is human
readable that can be transformed into a variety of output formats,
including, but not limited to, XHTML and LaTeX. As for quality math
layout, I suspect that XHTML and LaTeX (or dialects) are the only
real major options for now.
Post by Dr. Drang
and since LaTeX notation is commonly
used by people who need equations in their work, I think LaTeX should
be the notation used
Here's where I disagree. By this logic, we should all write in
XHTML. The point of Markdown and MultiMarkdown is to simplify this,
and allow a plain text "email-like" document to be used to create
valid, high quality XHTML, that can optionally be used to create
other formats. I _don't_ want to have to learn LaTeX in order to
write math that can then not be easily read by non-LaTeX folks. The
ASCIIMath syntax appears to me to be a viable alternative for _most_
math needs (perhaps not all), and it appears to be easily read in raw
form, just like Markdown. To me they seem like similar approaches to
related problems.

I plan on sticking with ASCIIMath (or something similar) in native-
MultiMarkdown, but I don't plan on changing anything to break
compatibility with custom approaches (such as yours) to allow raw
LaTeX within MultiMarkdown. I agree that MathML is not the best
output format, but it seems like the best alternative for the moment
(I want the document to stand on its own, not to need a JavaScript
engine to do on-the-fly processing...)
Post by Dr. Drang
Now, there may be a bright, MathML- and
CSS-drenched future in which I can print a web page and have it look
as good as a LaTeX document does now. When that day comes, it won't
matter what notation I used to get my equations. But until then, I
think LaTeX is our best bet.
Agreed. Even a fairly default LaTeX document has a level of quality
that seems to exceed anything else I have seen. (Though Apple's Pages
can come close. If I could figure out how to easily combine Pages
and Markdown.... ;)

But my support for LaTeX's output does not extend to its syntax.
Don't get me wrong, it's not bad, but (Multi)Markdown is better. And
I believe that there should be something similar for math.

John Gruber
2006-06-11 16:30:36 UTC
Post by Fletcher T.Penney
You can include math as an inline formula by using a markup similar
to inline code, such as ``x^2 + y^2 = 1`` (note the double ``).
This conflicts with Markdown itself. In Markdown, you can use
multiple backticks as code delimiters so that you can include
literal backticks in the code span.

Input:

`` `backtick` ``

Output:

<code>`backtick`</code>

-J.G.
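
The rule can be illustrated with a minimal sketch in Python. The regex is a simplified assumption for illustration, not Markdown's actual source, and `code_spans` is a hypothetical name: a run of one or more backticks opens a span, and only a run of the same length closes it.

```python
import html
import re

# (`+) captures the opening run; \1 requires a closing run of the same length
# that is not itself part of a longer run of backticks.
CODE_SPAN = re.compile(r"(?<!\\)(`+)(.+?)(?<!`)\1(?!`)", re.S)


def code_spans(text):
    """Turn backtick-delimited spans into <code> elements."""
    def repl(match):
        code = match.group(2).strip()  # padding next to the delimiters is dropped
        return "<code>" + html.escape(code, quote=False) + "</code>"
    return CODE_SPAN.sub(repl, text)


literal_backticks = code_spans("`` `backtick` ``")
proposed_math = code_spans("``x^2 + y^2 = 1``")
```

Under this rule the proposed inline math form is already a valid code span, so plain Markdown would render it as code rather than leave it for a math step.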