Discussion:
seemingly no good way to end bulleted list and start code block
Matt Kraai
2007-10-06 23:37:21 UTC
Howdy,

The following bug report was sent to the Debian BTS. Is there a way
to have a code block immediately follow an unordered list?

----- Forwarded message from Joey Hess <***@debian.org> -----

Consider this markdown:

* bla
* bla2

    this should be treated as code block

and it is not ...

    but if bullets are not above this, it works

If the first code block is indented with 8 spaces, or 2 tabs, it will be
treated as a nested code block inside the second list item, but there seems
to be no way to make it be seen as a code block that immediately follows
the list.

----- End forwarded message -----
--
Matt
Lou Quillio
2007-10-06 23:52:07 UTC
Post by Matt Kraai
If the first code block is indented with 8 spaces, or 2 tabs, it will be
treated as a nested code block inside the second list item, but there seems
to be no way to make it be seen as a code block that immediately follows
the list.
A known shortcoming.

http://www.mail-archive.com/markdown-***@six.pairlist.net/msg00724.html

A secondary, explicit code block syntax would work around it. So
far, there's only some 'like' for that idea, no love.

LQ
A. Pagaltzis
2007-10-07 02:06:50 UTC
there seems to be no way to make it be seen as a code block
that immediately follows the list.
Clumsy, but there is:

* bla
* bla2

<!-- -->

    this should be treated as code block

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
Matt Kraai
2007-10-07 03:11:26 UTC
Post by Matt Kraai
there seems to be no way to make it be seen as a code block
that immediately follows the list.
* bla
* bla2
<!-- -->
this should be treated as code block
Thanks for the workaround.
--
Matt
Michel Fortin
2007-10-07 02:41:48 UTC
Post by Matt Kraai
Howdy,
The following bug report was sent to the Debian BTS. Is there a way
to have a code block immediately follow an unordered list?
As Aristotle pointed out, there's a workaround: use an HTML comment,
like this:

* bla
* bla

<!-- comment -->

    this is a code block.

It's not optimal, but it's the only solution currently.

Note that creating a second paragraph in the list item is correct
behaviour, not a bug. That you can't write a code block immediately
following a list is unfortunate, but since it would be done using the
exact same characters as for adding a paragraph to a list item, it's
a design issue we can't really solve without changing some part of
the syntax.
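
A toy model of the ambiguity described above, offered only as a sketch: it assumes that an indented line belongs to the preceding list unless some flush-left, non-list block (such as the HTML comment) came in between. The function name and the simplified list-item pattern are illustrative and are not taken from any Markdown implementation.

```python
import re

# Simplified patterns: a list item marker, and a line indented by
# four spaces or one tab.
LIST_ITEM = re.compile(r"^ {0,3}(?:[*+-]|\d+\.)[ \t]+")
INDENTED = re.compile(r"^(?: {4}|\t)")

def classify_indented_lines(text):
    """Label each indented line as 'list continuation' or 'code block'."""
    in_list = False
    labels = []
    for line in text.splitlines():
        if not line.strip():
            continue  # a blank line alone never ends the list
        if LIST_ITEM.match(line):
            in_list = True
        elif INDENTED.match(line):
            kind = "list continuation" if in_list else "code block"
            labels.append((line.strip(), kind))
        else:
            in_list = False  # any flush-left block, e.g. <!-- -->, ends the list
    return labels
```

With `"* bla\n* bla2\n\n    code"` the indented line is reported as a list continuation; adding a `<!-- -->` line between the list and the indented line makes it a code block.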

- - -

This issue has been raised many times, and I've been personally bitten
by it too. I also noticed that with PHP Markdown Extra, many new
features (footnotes, definition lists) behave like list items and
can't be immediately followed by an indented code block.

So I'm seriously thinking about adding a second (unindented) code
block syntax to PHP Markdown Extra that would avoid this issue
entirely. Something like this:

Regular paragraph

~~~
Code block
~~~

Of course, unless John adopts it for Markdown, it'll only benefit
users of PHP Markdown Extra.


Michel Fortin
***@michelf.com
http://michelf.com/
Lou Quillio
2007-10-07 03:37:13 UTC
Post by Michel Fortin
Note that creating a second paragraph in the list item is correct
behaviour, not a bug.
And I should've been clearer that it's not a bug, and should've
linked directly to the workaround. Apologies.
Post by Michel Fortin
So I'm seriously thinking about adding a second (unindented) code block
syntax to PHP Markdown Extra that would avoid this issue entirely.
I'd welcome it, since I currently use an ugly wrapper to handle
this. It's enough for me to reference PHP Markdown Extra, if not
Markdown proper. If Andreas (Maruku) picked it up, better still.
Wish there were an 'Extra' for Python.

LQ
John MacFarlane
2007-10-07 03:55:07 UTC
Post by Michel Fortin
So I'm seriously thinking about adding a second (unindented) code block
syntax to PHP Markdown Extra that would avoid this issue entirely.
Regular paragraph
~~~
Code block
~~~
I like the idea of doing something about this problem. I've been
thinking about adding something like this to pandoc, too, and it
would be good if possible to reach some agreement on a syntax.

If I recall, in the previous thread there were two suggestions for
solving the problem:

1. adding delimited unindented code blocks, along the lines of
what you suggested above

2. adopting the convention that two blank lines end a list (or
footnote or other block element in which indentation is used
to indicate continuation blocks).

Do people have views about the comparative advantages and disadvantages
of these approaches?

An advantage of (2) is that it provides a clean way to have two
consecutive lists of the same type.

An advantage of (1) is that it allows you to cut and paste code without
adding or removing indentation. It's also something that you see quite
often in emails, and thus goes along with markdown's general philosophy
of using established conventions.

However, I prefer ======== to ~~~~~~~~~~ for this purpose, because
====== is vertically centered on the line and gives you a more
symmetrical look. Compare:

~~~~~~~~~
example 1
~~~~~~~~~

=========
example 2
=========

As long as a blank line is required before the string of ======='s,
there is no danger of confusing this with a Setext-style header.

John
Michel Fortin
2007-10-07 12:47:08 UTC
Post by John MacFarlane
Post by Michel Fortin
So I'm seriously thinking about adding a second (unindented) code block
syntax to PHP Markdown Extra that would avoid this issue entirely.
Regular paragraph
~~~
Code block
~~~
I like the idea of doing something about this problem. I've been
thinking about adding something like this to pandoc, too, and it
would be good if possible to reach some agreement on a syntax.
That would be good indeed. Here's what I have in mind at the moment:
the *flat* code block starts with a line indented by zero to three
spaces, containing three or more tildes (`~`), optionally followed by
spaces and tabs, then a newline. The code block spans until an identical
line is found, not counting whitespace after the marker.

If the marker is indented by 1 to 3 spaces, the corresponding number
of spaces is removed from the start of each line in the code block.
This means that, if you wish, you may indent the code block a little
to make it stand out more.

So basically it looks like this:

To install a package, you can now use this:

~~~
pear install michelf/package
~~~

where package is the name of the package to install from this
channel.

But since I don't require a blank line either before or after the
code block, you can write it in a more compact way if you prefer:

To install a package, you can now use this:
~~~
pear install michelf/package
~~~
where package is the name of the package to install from this
channel.

Using indentation, it looks like this:

To install a package, you can now use this:
   ~~~
   pear install michelf/package
   ~~~
where package is the name of the package to install from this
channel.

or this:

To install a package, you can now use this:

   ~~~
   pear install michelf/package
   ~~~

where package is the name of the package to install from this
channel.

I'm not sure the indentation feature is so useful. After all, you can
use the old syntax if you want indentation. What do you think?
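
The rules described above (marker indented by zero to three spaces, three or more tildes, optional trailing whitespace, block ends at an identical marker line, marker indentation removed from each line) can be sketched roughly as follows. This is only an illustration of the proposal as worded in this message, not code from PHP Markdown Extra; the function name is made up, and the choice to skip an unclosed marker is an assumption.

```python
import re

# Opening marker: 0-3 spaces, three or more tildes, optional trailing
# spaces/tabs. Group 1 is the indent, group 2 the run of tildes.
FENCE = re.compile(r"^( {0,3})(~{3,})[ \t]*$")

def extract_flat_code_blocks(text):
    """Return the contents of each tilde-delimited block in `text`."""
    blocks = []
    lines = text.split("\n")
    i = 0
    while i < len(lines):
        opener = FENCE.match(lines[i])
        if not opener:
            i += 1
            continue
        indent, marker = opener.groups()
        body = []
        j = i + 1
        # The block runs until an identical marker line (same indent,
        # same number of tildes), ignoring trailing whitespace.
        while j < len(lines) and lines[j].rstrip(" \t") != indent + marker:
            body.append(lines[j])
            j += 1
        if j == len(lines):
            i += 1  # assumption: an unclosed marker is not a code block
            continue
        # Remove up to the marker's indentation from each line.
        strip = re.compile("^ {0,%d}" % len(indent))
        blocks.append("\n".join(strip.sub("", line) for line in body))
        i = j + 1
    return blocks
```

Because the closing line must be identical to the opening one, a block opened with four tildes can contain a three-tilde block verbatim.
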
Post by John MacFarlane
If I recall, in the previous thread there were two suggestions for
1. adding delimited unindented code blocks, along the lines of
what you suggested above
2. adopting the convention that two blank lines end a list (or
footnote or other block element in which indentation is used
to indicate continuation blocks).
Do people have views about the comparative advantages and
disadvantages
of these approaches?
An advantage of (2) is that it provides a clean way to have two
consecutive lists of the same type.
I'm not against option 2, but I don't see it as a replacement to
option 1 (for the reasons enumerated below).

It also has more potential of breaking existing documents. Imagine if
someone put multiple paragraphs and headers in a big list item, and
one header is preceded by two blank lines to make it stand out more.
The content of that list item would become a code block. Not pretty.

We could allow this only between list items: an additional blank
line would break out of the current list, but it would not work for
starting a code block. The worst that could happen to existing
documents then is that some lists could be broken into separate
consecutive lists; that's much less damaging than turning some list
item's content into a code block.
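
A minimal sketch of the restricted rule, assuming only `* ` bullets and ignoring everything that is not a list item: two or more consecutive blank lines between items start a new list, while a single blank line does not. The helper name is hypothetical.

```python
def split_lists(text):
    """Group '* ' items into lists; two blank lines in a row start a new list."""
    lists, current, blanks = [], [], 0
    for line in text.splitlines():
        if not line.strip():
            blanks += 1
            continue
        if line.startswith("* "):
            if blanks >= 2 and current:
                lists.append(current)  # close the previous list
                current = []
            current.append(line[2:])
        blanks = 0  # any non-blank line resets the blank-line count
    if current:
        lists.append(current)
    return lists
```
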
Post by John MacFarlane
An advantage of (1) is that it allows you to cut and paste code without
adding or removing indentation. It's also something that you see quite
often in emails, and thus goes along with markdown's general
philosophy
of using established conventions.
I will add that it allows blank lines at the start or at the
end of a code block, as well as multiple consecutive code blocks.
Post by John MacFarlane
However, I prefer ======== to ~~~~~~~~~~ for this purpose, because
====== is vertically centered on the line and gives you a more
symmetrical look.
That depends on the font. For instance, tildes are vertically
centered as I look at them in Monaco 13 in my email client. I agree
it's a drawback however: many fonts display tildes above the middle
line.
Post by John MacFarlane
As long as a blank line is required before the string of ======='s,
there is no danger of confusing this with a Setext-style header.
For a parser aware of the new syntax, you're right. To a regular
Markdown parser, it'll look like a paragraph followed by a header. To
a regular Markdown user, it'll just look odd, especially if you add
more lines and blank lines in your code block. For instance:

=========
foreach ($this->document_gamut as $method => $priority) {
    $text = $this->$method($text);
}

return $text;
=========

I share your concern about tildes not always being vertically
centered, but I still think it's better -- and less confusing -- to
use something clearly different from other Markdown constructs.


Michel Fortin
***@michelf.com
http://michelf.com/
John MacFarlane
2007-10-07 16:10:36 UTC
I'm not sure the indentation feature is so useful. After all, you can use
the old syntax if you want indentation. What do you think?
I'd prefer to keep it simple and leave out the indentation feature.
I'm not against option 2, but I don't see it as a replacement to option 1
(for the reasons enumerated below).
It also has more potential of breaking existing documents. Imagine if
someone put multiple paragraphs and headers in a big list item, and one
header is preceded by two blank lines to make it stand out more. The
content of that list item would become a code block. Not pretty.
Good point.
We could allow this only between list items: add an additional blank line
to break out of the current list; but not working for code blocks. The
worse that could happen to existing documents then is that some lists could
be broken into separate consecutive lists; that's much less damaging than
turning some list item's content into a code block.
This would complicate parsing quite a bit. At this point I'm inclined
to keep changes as simple as possible, and just to implement option (1)
without any version of (2).

On the issue of ~~~ vs ===, you give two reasons for preferring ~~~:

(a) Because ~~~ is not used for Setext headers, we would not need to
require a blank line before a code block. You could have a code block
~~~
like this
~~~
which would not be possible with ===.

(b) Because === already has a use in markdown, using it to mark off
unindented code blocks might confuse people and parsers who aren't
familiar with the new syntax. Non-extended markdown parsers would
parse these code blocks as a regular paragraph followed by a header.

On (a): Partly because ~ already has a use in pandoc for inline
text formatting (~~strikeout~~ and ~subscript~), and partly because
it makes parsing easier, I'd be in favor of requiring blank lines before
and after the new-style code blocks.

On (b): Non-extended markdown parsers will make a mess of the
new code blocks with either syntax, since they won't know to interpret
the text between ~~~ as verbatim text. I don't see a big advantage
here for the ~~~ syntax. Also, as I noted, in pandoc ~s are already used
to indicate ~~strikeout text~~ and ~subscripts~. I can see that the ===
syntax might cause problems with existing syntax highlighters, though.

Perhaps an alternative would be to use ++++s instead of ~~~~s.
Advantages: Not currently used for anything in markdown or extensions,
vertically centered on the line in most fonts. Disadvantage: ugly?

I'd be interested in hearing what others think. Although I have a
preference for ===, I'd be willing to go with ~~~ just to prevent a
proliferation of different syntax extensions.

One more thought. I think it would be useful to allow something like
this:

~~~(haskell)~~~~~~~~~~~~~~~~~~~~~~~~~~~~
inlineNote = try $ do
  failIfStrict
  char '^'
  contents <- inlinesInBalanced "[" "]"
  return $ Note [Para contents]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If the string of ~~~~s that introduces the code block contains a
parenthesized string, this would be treated as the "class" attribute of
the code block. This would make it possible to postprocess the
output with a syntax highlighter, or use a javascript syntax
highlighter.
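
One way this could work, sketched under assumptions: the opener is taken to be tildes, a parenthesized word, then any number of further tildes, and the closer is any line of three or more tildes (the example above uses a closer of a different length than the opener). The function name and the exact HTML shape are illustrative, not pandoc's output.

```python
import html
import re

# Hypothetical opener: tildes, a parenthesized name, then more tildes.
OPENER = re.compile(r"^~{3,}\((\w+)\)~*[ \t]*$")
CLOSER = re.compile(r"^~{3,}[ \t]*$")

def render_code_block(lines):
    """Render one delimited block (opener line through closer line) as HTML."""
    opener = OPENER.match(lines[0])
    if opener is None or not CLOSER.match(lines[-1]):
        raise ValueError("not a delimited code block with a class name")
    # Escape &, < and > so the code is shown verbatim in the page.
    code = html.escape("\n".join(lines[1:-1]), quote=False)
    return '<pre><code class="%s">%s\n</code></pre>' % (opener.group(1), code)
```

A syntax highlighter run over the output could then select on the `class` attribute.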

John
Milian Wolff
2007-10-07 16:17:00 UTC
Post by John MacFarlane
One more thought. I think it would be useful to allow something like
~~~(haskell)~~~~~~~~~~~~~~~~~~~~~~~~~~~~
inlineNote = try $ do
  failIfStrict
  char '^'
  contents <- inlinesInBalanced "[" "]"
  return $ Note [Para contents]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
That one looks really good!
--
Milian Wolff
http://milianw.de
OpenPGP key: CD1D1393
Waylan Limberg
2007-10-08 03:41:09 UTC
Figures, I respond to the old discussion, then see the new one. Oh
well, Michel covered my points in more detail here.

On 10/7/07, John MacFarlane <***@berkeley.edu> wrote:
[snip]
Post by John MacFarlane
On (b): Non-extended markdown parsers will make a mess of the
new code blocks with either syntax, since they won't know to interpret
the text between ~~~ as verbatim text. I don't see a big advantage
here for the ~~~ syntax.
Well, with Michel's proposed indentation rules, the code could still
be indented if it's ever expected to be fed to non-extended parsers.
The only difference being that you lose two consecutive blocks. But
with the two snippets wrapped in tildes (or whatever we go with) it
should be clear to the reader that they are separate. I think this is
reason enough to keep the proposed indentation options.

[snip]
Post by John MacFarlane
One more thought. I think it would be useful to allow something like
~~~(haskell)~~~~~~~~~~~~~~~~~~~~~~~~~~~~
inlineNote = try $ do
  failIfStrict
  char '^'
  contents <- inlinesInBalanced "[" "]"
  return $ Note [Para contents]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I like it. Much better than the solution I currently have implemented
for python [1].

[1]: http://achinghead.com/markdown/codehilite/
--
----
Waylan Limberg
***@gmail.com
Michel Fortin
2007-10-08 11:39:50 UTC
Post by Waylan Limberg
Well, with Michel's proposed indentation rules, the code could still
be indented if it's ever expected to be fed to non-extended parsers.
The only difference being that you lose two consecutive blocks.
Not really. My indentation rules don't interfere with the current
indented code block syntax. The indentation is limited to *three
spaces maximum*; any four-space-indented block is processed exactly
as before. I think it's important to keep the old syntax intact.


Michel Fortin
***@michelf.com
http://michelf.com/
Waylan Limberg
2007-10-08 22:25:32 UTC
Post by Michel Fortin
Post by Waylan Limberg
Well, with Michel's proposed indentation rules, the code could still
be indented if it's ever expected to be fed to non-extended parsers.
The only difference being that you lose two consecutive blocks.
Not really. My indentation rules don't interfere with the current
indented code block syntax. The indentation is limited to *three
spaces maximum*; any four-space-indented block is processed exactly
as before. I think it's important to keep the old syntax intact.
Ah, I missed that part. Yeah, in that case, leave the indentation out.
--
----
Waylan Limberg
***@gmail.com
Michel Fortin
2007-10-08 11:39:13 UTC
Post by John MacFarlane
I'm not sure the indentation feature is so useful. After all, you can use
the old syntax if you want indentation. What do you think?
I'd prefer to keep it simple and leave out the indentation feature.
Ok. Noted.
Post by John MacFarlane
I'm not against option 2, but I don't see it as a replacement to option 1
(for the reasons enumerated below).
It also has more potential of breaking existing documents. Imagine if
someone put multiple paragraphs and headers in a big list item, and one
header is preceded by two blank lines to make it stand out more. The
content of that list item would become a code block. Not pretty.
Good point.
We could allow this only between list items: add an additional blank line
to break out of the current list; but not working for code blocks. The
worse that could happen to existing documents then is that some lists could
be broken into separate consecutive lists; that's much less
damaging than
turning some list item's content into a code block.
This would complicate parsing quite a bit. At this point I'm inclined
to keep changes as simple as possible, and just to implement option (1)
without any version of (2).
Meanwhile, the HTML comment trick works to separate consecutive
lists. :-)
Post by John MacFarlane
(a) Because ~~~ is not used for Setext headers, we would not need to
require a blank line before a code block. You could have a code block
~~~
like this
~~~
which would not be possible with ===.
I hadn't thought about this one, but it's true indeed.
Post by John MacFarlane
On (a): Partly because ~ already has a use in pandoc for inline
text formatting (~~strikeout~~ and ~subscript~), and partly because
it makes parsing easier, I'd be in favor of requiring blank lines before
and after the new-style code blocks.
Yeah, but one is a block-level construct (code block) and the two
others are span-level. It isn't more ambiguous than, say, asterisks
as unordered list item markers and emphasis markers. Moreover, in our
case, it's even more different since the tildes are expected to be
three or more and to be alone on one line, while your syntax for
strikeout and subscript is limited to two consecutive tildes, which are
unlikely to be alone on their line.
Post by John MacFarlane
(b) Because === already has a use in markdown, using it to mark off
unindented code blocks might confuse people and parsers who aren't
familiar with the new syntax. Non-extended markdown parsers would
parse these code blocks as a regular paragraph followed by a header.
[...]
On (b): Non-extended markdown parsers will make a mess of the
new code blocks with either syntax, since they won't know to interpret
the text between ~~~ as verbatim text. I don't see a big advantage
here for the ~~~ syntax. Also, as I noted, in pandoc ~s are already used
to indicate ~~strikeout text~~ and ~subscripts~. I can see that the ===
syntax might cause problems with existing syntax highlighters, though.
The mess it'll make depends on the content of the code block, but
I'll have to agree that it was a weak argument. I still think it's
easier to read if we use a different character though.
Post by John MacFarlane
Perhaps an alternative would be to use ++++s instead of ~~~~s.
Advantages: Not currently used for anything in markdown or extensions,
vertically centered on the line in most fonts. Disadvantage: ugly?
Ugly indeed. What I've seen used for that is a line of dashes: ----,
but that would trigger a horizontal rule and I don't feel like
overriding that.
Post by John MacFarlane
I'd be interested in hearing what others think. Although I have a
preference for ===, I'd be willing to go with ~~~ just to prevent a
proliferation of different syntax extensions.
Me too. Let's wait until we have more comment on this.
Post by John MacFarlane
One more thought. I think it would be useful to allow something like
~~~(haskell)~~~~~~~~~~~~~~~~~~~~~~~~~~~~
inlineNote = try $ do
  failIfStrict
  char '^'
  contents <- inlinesInBalanced "[" "]"
  return $ Note [Para contents]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The idea is nice, although I'm not sure it's the right syntax. My
idea has always been to end the code block on the first line having
the same number of tildes, so you can easily write an example of a
Markdown code block inside a code block by using various marker
lengths, much like it works for code spans. For instance:

You can write a code block like this:

~~~~
Regular paragraph.

~~~
Code block
~~~
~~~~

So I'd be more in favor of something that doesn't interfere when
counting the tildes, like this:

{.haskell}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
inlineNote = try $ do
  failIfStrict
  char '^'
  contents <- inlinesInBalanced "[" "]"
  return $ Note [Para contents]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

or this:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
inlineNote = try $ do
  failIfStrict
  char '^'
  contents <- inlinesInBalanced "[" "]"
  return $ Note [Para contents]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
{.haskell}

or this:

~~~~~ {.haskell}
inlineNote = try $ do
  failIfStrict
  char '^'
  contents <- inlinesInBalanced "[" "]"
  return $ Note [Para contents]
~~~~~

Here I've followed the planned syntax for adding attributes to
Markdown elements which was discussed some time ago on this list,
where attributes are in braces and class names can be added by
preceding them with a dot. It is still unimplemented in PHP Markdown
Extra, but I think Maruku has most of it. I think it's better to
reuse that than to create an entirely new syntax for the same purpose.
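
For the third variant (attributes after the tildes on the opening line), a rough sketch of how an opener could be read without disturbing the tilde count. The pattern and the function name are assumptions for illustration; only dot-prefixed class names are handled, since that is the only part of the attribute syntax described here, and any other token inside the braces is ignored.

```python
import re

# Assumed opener shape: a run of tildes, then an optional {...} block.
OPENER = re.compile(r"^(~{3,})[ \t]*(?:\{([^}]*)\})?[ \t]*$")

def parse_opener(line):
    """Return (tilde_count, class_names) for an opening marker, or None."""
    m = OPENER.match(line)
    if m is None:
        return None
    attrs = (m.group(2) or "").split()
    # Keep only tokens of the form ".name"; drop the leading dot.
    classes = [a[1:] for a in attrs if a.startswith(".") and len(a) > 1]
    return len(m.group(1)), classes
```

The tilde count comes from the leading run alone, so the closing line can still be matched by length.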


Michel Fortin
***@michelf.com
http://michelf.com/
Waylan Limberg
2007-10-08 03:20:32 UTC
Post by John MacFarlane
Post by Michel Fortin
So I'm seriously thinking about adding a second (unindented) code block
syntax to PHP Markdown Extra that would avoid this issue entirely.
Regular paragraph
~~~
Code block
~~~
I like the idea of doing something about this problem. I've been
thinking about adding something like this to pandoc, too, and it
would be good if possible to reach some agreement on a syntax.
If I recall, in the previous thread there were two suggestions for
1. adding delimited unindented code blocks, along the lines of
what you suggested above
2. adopting the convention that two blank lines end a list (or
footnote or other block element in which indentation is used
to indicate continuation blocks).
Ah, you beat me to it. This is my preferred solution.
Post by John MacFarlane
Do people have views about the comparative advantages and disadvantages
of these approaches?
An advantage of (2) is that it provides a clean way to have two
consecutive lists of the same type.
An advantage of (1) is that it allows you to cut and paste code without
adding or removing indentation. It's also something that you see quite
often in emails, and thus goes along with markdown's general philosophy
of using established conventions.
However, I prefer ======== to ~~~~~~~~~~ for this purpose, because
====== is vertically centered on the line and gives you a more
~~~~~~~~~
example 1
~~~~~~~~~
=========
example 2
=========
As long as a blank line is required before the string of ======='s,
there is no danger of confusing this with a Setext-style header.
Well, except when the second to last line of the code block is blank:

===========
some code

this looks like a header
===========

Granted, 'true' parsers and state machines shouldn't choke on this,
but some of the implementations that rely on regexes may, without jumping
through hoops. Besides, the tildes look plenty symmetrical on the
machine I'm currently using with the typeface I'm currently using. Of
course, YMMV.
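
To make the concern concrete, here is a small sketch using a simplified Setext header pattern in the style of regex-based converters (it is an approximation written for this example, not the exact expression of any particular implementation). Run over the sample above, it picks up the last line of the code block as a header.

```python
import re

# Simplified level-1 Setext header: a line of text followed by a line of '='.
SETEXT_H1 = re.compile(r"^(.+)[ \t]*\n=+[ \t]*$", re.MULTILINE)

def find_false_headers(text):
    """Return the text lines a naive Setext pass would turn into headers."""
    return [m.group(1) for m in SETEXT_H1.finditer(text)]

sample = "===========\nsome code\n\nthis looks like a header\n===========\n"
print(find_false_headers(sample))
```

The same sample delimited with tildes yields no matches, since `~` plays no part in the header pattern.
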
Post by John MacFarlane
John
_______________________________________________
Markdown-Discuss mailing list
http://six.pairlist.net/mailman/listinfo/markdown-discuss
--
----
Waylan Limberg
***@gmail.com