Skip to content

Conversation

@chiwanpark
Copy link
Contributor

This pull request contains following changes:

  • Add a maven submodule called commonmark-ext-metadata
  • Add an extension which enables metadata feature
  • Add metadata documentation into README.md

The metadata syntax is inspired by YAML format. But this extension supports only few features of YAML because of avoiding other YAML dependency and implementing easily.

This is a first pull request which I send to atlassian open source. So there may be many problems to merge this. Please feel free to leave comments. ;)

README.md Outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace "This extension only supports limited features" with "This extension only supports a subset of YAML syntax"? And replace "The example is following:" with "Here's an example of what's supported:" to make it clear that this is it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there some kind of specification for this header that we can link to?

@robinst
Copy link
Collaborator

robinst commented Oct 11, 2015

Hey! Very good work, I only have some minor comments, see above. Some more comments:

  • Can metadata appear in the middle of the document, or does it have to be at the beginning? In case it has to be at the beginning, maybe we should check for that in tryStart?
  • Please add the extension to SpecIntegrationTest, so that we know we don't break anything in core with this extension.
  • Do you feel that this level of support for YAML covers almost all- use cases of document in the wild?

@chiwanpark chiwanpark force-pushed the metadata-extension branch 2 times, most recently from dad0ba8 to a167118 Compare December 2, 2015 16:44
@chiwanpark
Copy link
Contributor Author

Sorry for late update. I updated my PR and rebased on current master branch. About questions:

  • Metadata cannot appear in the middle of the document. I already added a test case covering that (metadataInParagraph).
  • I added metadata extension to SpecIntegrationTest. The integration test passed.
  • I'm not sure that this extension covers almost all use cases. But this extension provides features which are similar to GFM and python markdown library (https://pythonhosted.org/Markdown/extensions/meta_data.html).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work. tryStart only gets one line of the document. I think checking that we're at the beginning of the document and that we have a boundary marker is enough.

Also, please add a test for the following input:

---
test

That currently fails because there's an infinite loop in tryStart.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had another look, and the fact that this is working is actually a bug in the Substring class, it doesn't do the necessary bounds checking. I'm going to fix these bugs on master, which will mean that the above will stop working. But it wouldn't work when reading the input from a file anyway (only one line at a time).

You'll need to merge master and adjust the implementation. Thanks for your patience :).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, pushed to master. Could you merge master into your branch and adapt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we check only one line to decide whether metadata block starts or not, there are some collisions with CommonMark spec like following case:

---
test
----

From original CommonMark spec, rendering result should be <hr />\n<h2>test</h2>\n. But because this extension consider --- as start of metadata block, the actual result will be empty. Is there any method to use multi-line input for detecting start of the block?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't currently. If the first line is a normal paragraph line, then there's support for replacing it on the second line with a block. That is used e.g. for headlines, where only the second line (with the -) shows that it's a headline. But in the above case, the first line is parsed as a horizontal rule right away, not a paragraph.

Do you know how e.g. the Python implementation handles this?

Are there more collisions? If we can't find a way, we can exclude the examples in SpecIntegrationTest. I'm not sure it's a problem in practice, because it only affects the very first line. And if a user encounters this case, they can work around it by adding an empty line before it or an empty metadata block using two ... lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my investigation, the Python implementation (https://github.com/waylan/Python-Markdown/blob/master/markdown/extensions/meta.py) receives all unparsed-markdown string and check whole lines to decide whether metadata block starts or not.

If we check block boundary using only a line, example 14 and 16 in Setext headers of SpecIntegrationTest fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, the Python implementation and github markdown implementation allow only --- in first line to begin a metadata block. I would like to apply this rule in this extension.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for investigating!

Looks like the Python library implements it as a pre-processor that removes the metadata block from the input. Then later the rest of the document is parsed. I'm a bit hesitant to explore that direction because it makes source positions more complicated. Also, that doesn't really work nicely when parsing from a Reader instead of a String.

So let's go with keeping it simple for now, and if the first line in the document happens to be a ---, we interpret it as a metadata block. We could do some more checking in tryContinue and stop the block if the first line of content doesn't look like metadata content.

That means we'll need to override example 14 and 16 in SpecIntegrationTest, see getOverriddenExamples.

Also, +1 for only allowing ---.

@chiwanpark
Copy link
Contributor Author

I addressed your comments @robinst. :)

robinst added a commit that referenced this pull request Dec 8, 2015
Otherwise it's possible to access more of the underlying string than the
substring covers. Discovered in #24. Not sure if Substring is really
worth it at this point.
@chiwanpark chiwanpark force-pushed the metadata-extension branch 2 times, most recently from 156ab60 to 4ede3f9 Compare January 15, 2016 03:13
@chiwanpark
Copy link
Contributor Author

Hi @robinst, sorry for late update. I added a block verification logic to tryContinue method and overridden examples.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intentional, allowing ... to end the block but not to start it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replying to myself here, but looks like other implementations also do that: https://github.com/jgm/lcmark/#yaml-metadata

So disregard my comment :).

@derari
Copy link
Contributor

derari commented Feb 19, 2016

Just a quick side note: Wouldn't calling this extension "YAML front matter" (because that's what it is, even if not all YAML is supported) be more descriptive than just calling it "Metadata"? There is probably more than one way one might encounter metadata when handling Markdown, so I think it might be helpful for user if it was clear from the extension's name where and what kind of metadata it supports.

@robinst
Copy link
Collaborator

robinst commented Feb 23, 2016

@derari: I think that's a good point. @chiwanpark, when you come around to updating this PR, could you make that change? Other than that, we're really close, and I'd love to see this included in the next release :).

@chiwanpark
Copy link
Contributor Author

I'm really sorry for delaying update @robinst. I'll update this PR including @derari's suggestion in 3 days.

About a name of maven module, is commonmark-ext-yaml-frontmatter reasonable? I found that a similar implementation in Jekyll is called frontmatter also (http://jekyllrb.com/docs/frontmatter/).

@robinst
Copy link
Collaborator

robinst commented Feb 23, 2016

Cool! I'd add another hyphen, commonmark-ext-yaml-front-matter, because it's multiple words.

  - Move `literal` into `parseInlines` method
  - Fix a bug in `parseInlines` for incomplete block
  - Simplify construction of `MetadataNode`
  - Add a test case for incomplete metadata block
@chiwanpark
Copy link
Contributor Author

Hi all, I updated this PR addressing your opinions. Please review this. :-)

@@ -0,0 +1,6 @@
package org.commonmark.ext.yaml;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change the packages to org.commonmark.ext.front.matter instead? (Seems more future proof, apparently there's other front matter formats, e.g. JSON.)

@chiwanpark
Copy link
Contributor Author

Thanks for review @robinst. I just addressed your comments.


import org.commonmark.node.CustomBlock;

public class YamlFrontMatterBlock extends CustomBlock {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the file names aren't changed, this file is still named YAMLFrontMatterBlock. That's why the Travis build fails. Are you on a case-insensitive file system? Check out this Stackoverflow question for how to work around that.

@chiwanpark
Copy link
Contributor Author

Hi @robinst, I updated the all filenames. Thanks for your guide.

robinst added a commit that referenced this pull request Feb 26, 2016
YAML front matter metadata extension
@robinst robinst merged commit 77ddc83 into commonmark:master Feb 26, 2016
@robinst
Copy link
Collaborator

robinst commented Feb 26, 2016

Finally merged 😄. Nice work Chiwan!

@chiwanpark
Copy link
Contributor Author

Thanks @robinst and @derari!

@robinst
Copy link
Collaborator

robinst commented Apr 22, 2016

Released in 0.5.0 now 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants