Yozora
Why named "yozora"?
Yozora is the romanization of the Japanese word 「よぞら」 (night sky), taken from the lyrics of 『花鳥風月』 by the band 世界の終わり (SEKAI NO OWARI).
This project is a monorepo that aims to implement a highly extensible, pluggable Markdown parser. Built on the idea of middleware, the core algorithm @yozora/core-parser schedules tokenizers (such as @yozora/tokenizer-autolink) to complete the parsing tasks. More precisely, Yozora is an algorithm that parses Markdown, or content written in its extended syntax, into an abstract syntax tree (AST).
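To make the middleware idea concrete, here is a minimal sketch of a scheduler that offers each input position to a list of tokenizers and collects the resulting nodes. Every name in it (`Tokenizer`, `parseInline`, `autolink`, `inlineCode`) is hypothetical and chosen for illustration; the real @yozora/core-parser works in separate match and parse phases and is considerably more involved.

```typescript
// Hypothetical names throughout: this is NOT the @yozora/core-parser API.
interface AstNode {
  type: string
  value: string
}

interface Tokenizer {
  name: string
  // Try to match at `start`; return the node and the index just past it, or null.
  match(source: string, start: number): { node: AstNode; end: number } | null
}

const autolink: Tokenizer = {
  name: 'autolink',
  match(source, start) {
    if (source[start] !== '<') return null
    const close = source.indexOf('>', start + 1)
    if (close < 0) return null
    const url = source.slice(start + 1, close)
    if (!/^https?:\/\/\S+$/.test(url)) return null
    return { node: { type: 'link', value: url }, end: close + 1 }
  },
}

const inlineCode: Tokenizer = {
  name: 'inlineCode',
  match(source, start) {
    if (source[start] !== '`') return null
    const close = source.indexOf('`', start + 1)
    if (close < 0) return null
    const value = source.slice(start + 1, close)
    return { node: { type: 'inlineCode', value }, end: close + 1 }
  },
}

// The scheduler: at each position, offer the input to every tokenizer in
// order; anything left unclaimed accumulates into plain text nodes.
function parseInline(source: string, tokenizers: Tokenizer[]): AstNode[] {
  const nodes: AstNode[] = []
  let textStart = 0
  let i = 0
  const flushText = (end: number): void => {
    if (end > textStart) nodes.push({ type: 'text', value: source.slice(textStart, end) })
  }
  while (i < source.length) {
    let matched = false
    for (const tokenizer of tokenizers) {
      const result = tokenizer.match(source, i)
      if (result === null) continue
      flushText(i)
      nodes.push(result.node)
      i = result.end
      textStart = i
      matched = true
      break
    }
    if (!matched) i += 1
  }
  flushText(source.length)
  return nodes
}

const ast = parseInline('see <https://example.com> or `code`', [autolink, inlineCode])
console.log(JSON.stringify(ast, null, 2))
```

The order of the tokenizer array acts as a priority list in this sketch: the first tokenizer that matches at a position wins.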
Features
- Fully supports all the rules in the GFM specification and passes almost all test cases derived from the examples in the specification. The one exception is https://github.github.com/gfm/#example-653: there is no plan to support native HTML tags in the React renderer for the Yozora AST, so tag filtering has not been implemented. If you need it, you can do the filtering yourself.
  See @yozora/parser-gfm or @yozora/parser-gfm-ex for further information.
- Robust.
  - All code is written in TypeScript, with strict static type checking.
  - ESLint and Prettier constrain the coding style to avoid error-prone constructs such as hacky syntax and shadowed variables.
  - Tested with Jest, passing a large number of test cases.
- Tidy: no third-party dependencies.
- Efficient.
  - The parsing complexity is proportional to the length of the source content multiplied by the number of tokenizers, which reaches the theoretical lower bound.
  - The parser API supports streaming input (using generators / iterators) and can parse while reading. Only block-level data is supported so far.
  - Array creation and concatenation are handled carefully. Arrays are reused as much as possible throughout the matching phase, with array indices alone delimiting the matching range. Many other strategies are applied to reduce duplicated matching and parsing operations.
- Compatible: the parsed syntax tree is compatible with the one defined in [Mdast][mdast-homepage].
  Even if some data types become incompatible in the future, the AST can easily be traversed and adapted through the API provided in @yozora/ast-util.
- Extensible: Yozora comes with a plug-in system that lets it schedule tokenizers through an internal algorithm to complete the parsing tasks.
  - It's easy to create and integrate custom tokenizers.
  - All tokenizers can be mounted or unmounted freely.
  Tokenizers for some data types not mentioned in GFM have been implemented in this repository, such as @yozora/tokenizer-admonition and @yozora/tokenizer-footnote. All of them are built into @yozora/parser by default, and you can unmount any of them if you don't need it.
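
As an illustration of what freely mounting and unmounting tokenizers can look like, the following sketch keeps tokenizers in an ordered registry. The class `TokenizerRegistry` and its methods `mount`, `unmount`, and `names` are assumptions made for this example and are not the API of @yozora/core-parser; consult that package for the real method names and signatures.

```typescript
// Hypothetical registry; the names below are illustrative, not the Yozora API.
interface NamedTokenizer {
  readonly name: string
}

class TokenizerRegistry {
  private readonly tokenizers: NamedTokenizer[] = []

  // Mount a tokenizer. If `before` names a mounted tokenizer, insert in front
  // of it so the new one is tried first; otherwise append to the end.
  public mount(tokenizer: NamedTokenizer, before?: string): this {
    if (this.tokenizers.some(t => t.name === tokenizer.name)) {
      throw new Error(`Tokenizer "${tokenizer.name}" is already mounted.`)
    }
    const index =
      before === undefined ? -1 : this.tokenizers.findIndex(t => t.name === before)
    if (index < 0) this.tokenizers.push(tokenizer)
    else this.tokenizers.splice(index, 0, tokenizer)
    return this
  }

  // Unmount by name; returns whether anything was removed.
  public unmount(tokenizerName: string): boolean {
    const index = this.tokenizers.findIndex(t => t.name === tokenizerName)
    if (index < 0) return false
    this.tokenizers.splice(index, 1)
    return true
  }

  // Names in scheduling order.
  public names(): string[] {
    return this.tokenizers.map(t => t.name)
  }
}

const registry = new TokenizerRegistry()
  .mount({ name: 'paragraph' })
  .mount({ name: 'footnote' })
  .mount({ name: 'admonition' }, 'paragraph')

registry.unmount('footnote')
console.log(registry.names().join(', ')) // admonition, paragraph
```

Keeping the registry ordered matters because, in a scheduler that tries tokenizers in sequence, position in the list decides which tokenizer gets the first chance to match.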
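
The Efficient item in the list above mentions streaming input and index-delimited matching ranges. The sketch below shows one way those two ideas can combine: a generator feeds chunks into a single shared buffer, and each line is reported as a `{ start, end }` index range instead of a copied substring. The names `chunks`, `lineRanges`, and `LineRange` are hypothetical, and this is an assumption about the general technique, not a description of Yozora's internals.

```typescript
// Hypothetical sketch; not the actual @yozora/core-parser implementation.
interface LineRange {
  start: number // inclusive index into the shared buffer
  end: number // exclusive index into the shared buffer
}

// Simulates content arriving in arbitrary chunks, e.g. from a file or socket.
function* chunks(): Generator<string> {
  yield '# Title\nfirst '
  yield 'paragraph\n\nsecond'
  yield ' paragraph\n'
}

// Consume chunks lazily and yield one range per complete line. The buffer is
// only ever appended to, so earlier ranges stay valid and nothing is copied.
function* lineRanges(input: Iterable<string>, buffer: string[]): Generator<LineRange> {
  let lineStart = 0
  for (const chunk of input) {
    for (const ch of chunk) {
      buffer.push(ch)
      if (ch === '\n') {
        yield { start: lineStart, end: buffer.length }
        lineStart = buffer.length
      }
    }
  }
  // Emit a trailing line that has no terminating newline, if any.
  if (lineStart < buffer.length) yield { start: lineStart, end: buffer.length }
}

const buffer: string[] = []
const blankLines: number[] = []
let lineNo = 0
for (const { start, end } of lineRanges(chunks(), buffer)) {
  lineNo += 1
  // A line is blank when its range holds nothing but the line terminator.
  if (end - start === 1 && buffer[start] === '\n') blankLines.push(lineNo)
}
console.log(lineNo, blankLines)
```

Because the consumer pulls one line at a time, the first line is processed before the second chunk is even requested, which is the "parse while reading" behavior in miniature.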
Usage
- YozoraParser
- GfmParser
- GfmExParser
- MarkupWeaver
@yozora/parser: (Recommended) A Markdown parser with rich built-in tokenizers.

```typescript
import YozoraParser from '@yozora/parser'

const parser = new YozoraParser()
parser.parse('source content')
```
@yozora/parser-gfm: A Markdown parser that supports the GFM specification. Its built-in tokenizers support all the grammar in the GFM specification, excluding the extensions described there, such as tables.

```typescript
import GfmParser from '@yozora/parser-gfm'

const parser = new GfmParser()
parser.parse('github flavor markdown contents')
```
@yozora/parser-gfm-ex: A Markdown parser that supports the GFM specification. Its built-in tokenizers support all the grammar in the GFM specification, including the extensions described there, such as tables.

```typescript
import GfmExParser from '@yozora/parser-gfm-ex'

const parser = new GfmExParser()
parser.parse('github flavor markdown contents (with gfm extensions enabled)')
```
@yozora/markup-weaver: Weave an AST back into markup content.

```typescript
import { DefaultMarkupWeaver } from '@yozora/markup-weaver'

const weaver = new DefaultMarkupWeaver()
weaver.weave({
  "type": "root",
  "children": [
    {
      "type": "paragraph",
      "children": [
        {
          "type": "text",
          "value": "emphasis: "
        },
        {
          "type": "strong",
          "children": [
            {
              "type": "text",
              "value": "foo \""
            },
            {
              "type": "emphasis",
              "children": [
                {
                  "type": "text",
                  "value": "bar"
                }
              ]
            },
            {
              "type": "text",
              "value": "\" foo"
            }
          ]
        }
      ]
    }
  ]
})
// => emphasis: **foo "*bar*" foo**
```
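
Since the tree handed to the weaver is plain data, adapting it to another shape takes only a small traversal. The sketch below walks the same tree to collect its text and to rename node types. The helpers `traverse` and `renameTypes`, and the target type names `bold` and `italic`, are hypothetical; @yozora/ast-util ships its own utilities, whose names and signatures may differ.

```typescript
// Hypothetical helpers; @yozora/ast-util provides its own API, which may differ.
interface AstNode {
  type: string
  value?: string
  children?: AstNode[]
}

// Depth-first, pre-order traversal that calls `visit` on every node.
function traverse(node: AstNode, visit: (node: AstNode) => void): void {
  visit(node)
  for (const child of node.children ?? []) traverse(child, visit)
}

// Build a new tree with some node types renamed, leaving the input untouched.
function renameTypes(node: AstNode, mapping: Map<string, string>): AstNode {
  const next: AstNode = { ...node, type: mapping.get(node.type) ?? node.type }
  if (node.children !== undefined) {
    next.children = node.children.map(child => renameTypes(child, mapping))
  }
  return next
}

const tree: AstNode = {
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [
        { type: 'text', value: 'emphasis: ' },
        {
          type: 'strong',
          children: [
            { type: 'text', value: 'foo "' },
            { type: 'emphasis', children: [{ type: 'text', value: 'bar' }] },
            { type: 'text', value: '" foo' },
          ],
        },
      ],
    },
  ],
}

// Collect the plain text content of the tree.
let plainText = ''
traverse(tree, node => {
  if (node.type === 'text') plainText += node.value ?? ''
})

// Adapt the tree for a consumer that expects different type names.
const adapted = renameTypes(tree, new Map([['strong', 'bold'], ['emphasis', 'italic']]))
const types: string[] = []
traverse(adapted, node => {
  types.push(node.type)
})

console.log(plainText) // emphasis: foo "bar" foo
console.log(types.join(' > '))
```

Returning a new tree from `renameTypes` rather than mutating in place keeps the original AST usable for other consumers, such as the weaver.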
FAQ
- How to use Yozora with Gatsby?
  - Try @yozora/gatsby-transformer and @yozora/gatsby-images.
- How to implement a custom tokenizer?
  - Use @yozora/template-tokenizer to create a custom tokenizer from predefined boilerplate.
  - Check @yozora/core-tokenizer for the implementation details of tokenizers.
  - Check @yozora/jest-for-tokenizer for information about testing a custom tokenizer.
  - Check @yozora/core-parser and @yozora/parser for information on how to integrate a custom tokenizer.
  It is also recommended to look at the existing tokenizers for reference.
Contact
License
Yozora is MIT licensed.
Related
- ✨光和尘一直想要一个清爽博客 ("Guanghechen has always wanted a clean, fresh blog"): Why this project was written.
- @yozora/react-markdown: A library for rendering Yozora AST into React components.
- @yozora/html-markdown: A library for rendering Yozora AST into html strings.