@yozora/tokenizer-fenced-code
A code fence is a sequence of at least three consecutive backtick characters (`) or tildes (~). (Tildes and backticks cannot be mixed.) A fenced code block begins with a code fence, indented no more than three spaces.
The line with the opening code fence may optionally contain some text following the code fence; this is trimmed of leading and trailing whitespace and called the info string. If the info string comes after a backtick fence, it may not contain any backtick characters. (The reason for this restriction is that otherwise some inline code would be incorrectly interpreted as the beginning of a fenced code block.)
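The opening-fence rules above can be sketched as a small line matcher. This is an illustration of the spec text only, not the tokenizer's actual implementation; `OpeningFence` and `matchOpeningFence` are hypothetical names introduced here.

```typescript
// Hypothetical helper for illustration; not part of @yozora/tokenizer-fenced-code.
interface OpeningFence {
  indent: number // number of leading spaces (0 to 3)
  marker: '`' | '~' // the fence character
  length: number // number of fence characters (at least 3)
  info: string // trimmed info string, possibly empty
}

function matchOpeningFence(line: string): OpeningFence | null {
  // Up to three spaces of indentation, then a run of 3+ backticks or 3+ tildes.
  const m = /^( {0,3})(`{3,}|~{3,})(.*)$/.exec(line)
  if (m === null) return null
  const [, spaces = '', fence = '', rest = ''] = m
  const marker = fence.charAt(0) as '`' | '~'
  // The info string is the remaining text, trimmed of surrounding whitespace.
  const info = rest.trim()
  // After a backtick fence, the info string may not contain backticks.
  if (marker === '`' && info.indexOf('`') !== -1) return null
  return { indent: spaces.length, marker, length: fence.length, info }
}
```

For example, `matchOpeningFence('  ~~~~typescript  ')` yields an indent of 2, a fence length of 4 and the info string `typescript`, while a line of only two backticks does not match.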
The content of the code block consists of all subsequent lines, until a closing code fence of the same type as the code block began with (backticks or tildes), and with at least as many backticks or tildes as the opening code fence. If the leading code fence is indented N spaces, then up to N spaces of indentation are removed from each line of the content (if present). (If a content line is not indented, it is preserved unchanged. If it is indented less than N spaces, all of the indentation is removed.)
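The indentation rule can be illustrated with a minimal sketch. `stripFenceIndent` is a hypothetical helper that handles space characters only (tabs are ignored as a simplification); it is not the tokenizer's own code.

```typescript
// Hypothetical helper for illustration; considers space characters only.
function stripFenceIndent(line: string, fenceIndent: number): string {
  let i = 0
  // Remove at most `fenceIndent` leading spaces, and fewer if the line has fewer.
  while (i < fenceIndent && i < line.length && line.charAt(i) === ' ') i += 1
  return line.slice(i)
}
```

With an opening fence indented two spaces, `stripFenceIndent('    aaa', 2)` keeps two of the four spaces, `stripFenceIndent(' aaa', 2)` removes the single space, and `stripFenceIndent('aaa', 2)` returns the line unchanged.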
The closing code fence may be indented up to three spaces, and may be followed only by spaces, which are ignored. If the end of the containing block (or document) is reached and no closing code fence has been found, the code block contains all of the lines after the opening code fence until the end of the containing block (or document). (An alternative spec would require backtracking in the event that a closing code fence is not found. But this makes parsing much less efficient, and there seems to be no real down side to the behavior described here.)
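As a rough sketch of the closing-fence rules and the no-backtracking behavior described above, the following collects content lines until a matching closing fence or the end of input. `isClosingFence` and `collectFencedContent` are hypothetical names; the real tokenizer works inside the parser's match-block life cycle rather than on a plain array of lines.

```typescript
// Hypothetical helpers for illustration; not the tokenizer's implementation.
function isClosingFence(line: string, marker: '`' | '~', minLength: number): boolean {
  // Up to three spaces of indentation, a fence run, then only trailing spaces.
  const m = /^ {0,3}(`{3,}|~{3,}) *$/.exec(line)
  if (m === null) return false
  const [, fence = ''] = m
  // Same fence character, and at least as long as the opening fence.
  return fence.charAt(0) === marker && fence.length >= minLength
}

function collectFencedContent(
  lines: string[],
  marker: '`' | '~',
  openingLength: number,
): { content: string[]; closed: boolean } {
  const content: string[] = []
  for (const line of lines) {
    if (isClosingFence(line, marker, openingLength)) {
      return { content, closed: true }
    }
    content.push(line)
  }
  // No closing fence found: every remaining line belongs to the code block.
  return { content, closed: false }
}
```

Here `lines` stands for the lines that follow the opening fence inside the containing block. A single forward pass is enough because an unclosed block simply runs to the end of its container.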
- See the GitHub Flavored Markdown spec for details.
- See Live Examples for an intuitive impression.
Install
- npm: npm install --save @yozora/tokenizer-fenced-code
- Yarn: yarn add @yozora/tokenizer-fenced-code
- pnpm: pnpm add @yozora/tokenizer-fenced-code
Usage
@yozora/tokenizer-fenced-code has been integrated into @yozora/parser / @yozora/parser-gfm-ex / @yozora/parser-gfm, so you can use YozoraParser / GfmExParser / GfmParser directly.
- Basic Usage
- YozoraParser
- GfmParser
- GfmExParser
@yozora/tokenizer-fenced-code cannot be used alone; it must be registered in a Parser as a plugin before it can be used.
import { DefaultParser } from '@yozora/core-parser'
import ParagraphTokenizer from '@yozora/tokenizer-paragraph'
import TextTokenizer from '@yozora/tokenizer-text'
import FencedCodeTokenizer from '@yozora/tokenizer-fenced-code'
const parser = new DefaultParser()
  .useFallbackTokenizer(new ParagraphTokenizer())
  .useFallbackTokenizer(new TextTokenizer())
  .useTokenizer(new FencedCodeTokenizer())
// parse source markdown content
parser.parse(`
\`\`\`ruby
def foo(x)
return 3
end
\`\`\`
~~~typescript
export const foo: string = 'waw'
~~~
# baz
`)
import YozoraParser from '@yozora/parser'
const parser = new YozoraParser()
// parse source markdown content
parser.parse(`
\`\`\`ruby
def foo(x)
return 3
end
\`\`\`
~~~typescript
export const foo: string = 'waw'
~~~
# baz
`)
import GfmParser from '@yozora/parser-gfm'
const parser = new GfmParser()
// parse source markdown content
parser.parse(`
\`\`\`ruby
def foo(x)
return 3
end
\`\`\`
~~~typescript
export const foo: string = 'waw'
~~~
# baz
`)
import GfmExParser from '@yozora/parser-gfm-ex'
const parser = new GfmExParser()
// parse source markdown content
parser.parse(`
\`\`\`ruby
def foo(x)
return 3
end
\`\`\`
~~~typescript
export const foo: string = 'waw'
~~~
# baz
`)
Options
Name | Type | Required | Default |
---|---|---|---|
name | string | false | "@yozora/tokenizer-fenced-code" |
priority | number | false | TokenizerPriority.FENCED_BLOCK |
- name: The unique name of the tokenizer, used to bind the tokens it generates and to determine which tokenizer should be called in each life cycle of a token throughout the matching / parsing phase.
- priority: The priority of the tokenizer, which determines the processing order; higher-priority tokenizers run first. In addition, in the match-block stage, a high-priority tokenizer can interrupt the matching process of a low-priority tokenizer.
Types
@yozora/tokenizer-fenced-code produces Code type nodes. See @yozora/ast for full base types.
import type { Literal } from '@yozora/ast'
export const CodeType = 'code'
export type CodeType = typeof CodeType
/**
 * Code represents a block of preformatted text, such as ASCII art or computer
 * code.
 * @see https://github.com/syntax-tree/mdast#code
 * @see https://github.github.com/gfm/#code-fence
 */
export interface Code extends Literal<CodeType> {
  /**
   * Language of the code
   */
  lang?: string
  /**
   * Meta info string
   */
  meta?: string
}
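To show how an info string might map onto the `lang` and `meta` fields, here is a minimal sketch. It assumes the mdast convention that the first whitespace-delimited word is the language and the remainder is the meta string; whether the tokenizer splits the info string exactly this way is an assumption. `CodeLike` is a simplified local stand-in for the `Code` type (the real one comes from @yozora/ast), and `buildCodeNode` is a hypothetical helper.

```typescript
// Simplified local stand-in for the Code node type, for illustration only.
interface CodeLike {
  type: 'code'
  value: string
  lang?: string
  meta?: string
}

// Hypothetical helper: first word of the info string becomes `lang`,
// the remainder (if any) becomes `meta`.
function buildCodeNode(info: string, value: string): CodeLike {
  const node: CodeLike = { type: 'code', value }
  const trimmed = info.trim()
  if (trimmed.length === 0) return node
  const firstSpace = trimmed.search(/\s/)
  if (firstSpace < 0) {
    node.lang = trimmed
    return node
  }
  node.lang = trimmed.slice(0, firstSpace)
  const meta = trimmed.slice(firstSpace).trim()
  if (meta.length > 0) node.meta = meta
  return node
}
```

Under these assumptions, an info string of `ruby startline=3` gives `lang` of `ruby` and `meta` of `startline=3`, and an empty info string leaves both fields unset.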
Live Examples
- Basic.
- Fewer than three backticks is not enough.
- The closing code fence must use the same character as the opening fence.
- The closing code fence must be at least as long as the opening fence.
- Unclosed code blocks are closed by the end of the document (or the enclosing block quote or list item).
- A code block can have all empty lines as its content.
- A code block can be empty.
- Fences can be indented. If the opening fence is indented, content lines will have equivalent opening indentation removed, if present.
- Four spaces indentation produces an indented code block.
- Closing fences may be indented by 0-3 spaces, and their indentation need not match that of the opening fence.
- This is not a closing fence, because it is indented 4 spaces.
- Code fences (opening and closing) cannot contain internal spaces.
- Fenced code blocks can interrupt paragraphs, and can be followed directly by paragraphs, without a blank line between.
- Other blocks can also occur before and after fenced code blocks without an intervening blank line.
- An info string can be provided after the opening code fence. Although this spec doesn’t mandate any particular treatment of the info string, the first word is typically used to specify the language of the code block.
- Info strings for backtick code blocks cannot contain backticks.
- Info strings for tilde code blocks can contain backticks and tildes.
- Closing code fences cannot have info strings.