Version: 2.x.x

@yozora/tokenizer-html-block

GitHub Flavored Markdown spec

An HTML block is a group of lines that is treated as raw HTML (and will not be escaped in HTML output).

There are seven kinds of HTML block, which can be defined by their start and end conditions. The block begins with a line that meets a start condition (after up to three spaces of optional indentation). It ends with the first subsequent line that meets a matching end condition, or the last line of the document, or the last line of the container block containing the current HTML block, if no line is encountered that meets the end condition. If the first line meets both the start condition and the end condition, the block will contain just that line.

1. Start condition: line begins with the string <script, <pre, or <style (case-insensitive), followed by whitespace, the string >, or the end of the line.

   End condition: line contains an end tag </script>, </pre>, or </style> (case-insensitive; it need not match the start tag).

2. Start condition: line begins with the string <!--.

   End condition: line contains the string -->.

3. Start condition: line begins with the string <?.

   End condition: line contains the string ?>.

4. Start condition: line begins with the string <! followed by an uppercase ASCII letter.

   End condition: line contains the character >.

5. Start condition: line begins with the string <![CDATA[.

   End condition: line contains the string ]]>.

6. Start condition: line begins with the string < or </ followed by one of the strings (case-insensitive) address, article, aside, base, basefont, blockquote, body, caption, center, col, colgroup, dd, details, dialog, dir, div, dl, dt, fieldset, figcaption, figure, footer, form, frame, frameset, h1, h2, h3, h4, h5, h6, head, header, hr, html, iframe, legend, li, link, main, menu, menuitem, nav, noframes, ol, optgroup, option, p, param, section, source, summary, table, tbody, td, tfoot, th, thead, title, tr, track, ul, followed by whitespace, the end of the line, the string >, or the string />.

   End condition: line is followed by a blank line.

7. Start condition: line begins with a complete open tag (with any tag name other than script, style, or pre) or a complete closing tag, followed only by whitespace or the end of the line.

   End condition: line is followed by a blank line.
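
The seven start conditions can be checked with a handful of regular expressions. The following is a minimal sketch, not the tokenizer's actual implementation: `detectStartCondition`, `CONDITION_6_TAGS`, and the tag patterns are hypothetical names introduced here, the open-tag pattern is a simplified reading of the spec's grammar, and the sketch ignores surrounding context such as whether the line sits inside a paragraph.

```typescript
// Hypothetical helper for illustration; not part of @yozora/tokenizer-html-block.
type HtmlBlockCondition = 1 | 2 | 3 | 4 | 5 | 6 | 7

// Tag names that trigger condition 6, copied from the list above.
const CONDITION_6_TAGS = new Set<string>(
  (
    'address article aside base basefont blockquote body caption center col ' +
    'colgroup dd details dialog dir div dl dt fieldset figcaption figure ' +
    'footer form frame frameset h1 h2 h3 h4 h5 h6 head header hr html iframe ' +
    'legend li link main menu menuitem nav noframes ol optgroup option p ' +
    'param section source summary table tbody td tfoot th thead title tr ' +
    'track ul'
  ).split(' '),
)

// Simplified patterns for a complete open tag / closing tag alone on a line.
const ATTRIBUTE =
  /\s+[A-Za-z_:][\w.:-]*(?:\s*=\s*(?:[^\s"'=<>`]+|'[^']*'|"[^"]*"))?/.source
const OPEN_TAG = new RegExp(
  '^<([A-Za-z][A-Za-z0-9-]*)(?:' + ATTRIBUTE + ')*\\s*/?>\\s*$',
)
const CLOSING_TAG = /^<\/[A-Za-z][A-Za-z0-9-]*\s*>\s*$/

function detectStartCondition(rawLine: string): HtmlBlockCondition | null {
  // Up to three spaces of indentation may precede the start condition.
  const line = rawLine.replace(/^ {0,3}/, '')
  if (!line.startsWith('<')) return null

  if (/^<(?:script|pre|style)(?:\s|>|$)/i.test(line)) return 1
  if (line.startsWith('<!--')) return 2
  if (line.startsWith('<?')) return 3
  if (/^<![A-Z]/.test(line)) return 4
  if (line.startsWith('<![CDATA[')) return 5

  // Condition 6: "<" or "</" plus a listed tag name, then whitespace, ">", "/>" or end of line.
  const named = /^<\/?([A-Za-z][A-Za-z0-9]*)(?:\s|\/?>|$)/.exec(line)
  if (named !== null && CONDITION_6_TAGS.has(named[1].toLowerCase())) return 6

  // Condition 7: a complete open tag (not script, style, or pre) or closing tag, alone on the line.
  const open = OPEN_TAG.exec(line)
  if (open !== null) return /^(?:script|style|pre)$/i.test(open[1]) ? null : 7
  return CLOSING_TAG.test(line) ? 7 : null
}

console.log(detectStartCondition('<table>')) // 6
console.log(detectStartCondition('<Warning>')) // 7
console.log(detectStartCondition('    <div>')) // null (four spaces of indentation)
```

The checks run in the order of the list above, so a line such as `<pre language="haskell"><code>` is classified under condition 1 before the more general tag patterns are tried.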

HTML blocks continue until they are closed by their appropriate end condition, or the last line of the document or other container block. This means any HTML within an HTML block that might otherwise be recognised as a start condition will be ignored by the parser and passed through as-is, without changing the parser’s state.
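
Once a start condition is known, finding the end of the block is a linear scan. The sketch below assumes the document is already split into lines and that the start condition number has been determined; `findHtmlBlockEnd` and `END_PATTERNS` are hypothetical names, and container blocks (which can also end an HTML block) are not modeled.

```typescript
// Hypothetical helper for illustration; not part of @yozora/tokenizer-html-block.
type HtmlBlockCondition = 1 | 2 | 3 | 4 | 5 | 6 | 7

// End-condition patterns for conditions 1-5; conditions 6 and 7 end before a blank line.
const END_PATTERNS: Record<1 | 2 | 3 | 4 | 5, RegExp> = {
  1: /<\/(?:script|pre|style)>/i,
  2: /-->/,
  3: /\?>/,
  4: />/,
  5: /\]\]>/,
}

/**
 * Returns the index of the last line of the HTML block that starts at
 * `lines[start]`, given the start condition that line satisfied.
 */
function findHtmlBlockEnd(
  lines: string[],
  start: number,
  condition: HtmlBlockCondition,
): number {
  if (condition === 6 || condition === 7) {
    // The block ends with the first line that is followed by a blank line.
    for (let i = start; i < lines.length - 1; i += 1) {
      if (lines[i + 1].trim() === '') return i
    }
    return lines.length - 1
  }

  const pattern = END_PATTERNS[condition]
  for (let i = start; i < lines.length; i += 1) {
    // The scan includes the first line, which may meet both conditions.
    if (pattern.test(lines[i])) return i
  }
  // No end condition met: the block runs to the last line of the document.
  return lines.length - 1
}

const source = [
  '<pre language="haskell"><code>',
  'import Text.HTML.TagSoup',
  '',
  'main :: IO ()',
  'main = print $ parseTags tags',
  '</code></pre>',
  'okay',
]
console.log(findHtmlBlockEnd(source, 0, 1)) // 5
```

Because the scan only looks for the end pattern of the block's own condition, the blank line inside the `<pre>` block does not close it, and any nested markup that looks like another start condition is passed through unchanged.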

See the GitHub Flavored Markdown spec for details.
See Live Examples for a quick, concrete impression.

Install

npm:

npm install --save @yozora/tokenizer-html-block

Yarn:

yarn add @yozora/tokenizer-html-block

pnpm:

pnpm add @yozora/tokenizer-html-block

Usage

Tip: @yozora/tokenizer-html-block has been integrated into @yozora/parser / @yozora/parser-gfm-ex / @yozora/parser-gfm, so you can use YozoraParser / GfmExParser / GfmParser directly.

Basic Usage

@yozora/tokenizer-html-block cannot be used on its own; it must be registered in a parser as a plugin before it can be used.

import { DefaultParser } from '@yozora/core-parser'
import ParagraphTokenizer from '@yozora/tokenizer-paragraph'
import TextTokenizer from '@yozora/tokenizer-text'
import HtmlBlockTokenizer from '@yozora/tokenizer-html-block'

const parser = new DefaultParser()
  .useFallbackTokenizer(new ParagraphTokenizer())
  .useFallbackTokenizer(new TextTokenizer())
  .useTokenizer(new HtmlBlockTokenizer())

// parse source markdown content
parser.parse(`
<pre language="haskell"><code>
import Text.HTML.TagSoup

main :: IO ()
main = print $ parseTags tags
</code></pre>
okay
`)

YozoraParser:

import YozoraParser from '@yozora/parser'

const parser = new YozoraParser()

// parse source markdown content
parser.parse(`
<pre language="haskell"><code>
import Text.HTML.TagSoup

main :: IO ()
main = print $ parseTags tags
</code></pre>
okay
`)

GfmParser:

import GfmParser from '@yozora/parser-gfm'

const parser = new GfmParser()

// parse source markdown content
parser.parse(`
<pre language="haskell"><code>
import Text.HTML.TagSoup

main :: IO ()
main = print $ parseTags tags
</code></pre>
okay
`)

GfmExParser:

import GfmExParser from '@yozora/parser-gfm-ex'

const parser = new GfmExParser()

// parse source markdown content
parser.parse(`
<pre language="haskell"><code>
import Text.HTML.TagSoup

main :: IO ()
main = print $ parseTags tags
</code></pre>
okay
`)

Options

Name	Type	Required	Default
`name`	`string`	`false`	`"@yozora/tokenizer-html-block"`
`priority`	`number`	`false`	`TokenizerPriority.ATOMIC`

name: The unique name of the tokenizer. It binds each generated token to this tokenizer, so that the correct tokenizer is called at every life-cycle stage of the token throughout the matching / parsing phase.
priority: The priority of the tokenizer, which determines the processing order; higher-priority tokenizers run first. In addition, in the match-block stage, a high-priority tokenizer can interrupt the matching process of a low-priority tokenizer.

Types

@yozora/tokenizer-html-block produces Html type nodes. See @yozora/ast for the full base types.

import type { Literal } from '@yozora/ast'

export const HtmlType = 'html'
export type HtmlType = typeof HtmlType

/**
 * HTML (Literal) represents a fragment of raw HTML.
 * @see https://github.com/syntax-tree/mdast#html
 * @see https://github.github.com/gfm/#html-blocks
 * @see https://github.github.com/gfm/#raw-html
 */
export type Html = Literal<HtmlType>
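
To show how the Html type might be consumed, the sketch below walks a syntax tree and collects the raw HTML fragments. It is illustrative only: `AstNode`, `AstParent`, and `HtmlNode` are local stand-ins (assumed, simplified shapes) rather than the real @yozora/ast exports, and it assumes that parent nodes expose a `children` array and that Html nodes carry their raw text in `value`, as in mdast.

```typescript
// Local, simplified stand-ins for the @yozora/ast shapes (assumptions for this sketch).
interface AstNode {
  type: string
}
interface AstParent extends AstNode {
  children: AstNode[]
}
interface HtmlNode extends AstNode {
  type: 'html'
  value: string
}

// Type guard: narrows a generic node to an Html node.
function isHtml(node: AstNode): node is HtmlNode {
  return node.type === 'html'
}

// Depth-first walk that gathers the raw HTML of every Html node, in document order.
function collectHtmlValues(node: AstNode, out: string[] = []): string[] {
  if (isHtml(node)) {
    out.push(node.value)
    return out
  }
  const children = (node as Partial<AstParent>).children
  if (Array.isArray(children)) {
    for (const child of children) collectHtmlValues(child, out)
  }
  return out
}

// A hand-written tree standing in for a parse result.
const sampleTree = {
  type: 'root',
  children: [
    { type: 'html', value: '<!DOCTYPE html>' },
    { type: 'paragraph', children: [{ type: 'text', value: 'okay' }] },
  ],
}
console.log(collectHtmlValues(sampleTree)) // [ '<!DOCTYPE html>' ]
```

With the real package, the tree would come from `parser.parse(...)` and the node types from @yozora/ast; the hand-written tree only keeps the example self-contained.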

Live Examples

<script>, <pre>, or <style> (Condition 1)

Example #139:

<pre language="haskell"><code>
import Text.HTML.TagSoup

main :: IO ()
main = print $ parseTags tags
</code></pre>
okay

Comment (Condition 2)

Example #148:

<!-- Foo

bar
   baz -->
okay

Processing instruction (Condition 3)

Example #149:

<?php

  echo '>';

?>
okay

Declaration (Condition 4)

Example #150:

<!DOCTYPE html>

CDATA (Condition 5)

Example #151:

<![CDATA[
function matchwo(a,b)
{
  if (a < b && a < 0) then {
    return 1;

  } else {

    return 0;
  }
}
]]>
okay

Block-level tag (Condition 6)

Example #119:

<table>
  <tr>
    <td>
           hi
    </td>
  </tr>
</table>

okay.

Other open or closing tag (Condition 7)

Example #133:

<Warning>
*bar*
</Warning>
