@yozora/tokenizer-html-inline
Text between `<` and `>` that looks like an HTML tag is parsed as a raw HTML tag
and will be rendered in HTML without escaping. Tag and attribute names are not
limited to current HTML tags, so custom tags (and even, say, DocBook tags) may
be used.
Here is the grammar for tags:
A tag name consists of an ASCII letter followed by zero or more
ASCII letters, digits, or hyphens (`-`).
An attribute consists of whitespace, an attribute name, and an optional attribute value specification.
An attribute name consists of an ASCII letter, `_`, or `:`,
followed by zero or more ASCII letters, digits, `_`, `.`, `:`, or `-`.
(Note: This is the XML specification restricted to ASCII. HTML5 is laxer.)
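The two name rules above translate fairly directly into regular expressions. The following is a minimal sketch, not the tokenizer's actual implementation; the pattern and function names (`TAG_NAME_PATTERN`, `isTagName`, and so on) are hypothetical and exist only for illustration.

```typescript
// Illustrative patterns only; not taken from the @yozora/tokenizer-html-inline source.

// Tag name: an ASCII letter, then zero or more ASCII letters, digits, or hyphens.
const TAG_NAME_PATTERN = /^[A-Za-z][A-Za-z0-9-]*$/

// Attribute name: an ASCII letter, `_`, or `:`, then zero or more
// ASCII letters, digits, `_`, `.`, `:`, or `-`.
const ATTRIBUTE_NAME_PATTERN = /^[A-Za-z_:][A-Za-z0-9_.:-]*$/

function isTagName(candidate: string): boolean {
  return TAG_NAME_PATTERN.test(candidate)
}

function isAttributeName(candidate: string): boolean {
  return ATTRIBUTE_NAME_PATTERN.test(candidate)
}

console.log(isTagName('c2c')) // true
console.log(isTagName('2c')) // false: a tag name must start with a letter
console.log(isAttributeName('xlink:href')) // true
console.log(isAttributeName('-data')) // false: a hyphen cannot come first
```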
An attribute value specification consists of optional whitespace, a `=` character,
optional whitespace, and an attribute value.
An attribute value consists of an unquoted attribute value, a single-quoted attribute value, or a double-quoted attribute value.
An unquoted attribute value is a nonempty string of characters not including
whitespace, `"`, `'`, `=`, `<`, `>`, or `` ` ``.
A single-quoted attribute value consists of `'`, zero or more characters not including `'`, and a final `'`.
A double-quoted attribute value consists of `"`, zero or more characters not including `"`, and a final `"`.
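As a rough illustration of how the attribute rules compose, the sketch below builds one pattern for a complete attribute out of the three value forms. It is an assumption-laden sketch rather than the tokenizer's own code: all names are hypothetical, and `\s` is used as an approximation of "whitespace".

```typescript
// Illustrative patterns only. `\s` stands in for "whitespace" here, which is
// an approximation chosen for this sketch.
const UNQUOTED_VALUE = /[^\s"'=<>`]+/
const SINGLE_QUOTED_VALUE = /'[^']*'/
const DOUBLE_QUOTED_VALUE = /"[^"]*"/

// An attribute value is any one of the three forms.
const ATTRIBUTE_VALUE =
  '(?:' +
  [UNQUOTED_VALUE.source, SINGLE_QUOTED_VALUE.source, DOUBLE_QUOTED_VALUE.source].join('|') +
  ')'

// Attribute: whitespace, an attribute name, and an optional value specification
// (optional whitespace, `=`, optional whitespace, an attribute value).
const ATTRIBUTE_PATTERN = new RegExp(
  '^\\s+[A-Za-z_:][A-Za-z0-9_.:-]*(?:\\s*=\\s*' + ATTRIBUTE_VALUE + ')?$',
)

function isAttribute(candidate: string): boolean {
  return ATTRIBUTE_PATTERN.test(candidate)
}

console.log(isAttribute(' href="x"')) // true
console.log(isAttribute(" title='a b'")) // true
console.log(isAttribute(' disabled')) // true: the value specification is optional
console.log(isAttribute('href="x"')) // false: the leading whitespace is required
```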
An open tag consists of a `<` character, a tag name, zero or more attributes,
optional whitespace, an optional `/` character, and a `>` character.
A closing tag consists of the string `</`, a tag name, optional whitespace, and the character `>`.
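The open-tag and closing-tag rules can be sketched by combining the tag-name and attribute patterns. This is only an illustration under the same assumptions as the grammar text (with `\s` approximating "whitespace"); the names are hypothetical and the real tokenizer may be implemented quite differently.

```typescript
// Illustrative patterns only; not taken from the @yozora/tokenizer-html-inline source.
const TAG_NAME_PART = /[A-Za-z][A-Za-z0-9-]*/
const ATTRIBUTE_PART =
  /\s+[A-Za-z_:][A-Za-z0-9_.:-]*(?:\s*=\s*(?:[^\s"'=<>`]+|'[^']*'|"[^"]*"))?/

// Open tag: `<`, tag name, zero or more attributes, optional whitespace,
// optional `/`, and `>`.
const OPEN_TAG_PATTERN = new RegExp(
  '^<' + TAG_NAME_PART.source + '(?:' + ATTRIBUTE_PART.source + ')*\\s*/?>$',
)

// Closing tag: `</`, tag name, optional whitespace, and `>`.
const CLOSING_TAG_PATTERN = new RegExp('^</' + TAG_NAME_PART.source + '\\s*>$')

function isOpenTag(candidate: string): boolean {
  return OPEN_TAG_PATTERN.test(candidate)
}

function isClosingTag(candidate: string): boolean {
  return CLOSING_TAG_PATTERN.test(candidate)
}

console.log(isOpenTag('<c2c>')) // true
console.log(isOpenTag('<a foo="bar" _boolean zoop:33=zoop:33 />')) // true
console.log(isOpenTag('<33>')) // false: a tag name cannot start with a digit
console.log(isClosingTag('</a >')) // true
console.log(isClosingTag('</a href="foo">')) // false: closing tags take no attributes
```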
An HTML comment consists of `<!--` + text + `-->`, where text does not start with `>`
or `->`, does not end with `-`, and does not contain `--`. (See the HTML5 spec.)
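The comment rule has several negative conditions, so it may be easier to read as a step-by-step check than as a single pattern. The function below is a hypothetical sketch that applies the conditions exactly as stated in the paragraph above; it is not the tokenizer's own implementation.

```typescript
// Illustrative check only; `isHtmlComment` is a hypothetical helper.
function isHtmlComment(candidate: string): boolean {
  const opener = '<!--'
  const closer = '-->'

  // Both delimiters must be present and must not overlap.
  if (candidate.length < opener.length + closer.length) return false
  if (candidate.slice(0, opener.length) !== opener) return false
  if (candidate.slice(-closer.length) !== closer) return false

  const body = candidate.slice(opener.length, candidate.length - closer.length)

  // The text must not start with `>` or `->`.
  if (body.charAt(0) === '>' || body.slice(0, 2) === '->') return false
  // The text must not end with `-`.
  if (body.charAt(body.length - 1) === '-') return false
  // The text must not contain `--`.
  return body.indexOf('--') === -1
}

console.log(isHtmlComment('<!-- this is a comment - with hyphen -->')) // true
console.log(isHtmlComment('<!-- not a comment -- two hyphens -->')) // false
console.log(isHtmlComment('<!--> foo -->')) // false
console.log(isHtmlComment('<!-- foo--->')) // false
```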
A processing instruction consists of the string `<?`, a string of characters
not including the string `?>`, and the string `?>`.
A declaration consists of the string `<!`, a name consisting of one or more
uppercase ASCII letters, whitespace, a string of characters not including
the character `>`, and the character `>`.
A CDATA section consists of the string `<![CDATA[`, a string of characters
not including the string `]]>`, and the string `]]>`.
An HTML tag consists of an open tag, a closing tag, an HTML comment, a processing instruction, a declaration, or a CDATA section.
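The remaining three constructs (processing instructions, declarations, and CDATA sections) all follow an "opener, body without the terminator, terminator" shape. The patterns below are a hedged sketch of those rules; the names are hypothetical, `\s` approximates "whitespace", and none of this is taken from the package source.

```typescript
// Illustrative patterns only; not taken from the @yozora/tokenizer-html-inline source.

// `<?`, any characters not containing `?>`, then `?>`.
const PROCESSING_INSTRUCTION_PATTERN = /^<\?(?:(?!\?>)[\s\S])*\?>$/

// `<!`, one or more uppercase ASCII letters, whitespace,
// any characters other than `>`, then `>`.
const DECLARATION_PATTERN = /^<![A-Z]+\s+[^>]*>$/

// `<![CDATA[`, any characters not containing `]]>`, then `]]>`.
const CDATA_SECTION_PATTERN = /^<!\[CDATA\[(?:(?!\]\]>)[\s\S])*\]\]>$/

function isProcessingInstruction(candidate: string): boolean {
  return PROCESSING_INSTRUCTION_PATTERN.test(candidate)
}

function isDeclaration(candidate: string): boolean {
  return DECLARATION_PATTERN.test(candidate)
}

function isCdataSection(candidate: string): boolean {
  return CDATA_SECTION_PATTERN.test(candidate)
}

console.log(isProcessingInstruction('<?php echo $a; ?>')) // true
console.log(isDeclaration('<!ELEMENT br EMPTY>')) // true
console.log(isDeclaration('<!doctype html>')) // false: the name must be uppercase
console.log(isCdataSection('<![CDATA[>&<]]>')) // true
```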
- See the GitHub Flavored Markdown spec for details.
- See Live Examples for an intuitive impression.
Install
- npm: `npm install --save @yozora/tokenizer-html-inline`
- Yarn: `yarn add @yozora/tokenizer-html-inline`
- pnpm: `pnpm add @yozora/tokenizer-html-inline`
Usage
@yozora/tokenizer-html-inline has been integrated into @yozora/parser / @yozora/parser-gfm-ex / @yozora/parser-gfm,
so you can use `YozoraParser` / `GfmExParser` / `GfmParser` directly.
- Basic Usage
- YozoraParser
- GfmParser
- GfmExParser
Basic Usage: @yozora/tokenizer-html-inline cannot be used on its own; it must be registered with a parser as a plugin before it can be used.
import { DefaultParser } from '@yozora/core-parser'
import ParagraphTokenizer from '@yozora/tokenizer-paragraph'
import TextTokenizer from '@yozora/tokenizer-text'
import HtmlInlineTokenizer from '@yozora/tokenizer-html-inline'
const parser = new DefaultParser()
  .useFallbackTokenizer(new ParagraphTokenizer())
  .useFallbackTokenizer(new TextTokenizer())
  .useTokenizer(new HtmlInlineTokenizer())
// parse source markdown content
parser.parse(`
<a><bab><c2c>
foo <?php echo $a; ?>
`)
// Alternative: use YozoraParser, which already includes this tokenizer
import YozoraParser from '@yozora/parser'
const parser = new YozoraParser()
// parse source markdown content
parser.parse(`
<a><bab><c2c>
foo <?php echo $a; ?>
`)
// Alternative: use GfmParser, which already includes this tokenizer
import GfmParser from '@yozora/parser-gfm'
const parser = new GfmParser()
// parse source markdown content
parser.parse(`
<a><bab><c2c>
foo <?php echo $a; ?>
`)
// Alternative: use GfmExParser, which already includes this tokenizer
import GfmExParser from '@yozora/parser-gfm-ex'
const parser = new GfmExParser()
// parse source markdown content
parser.parse(`
<a><bab><c2c>
foo <?php echo $a; ?>
`)
Options
Name | Type | Required | Default |
---|---|---|---|
name | string | false | "@yozora/tokenizer-html-inline" |
priority | number | false | TokenizerPriority.ATOMIC |
- `name`: The unique name of the tokenizer. It binds the tokens the tokenizer generates, so that the correct tokenizer can be called in each life cycle of a token throughout the matching / parsing phase.
- `priority`: The priority of the tokenizer, which determines the processing order: higher-priority tokenizers run first. In addition, in the `match-block` stage, a high-priority tokenizer can interrupt the matching process of a low-priority tokenizer. Exception: delimiters of type `full` are always processed before delimiters of other types.
Types
@yozora/tokenizer-html-inline produces Html type nodes. See @yozora/ast for full base types.
import type { Literal } from '@yozora/ast'
export const HtmlType = 'html'
export type HtmlType = typeof HtmlType
/**
* HTML (Literal) represents a fragment of raw HTML.
* @see https://github.com/syntax-tree/mdast#html
* @see https://github.github.com/gfm/#html-blocks
* @see https://github.github.com/gfm/#raw-html
*/
export type Html = Literal<HtmlType>
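To show how a consumer might pick these nodes out of a list of inline nodes, here is a small self-contained sketch. `SketchNode` and `SketchHtml` are hypothetical local stand-ins that only mirror the documented shape (a `type` of `'html'` plus a string `value`); they are not imported from @yozora/ast, and the sample nodes are written by hand rather than produced by a parser.

```typescript
// Hypothetical stand-ins mirroring the documented node shape.
// These are NOT the real @yozora/ast types.
interface SketchNode {
  type: string
  value?: string
}

interface SketchHtml extends SketchNode {
  type: 'html'
  value: string
}

// Type guard: narrows a generic node to the html literal shape.
function isHtmlNode(node: SketchNode): node is SketchHtml {
  return node.type === 'html' && typeof node.value === 'string'
}

// Collects the raw HTML fragments from a flat list of nodes.
function collectRawHtml(nodes: SketchNode[]): string[] {
  const fragments: string[] = []
  for (const node of nodes) {
    if (isHtmlNode(node)) fragments.push(node.value)
  }
  return fragments
}

// Hand-written sample data, not actual parser output.
const sampleNodes: SketchNode[] = [
  { type: 'text', value: 'foo ' },
  { type: 'html', value: '<?php echo $a; ?>' },
]

console.log(collectRawHtml(sampleNodes)) // [ '<?php echo $a; ?>' ]
```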
Live Examples
- Opening.
- Closing.
- Comments.
- Processing instruction.
- Declaration.
- CDATA section.