astro/packages/integrations/markdoc/src/tokenizer.ts


Add "allowHTML" option for Markdoc with HTML parsing/processing (#7597) * 7576 - initial support for HTML inside Markdoc. This uses htmlparser2 to perform a pure token transform/mutation on the markdown-it tokens, replacing the original raw HTML string tokens with a richer set of tokens per HTML node, and in the process Markdoc tags are interleaved in the resulting token graph at the appropriate locations This removes the legacy config of the @astrojs/markdoc integration entirely (suggested by @bholmesdev) and introduces a new type for options to be specified in the astro config, initially, with just the new "enableHTML" option When "enableHTML" is *not* enabled (the default), the behavior of the entire @astrojs/markdoc integration should remain functionally equivalent to before this change * 7576 - fixed issues with whitespace preservation also: * cleaned up " to ' for astro project preferred linting * made the html rendering test fixture use a dynamic path * 7576 - detailed nested HTML test coverage * 7576 - component + HTML interleaved tests * 7576 - fix lint problems from previous changes * 7576 - some commentary * 7576 - file naming, refactor html under imports, package.json exports definition for html * 7576 * move out of extensions dir, remove export * cdata handling changes * 7576 * inline license from third party code * cleanup test class copy of HTML output * remove // third party indicators for imports (clarification: not third party code, just a indicator this group of imports is third party) * 7576 - fixed test before/after for DRY'ness * 7576 - no need to React-ify HTML attribute case * 7576 - rename "enableHTML" option to "allowHTML" * Added Markdoc allowHTML feature changeset * 7576 - updated README with allowHTML info * 7576 - fixed changeset typo * 7576 - minor edits based on PR feedback for docs * 7576 - minor edits based on PR feedback for docs
2023-07-24 23:34:06 +00:00
import type { Tokenizer } from '@markdoc/markdoc';
import Markdoc from '@markdoc/markdoc';
import type { MarkdocIntegrationOptions } from './options.js';
type TokenizerOptions = ConstructorParameters<typeof Tokenizer>[0];
export function getMarkdocTokenizer(options: MarkdocIntegrationOptions | undefined): Tokenizer {
	const key = cacheKey(options);

	if (!_cachedMarkdocTokenizers[key]) {
		const tokenizerOptions: TokenizerOptions = {
			// Strip <!-- comments --> from rendered output
			// Without this, they're rendered as strings!
			allowComments: true,
		};

		if (options?.allowHTML) {
			// we want to allow indentation for Markdoc tags that are interleaved inside HTML block elements
			tokenizerOptions.allowIndentation = true;
			// enable HTML token detection in markdown-it
			tokenizerOptions.html = true;
		}

		_cachedMarkdocTokenizers[key] = new Markdoc.Tokenizer(tokenizerOptions);
	}

	return _cachedMarkdocTokenizers[key];
}
// Tokenizers are created on demand because they depend on the runtime MarkdocIntegrationOptions,
// which may change during the life of the module in certain scenarios (unit tests, etc.)
let _cachedMarkdocTokenizers: Record<string, Tokenizer> = {};
function cacheKey(options: MarkdocIntegrationOptions | undefined): string {
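	// JSON.stringify(undefined) returns undefined at runtime, so fall back to an
	// empty object to keep the declared `string` return type accurate.
	return JSON.stringify(options ?? {});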
}
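
The caching pattern used in this file can be illustrated in isolation. The following is a minimal sketch, not the integration's actual code: `SketchOptions`, `SketchConfig`, `SketchTokenizer`, `sketchCacheKey`, and `getSketchTokenizer` are hypothetical stand-ins for the real `@markdoc/markdoc` and integration types, and the sketch uses a `Map` rather than a plain record. It assumes that an undefined options value should share a cache entry with an empty options object.

```typescript
// Hypothetical stand-ins for MarkdocIntegrationOptions and the Markdoc tokenizer.
interface SketchOptions {
	allowHTML?: boolean;
}

interface SketchConfig {
	allowComments: boolean;
	allowIndentation?: boolean;
	html?: boolean;
}

class SketchTokenizer {
	config: SketchConfig;

	constructor(config: SketchConfig) {
		this.config = config;
	}
}

// One tokenizer per distinct serialized options value.
const sketchCache = new Map<string, SketchTokenizer>();

function sketchCacheKey(options: SketchOptions | undefined): string {
	// Assumption: undefined and {} should map to the same cache entry.
	return JSON.stringify(options ?? {});
}

function getSketchTokenizer(options?: SketchOptions): SketchTokenizer {
	const key = sketchCacheKey(options);
	let tokenizer = sketchCache.get(key);

	if (!tokenizer) {
		const config: SketchConfig = { allowComments: true };
		if (options?.allowHTML) {
			// Mirror the two flags that are switched on together with allowHTML.
			config.allowIndentation = true;
			config.html = true;
		}
		tokenizer = new SketchTokenizer(config);
		sketchCache.set(key, tokenizer);
	}

	return tokenizer;
}

// Usage: equal options reuse the cached instance; different options do not.
const htmlTokenizer = getSketchTokenizer({ allowHTML: true });
const sameHtmlTokenizer = getSketchTokenizer({ allowHTML: true });
const defaultTokenizer = getSketchTokenizer();
console.log(htmlTokenizer === sameHtmlTokenizer, htmlTokenizer === defaultTokenizer);
```

Because the key is a `JSON.stringify` result, it is sensitive to property order: two option objects with the same properties in a different order would produce different keys and therefore separate cache entries. With a single boolean option this does not matter, but it is worth keeping in mind if the options type grows.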