Skip to content

Conversation

@ematipico
Copy link
Member

Summary

Fixes an issue where the presence of keyword inside HTML text would trip the parser.

Test Plan

Added tests

Docs

@changeset-bot
Copy link

changeset-bot bot commented Dec 24, 2025

🦋 Changeset detected

Latest commit: 3a8d0b2

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 13 packages
Name Type
@biomejs/biome Patch
@biomejs/cli-win32-x64 Patch
@biomejs/cli-win32-arm64 Patch
@biomejs/cli-darwin-x64 Patch
@biomejs/cli-darwin-arm64 Patch
@biomejs/cli-linux-x64 Patch
@biomejs/cli-linux-arm64 Patch
@biomejs/cli-linux-x64-musl Patch
@biomejs/cli-linux-arm64-musl Patch
@biomejs/wasm-web Patch
@biomejs/wasm-bundler Patch
@biomejs/wasm-nodejs Patch
@biomejs/backend-jsonrpc Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@github-actions github-actions bot added A-Parser Area: parser L-HTML Language: HTML and super languages labels Dec 24, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 24, 2025

Walkthrough

This changeset fixes an HTML parser bug where keywords appearing inside HTML tag text content incorrectly triggered parse errors. The fix involves replacing a direct Vue feature check with a proper feature-flag mechanism, adding keyword detection logic with helper functions (is_at_keyword and is_at_html_keyword), and implementing rewind-and-reparse logic when keywords are detected in text contexts. A new test file validates the fix with keyword-wrapped content in spans.

Possibly related PRs

Suggested labels

A-Parser, L-HTML

Suggested reviewers

  • dyc3

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix(html): parse keywords' directly addresses the main change: fixing HTML parser keyword handling.
Description check ✅ Passed The description accurately summarises the fix and mentions tests were added, relating to the changeset content.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/html-parser

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 25cdd50 and 3a8d0b2.

⛔ Files ignored due to path filters (1)
  • crates/biome_html_parser/tests/html_specs/ok/text_keywords.html.snap is excluded by !**/*.snap and included by **
📒 Files selected for processing (4)
  • .changeset/itchy-zoos-swim.md
  • crates/biome_html_parser/src/syntax/mod.rs
  • crates/biome_html_parser/tests/html_specs/ok/text_keywords.html
  • crates/biome_html_parser/tests/quick_test.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (CONTRIBUTING.md)

**/*.rs: Use inline rustdoc documentation for rules, assists, and their options
Use the dbg!() macro for debugging output in Rust tests and code
Use doc tests (doctest) format with code blocks in rustdoc comments; ensure assertions pass in tests

Files:

  • crates/biome_html_parser/src/syntax/mod.rs
  • crates/biome_html_parser/tests/quick_test.rs
🧠 Learnings (27)
📚 Learning: 2025-12-21T21:15:03.782Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: CONTRIBUTING.md:0-0
Timestamp: 2025-12-21T21:15:03.782Z
Learning: Changesets should describe user-facing changes only; internal refactoring without behavior changes does not require a changeset

Applied to files:

  • .changeset/itchy-zoos-swim.md
📚 Learning: 2025-08-05T14:43:29.581Z
Learnt from: dyc3
Repo: biomejs/biome PR: 7081
File: packages/@biomejs/biome/configuration_schema.json:7765-7781
Timestamp: 2025-08-05T14:43:29.581Z
Learning: The file `packages/biomejs/biome/configuration_schema.json` is auto-generated and should not be manually edited or reviewed for schema issues; any changes should be made at the code generation source.

Applied to files:

  • .changeset/itchy-zoos-swim.md
📚 Learning: 2025-12-21T21:15:03.782Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: CONTRIBUTING.md:0-0
Timestamp: 2025-12-21T21:15:03.782Z
Learning: Use past tense when describing actions in changesets (e.g., 'Added', 'Fixed'), and present tense for Biome behavior (e.g., 'Biome now supports')

Applied to files:

  • .changeset/itchy-zoos-swim.md
📚 Learning: 2025-12-19T11:30:58.874Z
Learnt from: ematipico
Repo: biomejs/biome PR: 8509
File: crates/biome_html_parser/src/syntax/svelte.rs:465-467
Timestamp: 2025-12-19T11:30:58.874Z
Learning: In the Biome HTML parser, `parse_single_text_expression_content` does not advance the parser if the current token is empty. This is intentional to support cases where an expression is optional. When calling this function, callers must check if `p.cur_text().is_empty()` after the call and manually advance the parser with `p.bump_remap(HTML_LITERAL)` if needed.

Applied to files:

  • .changeset/itchy-zoos-swim.md
  • crates/biome_html_parser/src/syntax/mod.rs
📚 Learning: 2025-12-04T13:29:49.287Z
Learnt from: dyc3
Repo: biomejs/biome PR: 8291
File: crates/biome_html_formatter/tests/specs/prettier/vue/html-vue/elastic-header.html:10-10
Timestamp: 2025-12-04T13:29:49.287Z
Learning: Files under `crates/biome_html_formatter/tests/specs/prettier` are test fixtures synced from Prettier and should not receive detailed code quality reviews (e.g., HTTP vs HTTPS, formatting suggestions, etc.). These files are test data meant to validate formatter behavior and should be preserved as-is.

Applied to files:

  • crates/biome_html_parser/tests/html_specs/ok/text_keywords.html
  • crates/biome_html_parser/tests/quick_test.rs
📚 Learning: 2025-11-24T18:05:42.356Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_js_type_info/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:05:42.356Z
Learning: Applies to crates/biome_js_type_info/**/*.rs : Distinguish between `TypeData::Unknown` and `TypeData::UnknownKeyword` to measure inference effectiveness versus explicit user-provided unknown types

Applied to files:

  • crates/biome_html_parser/tests/html_specs/ok/text_keywords.html
  • crates/biome_html_parser/src/syntax/mod.rs
📚 Learning: 2025-11-24T18:06:03.545Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:06:03.545Z
Learning: Applies to crates/biome_parser/**/src/**/*.rs : Use `ConditionalParsedSyntax` for syntax that is only valid in specific contexts (e.g., strict mode, file types, language versions) and call `or_invalid_to_bogus()` to convert to a bogus node if not supported

Applied to files:

  • crates/biome_html_parser/src/syntax/mod.rs
📚 Learning: 2025-11-24T18:05:27.810Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_js_formatter/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:05:27.810Z
Learning: Applies to crates/biome_js_formatter/**/*.rs : When formatting AST nodes, use mandatory tokens from the AST instead of hardcoding token strings (e.g., use `node.l_paren_token().format()` instead of `token("(")`)

Applied to files:

  • crates/biome_html_parser/src/syntax/mod.rs
📚 Learning: 2025-11-24T18:06:03.545Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:06:03.545Z
Learning: Applies to crates/biome_parser/**/src/**/*.rs : Implement a token source struct that wraps the lexer and implements `TokenSourceWithBufferedLexer` and `LexerWithCheckpoint` for lookahead and re-lexing capabilities

Applied to files:

  • crates/biome_html_parser/src/syntax/mod.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Set `language` to `jsx`, `ts`, or `tsx` for rules that only apply to specific JavaScript dialects

Applied to files:

  • crates/biome_html_parser/src/syntax/mod.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/lint/**/*.rs : Lint rules should check syntax according to language specification and emit error diagnostics

Applied to files:

  • crates/biome_html_parser/src/syntax/mod.rs
📚 Learning: 2025-11-24T18:06:03.545Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:06:03.545Z
Learning: Applies to crates/biome_parser/**/src/**/*.rs : Parse rules must take a mutable reference to the parser as their only parameter and return a `ParsedSyntax`

Applied to files:

  • crates/biome_html_parser/src/syntax/mod.rs
📚 Learning: 2025-11-24T18:06:03.545Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:06:03.545Z
Learning: Applies to crates/biome_parser/**/src/**/*.rs : Use `p.eat(token)` for optional tokens, `p.expect(token)` for required tokens, `parse_rule(p).ok(p)` for optional nodes, and `parse_rule(p).or_add_diagnostic(p, error)` for required nodes

Applied to files:

  • crates/biome_html_parser/src/syntax/mod.rs
📚 Learning: 2025-11-24T18:05:27.810Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_js_formatter/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:05:27.810Z
Learning: Applies to crates/biome_js_formatter/**/*.rs : For tokens that are not mandatory, use helper functions instead of hardcoding

Applied to files:

  • crates/biome_html_parser/src/syntax/mod.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Set `version` field to `next` in `declare_lint_rule!` macro

Applied to files:

  • crates/biome_html_parser/src/syntax/mod.rs
📚 Learning: 2025-11-09T12:47:46.298Z
Learnt from: ematipico
Repo: biomejs/biome PR: 8031
File: crates/biome_html_parser/src/syntax/svelte.rs:140-147
Timestamp: 2025-11-09T12:47:46.298Z
Learning: In the Biome HTML parser, `expect` and `expect_with_context` consume the current token and then lex the next token. The context parameter in `expect_with_context` controls how the next token (after the consumed one) is lexed, not the current token being consumed. For example, in Svelte parsing, after `bump_with_context(T!["{:"], HtmlLexContext::Svelte)`, the next token is already lexed in the Svelte context, so `expect(T![else])` is sufficient unless the token after `else` also needs to be lexed in a specific context.

Applied to files:

  • crates/biome_html_parser/src/syntax/mod.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Prefix line with `#` in documentation code examples sparingly; prefer concise complete snippets

Applied to files:

  • crates/biome_html_parser/tests/quick_test.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Use `full_options` code block property for complete biome.json configuration snippets in documentation

Applied to files:

  • crates/biome_html_parser/tests/quick_test.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Lines prefixed with `#` in rule documentation code examples will be hidden from output

Applied to files:

  • crates/biome_html_parser/tests/quick_test.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Use `options` code block property for rule-specific configuration snippets in documentation

Applied to files:

  • crates/biome_html_parser/tests/quick_test.rs
📚 Learning: 2025-11-24T18:05:20.371Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_formatter/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:05:20.371Z
Learning: Applies to crates/biome_formatter/**/biome_*_formatter/tests/language.rs : Implement `TestFormatLanguage` trait in `tests/language.rs` for the formatter's test language

Applied to files:

  • crates/biome_html_parser/tests/quick_test.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Use `ignore` code block property to exclude documentation code examples from automatic validation

Applied to files:

  • crates/biome_html_parser/tests/quick_test.rs
📚 Learning: 2025-11-24T18:05:27.810Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_js_formatter/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:05:27.810Z
Learning: Applies to crates/biome_js_formatter/**/*.rs : Do not attempt to 'fix' the code; if a token/node is known to be mandatory but is missing, return `None` instead

Applied to files:

  • crates/biome_html_parser/tests/quick_test.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Each rule option must have its own h3 header with description, default value, options block, and code example

Applied to files:

  • crates/biome_html_parser/tests/quick_test.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Rule documentation code blocks must have a language defined for syntax highlighting

Applied to files:

  • crates/biome_html_parser/tests/quick_test.rs
📚 Learning: 2025-11-24T18:05:20.371Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_formatter/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:05:20.371Z
Learning: Define `FormatHtmlSyntaxNode` struct in a `cst.rs` file implementing `FormatRule<HtmlSyntaxNode>`, `AsFormat<HtmlFormatContext>`, and `IntoFormat<HtmlFormatContext>` traits using the provided boilerplate code

Applied to files:

  • crates/biome_html_parser/tests/quick_test.rs
📚 Learning: 2025-12-21T21:15:03.782Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: CONTRIBUTING.md:0-0
Timestamp: 2025-12-21T21:15:03.782Z
Learning: Applies to **/*.rs : Use doc tests (doctest) format with code blocks in rustdoc comments; ensure assertions pass in tests

Applied to files:

  • crates/biome_html_parser/tests/quick_test.rs
🧬 Code graph analysis (1)
crates/biome_html_parser/src/syntax/mod.rs (2)
crates/biome_html_parser/src/syntax/svelte.rs (1)
  • is_at_svelte_keyword (980-997)
crates/biome_parser/src/lib.rs (1)
  • cur (183-185)
🪛 LanguageTool
.changeset/itchy-zoos-swim.md

[style] ~4-~4: Consider using a different verb for a more formal wording.
Context: --- "@biomejs/biome": patch --- Fixed an issue where the Biome HTML parser wo...

(FIX_RESOLVE)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Lint project (depot-ubuntu-24.04-arm-16)
  • GitHub Check: Lint project (depot-windows-2022)
  • GitHub Check: Test (depot-ubuntu-24.04-arm-16)
  • GitHub Check: Test (depot-windows-2022-16)
  • GitHub Check: Documentation
  • GitHub Check: Check Dependencies
  • GitHub Check: End-to-end tests
  • GitHub Check: autofix
  • GitHub Check: Test Node.js API
🔇 Additional comments (6)
crates/biome_html_parser/src/syntax/mod.rs (3)

134-134: Good refactor to use feature flag mechanism.

Using HtmlSyntaxFeatures::Vue.is_supported(p) is more consistent with the pattern used elsewhere in the codebase and improves maintainability.


184-190: Core fix looks solid.

Detecting keywords after the opening element and re-lexing as HTML text follows the established pattern used for Svelte keywords (see lines 273-277). This should correctly handle keywords appearing in tag text content.


694-700: Helper functions are clean and well-scoped.

Centralising keyword detection in is_at_keyword and is_at_html_keyword improves code clarity and maintainability.

.changeset/itchy-zoos-swim.md (1)

1-5: Changeset looks good.

The description clearly communicates the user-facing fix. The static analysis suggestion to use "Resolved" instead of "Fixed" is pedantic—"Fixed" is perfectly standard for changelogs.

crates/biome_html_parser/tests/quick_test.rs (1)

7-12: Test update aligns with the fix.

The updated test input exercises the {@html content} expression handling, which is appropriate for validating the keyword parsing improvements.

crates/biome_html_parser/tests/html_specs/ok/text_keywords.html (1)

1-16: Comprehensive test coverage.

This test file thoroughly exercises the keyword handling fix, covering both HTML keywords (html, doctype) and Svelte keywords (const, render, if, else, await, etc.) within span elements. Nice work.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: revert quick test changes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's fine, it's meant to be changed, it's ignored after all

@ematipico ematipico merged commit 1022c76 into main Dec 24, 2025
13 checks passed
@ematipico ematipico deleted the fix/html-parser branch December 24, 2025 16:47
@github-actions github-actions bot mentioned this pull request Dec 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Parser Area: parser L-HTML Language: HTML and super languages

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants