🐛 useValidLang rule incorrectly rejects valid BCP 47 language tags that include script subtags, such as zh-Hans-CN  #8117

@0xDing

Description

Environment information

CLI:
  Version:                      2.3.5
  Color support:                true

Platform:
  CPU Architecture:             aarch64
  OS:                           macos

Environment:
  BIOME_LOG_PATH:               unset
  BIOME_LOG_PREFIX_NAME:        unset
  BIOME_CONFIG_PATH:            unset
  BIOME_THREADS:                unset
  NO_COLOR:                     unset
  TERM:                         xterm-256color
  JS_RUNTIME_VERSION:           v24.3.0
  JS_RUNTIME_NAME:              node
  NODE_PACKAGE_MANAGER:         bun/1.3.2

Biome Configuration:
  Status:                       Loaded successfully
  Path:                         biome.json
  Formatter enabled:            true
  Linter enabled:               true
  Assist enabled:               true
  VCS enabled:                  true

Workspace:
  Open Documents:               0

What happened?

The useValidLang rule incorrectly rejects valid BCP 47 language tags that include script subtags, such as zh-Hans-CN (Simplified Chinese as used in China).

Expected result

According to BCP 47 and the IANA Language Subtag Registry, these tags should be valid:

<html lang="zh-Hans-CN" />  // ✅ Should be valid (Simplified Chinese, China)
<html lang="zh-Hant-TW" />  // ✅ Should be valid (Traditional Chinese, Taiwan)
<html lang="sr-Cyrl-RS" />  // ✅ Should be valid (Serbian, Cyrillic script, Serbia)

Valid BCP 47 Language Tag Format

BCP 47 language tags follow this structure:

language [-script] [-region] [-variant] [-extension] [-privateuse]
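
To make the structure above concrete, here is a minimal sketch that labels each subtag by its shape alone (length and character class), following the sizes BCP 47 gives for script, region, and variant subtags. `SubtagShape`, `classify_subtag`, and `subtag_shapes` are hypothetical names used for illustration, not part of Biome. The sketch only accepts 2 or 3 letter primary language subtags and does not consult the IANA registry.

```rust
/// The kind of a BCP 47 subtag, judged by its shape only (hypothetical type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum SubtagShape {
    Language,
    Script,
    Region,
    Variant,
    Other,
}

/// Classifies one subtag by length and character class.
/// `is_first` marks the primary language position.
pub fn classify_subtag(subtag: &str, is_first: bool) -> SubtagShape {
    let len = subtag.len();
    let non_empty = !subtag.is_empty();
    let all_alpha = non_empty && subtag.bytes().all(|b| b.is_ascii_alphabetic());
    let all_digit = non_empty && subtag.bytes().all(|b| b.is_ascii_digit());
    let all_alnum = non_empty && subtag.bytes().all(|b| b.is_ascii_alphanumeric());

    if is_first {
        // Assumption: only 2 or 3 letter primary language subtags are handled here.
        return if all_alpha && (2..=3).contains(&len) {
            SubtagShape::Language
        } else {
            SubtagShape::Other
        };
    }

    match len {
        // Script: exactly four letters, e.g. "Hans".
        4 if all_alpha => SubtagShape::Script,
        // Region: two letters ("CN") or three digits ("419").
        2 if all_alpha => SubtagShape::Region,
        3 if all_digit => SubtagShape::Region,
        // Variant: a digit followed by three alphanumerics ("1996"),
        // or five to eight alphanumerics ("rozaj").
        4 if all_alnum && subtag.as_bytes()[0].is_ascii_digit() => SubtagShape::Variant,
        5..=8 if all_alnum => SubtagShape::Variant,
        _ => SubtagShape::Other,
    }
}

/// Splits a tag on '-' and classifies every subtag in order.
pub fn subtag_shapes(tag: &str) -> Vec<SubtagShape> {
    tag.split('-')
        .enumerate()
        .map(|(i, s)| classify_subtag(s, i == 0))
        .collect()
}
```

With this sketch, `subtag_shapes("zh-Hans-CN")` yields language, script, region in that order, which is the sequence the rule currently cannot accept.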

Root Cause

The current implementation in use_valid_lang.rs only supports the simple language-country format:

let mut split_value = attribute_text.split('-');
match (split_value.next(), split_value.next()) {
    (Some(language), Some(country)) => {
        // Validates as language-country only
        if !is_valid_language(language) { /* ... */ }
        else if !is_valid_country(country) { /* ... */ }
        else if split_value.next().is_some() {
            // Rejects any additional subtags
            return Some(UseValidLangState {
                invalid_kind: InvalidKind::Value,
            });
        }
    }
    // ...
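
One possible direction for a fix, sketched under assumptions: consume the language, then an optional four-letter script, then an optional region, and only then reject leftover subtags. The `Invalid` enum, `check_lang`, and `is_valid_script` are hypothetical, and the three lookup functions below are tiny stand-ins defined locally for this example rather than the rule's real tables. Variants, extensions, and private-use subtags are deliberately left out of scope, and subtags are compared case-sensitively for brevity even though BCP 47 tags are case-insensitive.

```rust
/// Why a `lang` value was rejected (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum Invalid {
    Language,
    Script,
    Region,
    Value,
}

// Stand-in lookup tables; a real implementation would use registry-derived data.
fn is_valid_language(s: &str) -> bool {
    matches!(s, "en" | "zh" | "sr" | "es")
}
fn is_valid_script(s: &str) -> bool {
    matches!(s, "Hans" | "Hant" | "Cyrl" | "Latn")
}
fn is_valid_country(s: &str) -> bool {
    matches!(s, "US" | "CN" | "TW" | "RS")
}

/// Validates `language[-script][-region]` and rejects anything after that.
pub fn check_lang(value: &str) -> Result<(), Invalid> {
    let mut parts = value.split('-').peekable();

    // The primary language subtag is mandatory.
    let language = parts.next().unwrap_or("");
    if !is_valid_language(language) {
        return Err(Invalid::Language);
    }

    // Optional script: recognised by shape (exactly four ASCII letters).
    if let Some(next) = parts.peek().copied() {
        if next.len() == 4 && next.bytes().all(|b| b.is_ascii_alphabetic()) {
            if !is_valid_script(next) {
                return Err(Invalid::Script);
            }
            parts.next(); // consume the script subtag
        }
    }

    // Optional region.
    if let Some(region) = parts.next() {
        if !is_valid_country(region) {
            return Err(Invalid::Region);
        }
    }

    // Remaining subtags are not handled by this sketch.
    if parts.next().is_some() {
        return Err(Invalid::Value);
    }
    Ok(())
}
```

Under these assumptions, `check_lang("zh-Hans-CN")` and `check_lang("en-US")` both return `Ok(())`, while an unknown script such as `zh-Xxxx-CN` is reported as `Invalid::Script` instead of a generic value error.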

Code of Conduct

  • I agree to follow Biome's Code of Conduct

Metadata

Assignees

No one assigned

    Labels

      • A-Linter (Area: linter)
      • S-Bug-confirmed (Status: report has been confirmed as a valid bug)
      • S-Help-wanted (Status: you're familiar with the code base and want to help the project)
      • good first issue (Good for newcomers)
