Panic when stateful lexer's non-Root rule has optional group but captures nothing #324

@7sDream

The test code below shows a common design: suppose we have a keyword (x in this example) that cannot be parsed as an Ident anywhere in the program. We want a special symbol % to lift this limit: when prefixed with %, a keyword may be used as an identifier.

When testing it, aa + bb is parsed successfully, and x + y fails because x is a keyword, not an identifier, as expected.

Next it parses %x + y, and panics.

If we change the pattern of Ident to [[:alpha:]][[:alnum:]]* (just remove the inner group, or add ?: to make it non-capturing), it works fine.

Test code
package main

import (
    "fmt"

    "github.com/alecthomas/participle/v2"
    "github.com/alecthomas/participle/v2/lexer"
)

type Plus struct {
    Lhs string `parser:"@Ident"`
    Op  string `parser:"@'+'"`
    Rhs string `parser:"@Ident"`
}

func main() {
    parser := participle.MustBuild[Plus](
        participle.Lexer(lexer.MustStateful(lexer.Rules{
            "Root": {
                {"whitespace", ` +`, nil},
                {"Op", `\+`, nil},
                {"Keyword", `x`, nil},
                {"Ident", `[[:alpha:]]([[:alnum:]])*`, nil},
                {"percent", `%`, lexer.Push("Percent")},
            },
            "Percent": {
                {"Ident", `[[:alpha:]]([[:alnum:]])*`, lexer.Pop()},
            },
        })),
    )

    ast, err := parser.ParseString("input", "aa + bb")
    fmt.Printf("ast: %#v, err: %#v\n\n", ast, err)

    ast, err = parser.ParseString("input", "x + y")
    fmt.Printf("ast: %#v, err: %#v\n\n", ast, err)

    ast, err = parser.ParseString("input", "%x + y")
    fmt.Printf("ast: %#v, err: %#v\n\n", ast, err)
}

This may be caused by the code that collects the capture groups: calling FindStringSubmatchIndex with the Ident pattern on the data x + y returns [0, 1, -1, -1]. The last two -1 values mean the inner group ([[:alnum:]]) never captured anything.

The simplest way to fix this may be to add a guard:

if match[i] >= 0 {
	groups = append(groups, l.data[match[i]:match[i+1]])
}

But I'm not sure how these groups integrate with the Action interface; maybe it needs some type to indicate an empty capture? So I am opening this issue rather than a PR.
