Skip to content

Unnecessary Isolated States and Epsilon Transitions in replace_reluctant_regex #568

@koniksedy

Description

@koniksedy

The function mata::applications::strings::replace::replace_reluctant_regex often constructs nft_reluctant_replace with

  1. unnecessary isolated states, and
  2. multiple consecutive epsilon transitions between two zero-level states.

This does not affect the correctness of the algorithm. However, the redundant epsilon transitions can significantly increase the number of transitions in the composition.

For example, when running the following code:

mata::EnumAlphabet alphabet{ 'a', 'f' };
Nft nft{ mata::applications::strings::replace::replace_reluctant_regex("a", { 'f' }, &alphabet) };

the nft_reluctant_replace inside the mata::applications::strings::replace::replace_reluctant_regex function looks like:

Image

Here, you can see that states 2 and 5 are isolated. Additionally, there is an unnecessary sequence of epsilon transitions: 12--->11--->13--->14--->6

Ideally, states 2 and 5 should not be present, and the epsilon chain should be replaced with a direct transition 10 ---f---> 6.

Metadata

Metadata

Assignees

No one assigned

    Labels

    For:libraryThe issue is related to library (c++ implementation)Module:nftThe issue is related to Nondeterministic Finite TransducersType:optimizationThis issue is related to a possible optimization of an algorithm, improving the performance of Mata.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions