2 changes: 1 addition & 1 deletion beancount/core/account.py
@@ -35,7 +35,7 @@
# L: All letters.
# Nd: Decimal numbers.
ACC_COMP_TYPE_RE = r"[\p{Lu}][\p{L}\p{Nd}\-]*"
-ACC_COMP_NAME_RE = r"[\p{Lu}\p{Nd}][\p{L}\p{Nd}\-]*"
+ACC_COMP_NAME_RE = r"[\p{Lu}\p{Nd}]\p{Han}][\p{L}\p{Nd}\p{Han}\-]*"

From my limited understanding of Unicode properties and of scripts that do not use the Latin alphabet, I think adding the Han character class to the regular expression is correct. However, there is a syntax error in the regular expression: the first character class has two closing brackets ] but only one opening bracket [. I infer that this change has not been tested. Also, this changes only the regular expression for the account name components after the first (ACC_COMP_NAME_RE); it does not change the one for the leading component (ACC_COMP_TYPE_RE). I think both need to be changed in a similar way.
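
To make the intent of the change concrete, here is a minimal sketch of the rule that ACC_COMP_NAME_RE would express once the stray bracket is removed. It is not the project's code. Python's standard `re` module does not support `\p{...}` classes, so the sketch emulates them with `unicodedata`. The helper names (`is_han`, `matches_name_component`) are hypothetical, and `is_han` approximates the Han script with the CJK Unified Ideographs block only, which is an assumption.

```python
import unicodedata


def is_han(ch: str) -> bool:
    """Rough stand-in for \\p{Han} (assumption: CJK Unified Ideographs block only)."""
    return "\u4e00" <= ch <= "\u9fff"


def is_valid_first_char(ch: str, allow_han: bool) -> bool:
    """Emulates [\\p{Lu}\\p{Nd}], optionally widened with \\p{Han}."""
    category = unicodedata.category(ch)
    return category in ("Lu", "Nd") or (allow_han and is_han(ch))


def is_valid_rest_char(ch: str) -> bool:
    """Emulates [\\p{L}\\p{Nd}\\-]: any letter, a decimal digit, or a hyphen."""
    category = unicodedata.category(ch)
    return category.startswith("L") or category == "Nd" or ch == "-"


def matches_name_component(text: str, allow_han: bool = False) -> bool:
    """Checks one account name component against the emulated pattern."""
    if not text:
        return False
    return is_valid_first_char(text[0], allow_han) and all(
        is_valid_rest_char(ch) for ch in text[1:]
    )


# Han ideographs are letters without case (category Lo), so they are not
# in Lu and cannot start a component under the original pattern.
print(unicodedata.category("银"))
print(matches_name_component("银行"))
print(matches_name_component("银行", allow_han=True))
```

Because Han ideographs already fall under the general letter category, the sketch only widens the class for the first character. Under that reading, the trailing class in the original pattern already accepts them.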


# Regular expression string that matches a valid account.
ACCOUNT_RE = r"(?:{})(?:{}{})+".format(ACC_COMP_TYPE_RE, sep, ACC_COMP_NAME_RE)