UDL Operator can't recognize unicode characters #12161

@byzod

Description of the Issue

The UDL operator list accepts Unicode characters as valid input, but they are not actually highlighted.

Steps to Reproduce the Issue

  1. Create a new language template, add = * × ; ； to Operators 1, and give it a distinct color so the result is easy to see
  2. Open a file with the following content:
    123*456=789
    123×456=789
    123;456=789
    123；456=789
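To make step 2 easy to repeat, here is a minimal sketch that writes the test file as UTF-8. The file name `udl_operator_test.txt` and the helper names are made up for illustration; the last operator is assumed to be the fullwidth semicolon (0xFF1B) mentioned under "More info".

```python
from pathlib import Path

# Operators under test; "\u00d7" is the multiplication sign and
# "\uff1b" is the fullwidth semicolon (assumed from "More info").
OPERATORS = ["*", "\u00d7", ";", "\uff1b"]


def build_test_lines(operators=OPERATORS):
    """Return one '123<op>456=789' line per operator."""
    return [f"123{op}456=789" for op in operators]


def write_test_file(path="udl_operator_test.txt"):
    """Write the lines as UTF-8 so the non-ASCII operators survive."""
    text = "\n".join(build_test_lines()) + "\n"
    Path(path).write_text(text, encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical output file name; open it in Notepad++ with the UDL applied.
    print(write_test_file())
```

Open the generated file in Notepad++ and select the new language from the Language menu to compare against the expected behavior.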

Expected Behavior

All characters except the digits are highlighted with the style set for Operators 1

Actual Behavior

Only * = ; are highlighted; × and ； are not

Debug Information

Notepad++ v8.4.5 (64-bit)
Build time : Sep 3 2022 - 04:05:32
Path : C:\App\something\Notepad++\Notepad++.exe
Command Line : "K:\Down\1.tst"
Admin mode : OFF
Local Conf mode : ON
Cloud Config : OFF
OS Name : Windows 10 Home China (64-bit)
OS Version : 21H2
OS Build : 19044.1889
Current ANSI codepage : 936
Plugins :
BetterMultiSelection (1.5)
ComparePlus (1)
CSharpRegexTools4Npp (1.1.2)
EmmetNPP (1.0.2)
HexEditor (0.9.12)
JSMinNPP (1.2006)
mimeTools (2.8)
nppAutoDetectIndent (2.3)
NppConverter (4.4)
NppExport (0.4)
NppTextViz (0.4.2)
PythonScript (2)

More info

× (0x00D7) is a common sign in math, while ； (0xFF1B) is simply the Chinese (fullwidth) form of ;. UDL.xml records both characters correctly, but they still don't work as operators. The .xml file reads:

<Keywords name="Operators1">= * &#x00D7; ; &#xFF1B;</Keywords>
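As a quick check that the stored entities really decode to the intended characters, the following sketch parses the `<Keywords>` line with Python's standard XML parser and lists each operator's code point and UTF-8 length. The function name `describe_operators` is made up for illustration. It only inspects the data; it does not show how Notepad++ itself reads the file.

```python
import xml.etree.ElementTree as ET

# The Operators1 entry exactly as quoted from the UDL .xml file.
KEYWORDS_XML = '<Keywords name="Operators1">= * &#x00D7; ; &#xFF1B;</Keywords>'


def describe_operators(xml_text=KEYWORDS_XML):
    """Return (operator, code point, UTF-8 byte count) for each entry."""
    element = ET.fromstring(xml_text)
    rows = []
    for op in element.text.split():
        # Every entry here is a single character, so ord() is safe.
        rows.append((op, f"U+{ord(op):04X}", len(op.encode("utf-8"))))
    return rows


if __name__ == "__main__":
    for op, code_point, utf8_len in describe_operators():
        print(op, code_point, utf8_len)
```

The three operators that do get highlighted (= * ;) are the ones that take a single byte in UTF-8, while × takes two bytes and ； takes three. Whether that difference is what causes the problem is only a guess; the issue itself does not confirm it.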

Labels: udl (Everything related to User Defined Language)
