Description of the Issue
Automatic UTF-8 encoding detection is broken for French text in Notepad++ 7.6.x.
Steps to Reproduce the Issue
- Create a new UTF-8 document in Notepad++.
- Paste or type the word "Mosaïque" into it.
- Save it as test.txt.
- Close test.txt in Notepad++ and reopen it
(or close Notepad++ and reopen it, if sessions are enabled).
The same happens with this file (direct download link to paquet.xml):
https://zone.spip.net/trac/spip-zone/export/HEAD/spip-zone/_plugins_/mosaique/trunk/paquet.xml
Further information and test files, submitted by Franckybleu, are available at: https://notepad-plus-plus.org/community/topic/16873/encodage/3
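To rule out any influence from how the editor saves the file, the test files can also be generated outside Notepad++. The following is a minimal sketch, assuming Python 3 is available; every file name other than test.txt is hypothetical and chosen only for illustration. It writes each sample as UTF-8 without a BOM.

```python
from pathlib import Path

# Sample texts taken from this report. Only "test.txt" is a name used in the
# report; the other file names are hypothetical.
SAMPLES = {
    "test.txt": "Mosaïque",
    "test_reservation.txt": "Réservation",
    "test_sentence.txt": "Cette mosaïque était jolie",
}


def sample_bytes(text: str) -> bytes:
    """Encode text as UTF-8 without a byte order mark."""
    return text.encode("utf-8")


def write_samples(directory: Path) -> list:
    """Write every sample into directory and return the created paths."""
    written = []
    for name, text in SAMPLES.items():
        path = directory / name
        # write_bytes avoids newline translation and never adds a BOM
        path.write_bytes(sample_bytes(text))
        written.append(path)
    return written


if __name__ == "__main__":
    for created in write_samples(Path.cwd()):
        print("wrote", created)
```

Opening the generated files in Notepad++ and checking the encoding shown in the status bar should then show whether each one is detected as UTF-8.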
Expected Behavior
test.txt should be detected as UTF-8.
Actual Behavior
test.txt is detected as Vietnamese (Windows-1258).
Debug Information
Notepad++ v7.6.2 (32-bit)
Build time : Jan 1 2019 - 00:00:08
Path : C:\Program Files (x86)\Notepad++\notepad++.exe
Admin mode : OFF
Local Conf mode : OFF
OS : Windows 7 (64-bit)
Plugins : DSpellCheck.dll mimeTools.dll NppConverter.dll NppExport.dll
Edit:
If you write the word "Réservation" to a new UTF-8 file, save it, and reopen it, it is also detected as Vietnamese.
@guy038 has discovered that the sentence "Cette mosaïque était jolie" in a new UTF-8 file is correctly detected as UTF-8,
but further tests show that a UTF-8 file containing only one of the words "mosaïque" or "était" is detected as Vietnamese. In this example, detection succeeds only when both an ï and an é are present.
The same Vietnamese misdetection happens with Spanish text, unless an ñ is present.
German characters seem to work fine.
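To illustrate why these files are ambiguous for a detector, here is a small Python sketch. It does not reproduce Notepad++'s detection logic, which is not described in this report. It only shows that each sample is strictly valid UTF-8 and what the same bytes look like when read as Windows-1258 (Python's `cp1258` codec). The helper names are hypothetical.

```python
# Illustrative only: this is NOT how Notepad++ detects encodings.
SAMPLES = ["Mosaïque", "Réservation", "était", "Cette mosaïque était jolie"]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


def as_windows_1258(data: bytes) -> str:
    """Decode data as Windows-1258, the encoding Notepad++ reports."""
    # errors="replace" guards against byte values that cp1258 leaves undefined
    return data.decode("cp1258", errors="replace")


if __name__ == "__main__":
    for text in SAMPLES:
        raw = text.encode("utf-8")
        # ascii() keeps the output printable on consoles with a limited code page
        print(raw.hex(), is_valid_utf8(raw), ascii(as_windows_1258(raw)))
```

Each accented letter occupies two bytes in UTF-8, so the Windows-1258 reading is one character longer per accented letter. Because every sample passes the strict UTF-8 check, a validity test alone would classify all of them as UTF-8.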