Skip to content

Add automatic attribute styling for python#49

Closed
Red-M wants to merge 14 commits intoScintillaOrg:masterfrom
Red-M:master
Closed

Add automatic attribute styling for python#49
Red-M wants to merge 14 commits intoScintillaOrg:masterfrom
Red-M:master

Conversation

@Red-M
Copy link
Copy Markdown
Contributor

@Red-M Red-M commented Dec 31, 2021

This allows for styling attributes of identifiers like the screenshot below.
image

Please let me know if I've missed anything.

@nyamatongwe
Copy link
Copy Markdown
Member

This is a disruptive change that may be incompatible with downstream projects and usage. For example, it stops the following SciTE settings from highlighting "findall" and "replace" methods as these are now treated as SCE_P_ATTRIBUTE instead of SCE_P_IDENTIFIER:

substyles.python.11=3
substylewords.11.3.$(file.patterns.py)=findall replace
style.python.11.3=fore:#EEAA80,italics,bold

The change may cause scripts that depend on the results of lexing to break.

It could be safer to guard attribute recognition with an enabling lexer.python... property.

In lexicalClasses, the 3rd column are a set of common tokens that can be understood by projects to more easily discover the meaning of lexemes. Since attributes are identifiers, this should probably be "identifier".

The tests in the test subdirectory fail after this change so should be updated. See test/README.

The SCE_P_ATTRIBUTE state doesn't persist over new lines and some users like splitting complex lines after '.' so will see some attributes highlighted and others not:

x = s.findall("thing")
y = a.very.complicated.expression.
    findall("<p><script>// Inserted by macrothing")

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 1, 2022

Thank you for taking the time to look at this change.

I think it will be easier to add an option to enable this style of lexing, I only want it for making SciTE able to more closely replicate a code highlighting theme I enjoy.

I'll fix lexicalClasses, the tests and split lines.

Fix `lexicalClasses` description for attributes.
@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 1, 2022

I've fixed everything else up except the tests because I'm not sure why but I'm getting errors that don't seem related to my changes when I run lexilla/script/RunTest.sh.

g++ -DDEBUG -I ../../scintilla/include -I ../include -I ../access --std=c++2a -g -Wpedantic -Wall -Wextra   -c TestLexers.cxx -o TestLexers.o
TestLexers.cxx: In function ‘void {anonymous}::PrintLevel(std::ostringstream&, int)’:
TestLexers.cxx:66:11: error: ‘setw’ is not a member of ‘std’
   << std::setw(3) << levelNow << " "
           ^~~~

There are more errors but this is the first.

@rdipardo
Copy link
Copy Markdown
Contributor

rdipardo commented Jan 1, 2022

TestLexers.cxx: In function ‘void {anonymous}::PrintLevel(std::ostringstream&, int)’:
TestLexers.cxx:66:11: error: ‘setw’ is not a member of ‘std’
   << std::setw(3) << levelNow << " "
           ^~~~

lexilla/test/README

Lines 9 to 10 in ebb1211

TestLexers works on Windows, Linux, or macOS and requires a C++20 compiler.
MSVC 2019.4, GCC 9.0, Clang 9.0, and Apple Clang 11.0 are known to work.

Does g++ --version say 9.x.x or higher?

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 1, 2022

I'll need to update my g++ apologies.

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 1, 2022

The tests are passing now but I'll need to add new tests for this setting.

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 1, 2022

How should I add tests for this?
I have a feeling adding lexer.python.identifier.attributes=2 to lexilla/test/examples/python/SciTE.properties is probably the best but I am unsure how we can handle testing all of the modes out.

Update styling tests for python to include new attribute code to style, pending advice on how this should be done.
@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 1, 2022

I've figured the tests out and this should be good to review again.

@nyamatongwe
Copy link
Copy Markdown
Member

lexer.python.identifier.attributes guards existing users against unexpected changes. Unsure whether all 4 values are worthwhile or whether they belong in a single property.

lexicalClasses changed the wrong column. The last column is some explanatory text that can be displayed in a settings dialog. I'd expect the line to be:

	20, "SCE_P_ATTRIBUTE", "identifier", "Attribute of identifier",

The persistent state over new lines appears to work well.

The added tests are exhaustive which initially appears to be a good thing as changes to other lexical classes will be noticed. However, this may not be great for maintenance. Will someone adding a new test to AllStyles.py in the main directory know/remember to add it to each subdirectory? How will this scale to more settings? It may be better to add a new short test ("attributes.py"?) that concentrates on the new feature.

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 4, 2022

lexer.python.identifier.attributes guards existing users against unexpected changes. Unsure whether all 4 values are worthwhile or whether they belong in a single property.

I think how aggressive the amount of attribute styling is important as for example, I would prefer to use 3 but in your example of using:

substyles.python.11=3
substylewords.11.3.$(file.patterns.py)=findall replace
style.python.11.3=fore:#EEAA80,italics,bold

using 1 or 2 would be better as it would respect already styled attributes.
4 is there just as a stop gap for anyone who would prefer to do whatever they'd like with any attributes.

lexicalClasses changed the wrong column. The last column is some explanatory text that can be displayed in a settings dialog. I'd expect the line to be:

I have fixed this in the latest commit.

The added tests are exhaustive which initially appears to be a good thing as changes to other lexical classes will be noticed. However, this may not be great for maintenance. Will someone adding a new test to AllStyles.py in the main directory know/remember to add it to each subdirectory? How will this scale to more settings? It may be better to add a new short test ("attributes.py"?) that concentrates on the new feature.

I have reduced the number of tests to just testing mode 2 and have renamed the test and made changes to reduce the maintenance work required.

@nyamatongwe
Copy link
Copy Markdown
Member

That appears to have improved things.

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 5, 2022

Thanks, please let me know if there is anything else I need to do for this to get merged.

@nyamatongwe
Copy link
Copy Markdown
Member

This has sat in my queue for a while, almost getting pushed but the 5 state setting looks to me like it will confuse users, particularly the difference between 2 and 3. Complex settings don't get used. The main requirement of the setting is to protect current user setups while allowing some progress. However, that progress doesn't have to reach the final state which may be better informed if a simpler version was introduced initially.

Identifier and decorator attributes appear distinct to me so would be better with distinct properties which would both default to 0.

  • lexer.python.identifier.attributes
    • 0: current styling for attributes of identifiers
    • 1: SCE_P_ATTRIBUTE
  • lexer.python.decorator.attributes
    • 0: current styling for attributes of decorators
    • 1: SCE_P_DECORATOR

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 13, 2022

Could I also suggest a 3rd option which controls if already styled attributes are respected?

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 14, 2022

Just noting that if there is a way to get the styler to give me the style at a text location for backwards searching, I could remove the entire loop for comment and decorator detection. It may make the new code easier to read and maintain.

@rdipardo
Copy link
Copy Markdown
Contributor

Just noting that if there is a way to get the styler to give me the style at a text location for backwards searching, I could remove the entire loop for comment and decorator detection.

LexAccessor::StyleAt may be what you're looking for:

char StyleAt(Sci_Position position) const {
return pAccess->StyleAt(position);
}

Sci_Position is a signed type, so you can safely pass in the difference of the current location and an offset amount

I don't see how the loop can be eliminated, unless you're expecting the significant character to be at exactly the start or end of the line. But it's always a good idea to check the lexical style and character value at the same time — e.g., @ could belong to a decorator or an email domain "...@aol.com"

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 14, 2022

I did try StyleAt but didn't seem to give me anything except for 0.

@zufuliu
Copy link
Copy Markdown
Contributor

zufuliu commented Jan 14, 2022

I did try StyleAt but didn't seem to give me anything except for 0.

See issue #54, you need to call Flush() first to flush cached styles in styleBuf.

@nyamatongwe
Copy link
Copy Markdown
Member

Could I also suggest a 3rd option which controls if already styled attributes are respected?

That may be a detour. My thinking was that if users consider attributes to be different then there should be distinct list(s) of highlighted attributes. Or maybe consider identifier.attribute pairs ('codecs.open', 'pathlib.Path'). I didn't say that as there are many possibilities here making it a large topic which should be discussed in a separate issue.

@nyamatongwe
Copy link
Copy Markdown
Member

Just noting that if there is a way to get the styler to give me the style at a text location for backwards searching, I could remove the entire loop for comment and decorator detection. It may make the new code easier to read and maintain.

Another possibility is to maintain a state enumeration (start, identifier, identifierDot, ...) and store / retrieve it with line state or a lexer field similar to ftripleStateAtEol.

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 15, 2022

Could I also suggest a 3rd option which controls if already styled attributes are respected?

That may be a detour. My thinking was that if users consider attributes to be different then there should be distinct list(s) of highlighted attributes. Or maybe consider identifier.attribute pairs ('codecs.open', 'pathlib.Path'). I didn't say that as there are many possibilities here making it a large topic which should be discussed in a separate issue.

Personally, I wanted this change so that lists of attributes didn't have to be maintained, I might want to change the idea behind this to be around allowing default styling of attributes. Since I was wanting to be able to imitate themes from other editors. If such a list is going to be required, please just decline this PR.

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 15, 2022

Give me a moment to figure out how I want to present this patch.

@Red-M Red-M changed the title Add attribute style for python Add automatic subword styling for python Jan 15, 2022
@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 15, 2022

That is more along the lines of what I think is being offered.

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 15, 2022

I think this is a good starting point as is and would prefer this to be accepted as is, if there are more changes you'd like, please let me know.

@nyamatongwe
Copy link
Copy Markdown
Member

(1) a 3rd option which controls if already styled attributes are respected

(2) I might want to change the idea behind this to be around allowing default styling of attributes. Since I was wanting to be able to imitate themes from other editors.

(1) implies you want attributes to be styled specially when they are included in an special identifier list but (2) says you want attributes to just be a single style that isn't affected by any lists.

If such a list is going to be required, please just decline this PR.

Did you interpret my comment as not styling attributes as attributes unless they are in a list? I meant they might be plain attributes unless in a list just like identifiers.

"Attribute" is reasonably well understood in Python but "subword" is unfamiliar. Is this commonly used?

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 15, 2022

(1) implies you want attributes to be styled specially when they are included in an special identifier list but (2) says you want attributes to just be a single style that isn't affected by any lists.

I am on the side of (2) here but I don't mind if subwords from substylewords (eg, #49 (comment) ) affecting this styling. I think both options should be offered since I've been wanting this kind of styling for years but didn't really know enough of how to implement it since I didn't know enough C to even begin to touch C++.

Did you interpret my comment as not styling attributes as attributes unless they are in a list? I meant they might be plain attributes unless in a list just like identifiers.

I did and your clarification is a relief since I really don't enjoy the idea as a developer of libraries to have to release/maintain wordlists for editors to pick up on, its kinda why I did this feature the way I did since it allowed styling identifiers differently.

"Attribute" is reasonably well understood in Python but "subword" is unfamiliar. Is this commonly used?

I used subwords from the vernacular that is in scite/scintilla vs Python, similar to lexer.python.keywords2.no.sub.identifiers or substylewords.

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 15, 2022

I am happy to revert the subword change back to attributes however.

@nyamatongwe
Copy link
Copy Markdown
Member

"substyle" is a Lexilla term for taking one style (like SCE_P_IDENTIFIER) and dynamically at runtime dividing that into sets based on lists each with its own dynamically assigned style number.

The problem it solves was that users want multiple sets of identifiers highlighted differently (sometimes one colour per library) and there may be multiple types of identifier-like lexemes in a file (in HTML: tags, attributes, JavaScript identifiers, PHP identifiers, ...). Lexilla has a fixed number of styles available and different users may want more distinctions in different types.

I think the last 2 changes on this pull request should be reverted and the property descriptions simplified to just describe their positive effect for the value 1 as they are booleans. Something like:

DefineProperty("lexer.python.identifier.attributes", &OptionsPython::identifierAttributes,
	       "Set to 1 to recognise Python identifier attributes.");

DefineProperty("lexer.python.decorator.attributes", &OptionsPython::decoratorAttributes,
	       "Set to 1 to recognise Python decorator attributes.");

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 16, 2022

I think that is fine and I'll revert those changes this afternoon.

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 17, 2022

That should be what we're looking for, after I fix the tests.

@Red-M Red-M changed the title Add automatic subword styling for python Add automatic attribute styling for python Jan 18, 2022
@nyamatongwe nyamatongwe added the python Caused by the Python lexer label Jan 19, 2022
nyamatongwe pushed a commit that referenced this pull request Jan 19, 2022
…ifier.attributes and

 lexer.python.decorator.attributes.
@nyamatongwe
Copy link
Copy Markdown
Member

Committed feature.

Credited to "Red-M". If you want a different name in the credits then reply with desired name.

@Red-M
Copy link
Copy Markdown
Contributor Author

Red-M commented Jan 19, 2022

Red_M please, I have a homepage at red-m.net

nyamatongwe added a commit that referenced this pull request Jan 20, 2022
@nyamatongwe
Copy link
Copy Markdown
Member

Red_M please

Fixed.

@nyamatongwe
Copy link
Copy Markdown
Member

Included in 5.1.5 release.

@nyamatongwe nyamatongwe closed this Feb 9, 2022
kugel- added a commit to kugel-/geany that referenced this pull request Sep 6, 2023
"attribute" is for highlighting object attributes or methods, i.e.
to style foo and bar differently in foo.bar.

Also set lexer properties to allow scintilla using different styles.

See ScintillaOrg/lexilla#49
kugel- added a commit to kugel-/geany that referenced this pull request Oct 4, 2023
"attribute" is for highlighting object attributes or methods, i.e.
to style foo and bar differently in foo.bar.

Also set lexer properties to allow scintilla using different styles.

See ScintillaOrg/lexilla#49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

python Caused by the Python lexer

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants