Skip to content

Conversation

mmomtchev
Copy link
Contributor

@mmomtchev mmomtchev commented Mar 19, 2024

This PR adds support for C++11/C23 [[attributes]] and allows to use the existing match$ in %rename.

The C++ specifications is deliberately very vague about where this attributes can be placed. I have tried to support all cases that I have seen. All types of attributes are parsed and attached to the AST, but only declaration attributes for variables and functions are available to the end user at the moment.

In order to safeguard against eventual unexpected placement of attributes, -ignoreattrs allows to revert to the old behavior - ignoring all attributes directly in the tokenizer.

This PR also switches most of the larger bison codeblocks to use named references in order to make further modifications and especially experimenting with the grammar easier.

The new main grammar item, attribute contains %empty and has to always precede storage_class which also has %empty.

The only exception is function argument attributes which use an enumeration without %empty in order to avoid shift/reduce conflicts with template parameters which use the same items and do not support attributes.

The only other behavior change is the %rename directive where

$match$attribute$deprecated

without =value matches every occurrence of deprecated which can be [[deprecated]] but also [[deprecated("reason")]].

Res: #2837

@ojwb
Copy link
Member

ojwb commented Mar 20, 2024

I really think we want to do the grammar change to use named references as a separate commit (as I said in the issue earlier) - firstly so it's easy to verify that the generated code is unchanged by this refactor, but also to make it actually feasible to review the functional changes to the grammar for this.

@mmomtchev
Copy link
Contributor Author

I see that there is a callparms grammar item - which should probably be used instead of a new item (attribute_arg_list) with the difference being that callparms returns a string, while attribute_arg_list is a numbered hash. If I reduce the arguments to a string this will render the syntax for most attributes simpler, but won't allow matching on a single argument when having multiple arguments - which probably is not a needed feature.

@@ -105,6 +105,7 @@ static const char *usage2 = "\
-I- - Don't search the current directory\n\
-I<dir> - Look for SWIG files in directory <dir>\n\
-ignoremissing - Ignore missing include files\n\
-ignoreattrs - Ignore C++11/C23 [[attributes]]\n\
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making this a command line option seems unhelpful - it's something the user is likely to need to specify for a particular interface file (because it's something that is likely to be needed based on particular uses of attributes in third party headers they're trying to wrap), so it would be better to be able to specify it in the interface file (if it's a global setting, perhaps as a parameter on %module; if it's something that's useful to control in a more fine-grained way, perhaps as a %feature. I can see one might theoretically want it on for one wrapped header but not another, but that may be a bit of a contrived example).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did mostly because the specifications for [[attribute]] are very vague and I am afraid that now that parsing is implemented and mandatory, there might be cases where the parser wouldn't expect an [[attribute]] and this header won't pass at all in SWIG.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems reasonable to have a way to switch attribute handling back to the current "just ignore them in the lexer", I'm just saying making it a command line option seems the wrong approach for how to specify this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a parser option, it does not belong in the parsed text. Since an attribute cannot contain a %feature it can be made to work, but still, I don't think it belongs there.

@@ -22,9 +27,13 @@
#pragma warning(disable : 5030) // attribute is not recognized ('likely' and 'unlikely')
#endif

#ifndef __BIGGEST_ALIGNMENT__
#define __BIGGEST_ALIGNMENT__ 16
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defining a macro with a reserved name makes the program ill-formed so it'd be better to use e.g. BIGGEST_ALIGNMENT. If it actually matters to use __BIGGEST_ALIGNMENT__ when it is defined by the compiler we could define it using

#ifdef __BIGGEST_ALIGNMENT__
#define BIGGEST_ALIGNMENT __BIGGEST_ALIGNMENT__
#else
#define BIGGEST_ALIGNMENT 16
#endif

and then use BIGGEST_ALIGNMENT below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the only anti-pattern in the unit tests, this will only add additional clutter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change this.

int data[1] = { 0 };
int *a = data;
int b = a[a[0]];
}
Copy link
Member

@ojwb ojwb Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point here is to ensure that SWIG can parse int b = a[a[0]]; (in particular that the ]] is broken apart properly. SWIG doesn't parse function bodies, so moving this code into a function body doesn't actually test what was intended here.

(The reason this part is currently deactivated is just that the parser doesn't currently actually handle a double array dereference in an expression, but once that is fixed we want to be testing this properly.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will still have to be tokenized if attributes parsing is disabled - this is the only case when this could not work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parser just skips tokens between the { at the start of a function body and its matching closing }, so putting it in a function body will parse OK even if the ]] is scanned as a single token.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The next time you permit yourself to make a comment in order to pass a criminal message, I will seriously reconsider my decision from last year to skip sending an email to the legal department of every institution mentioned in the SWIG license files to inform them of a copyright problem involving plagiarism for a criminal extortion related to a series of falsified legal procedures in the EU. I will be completely honest and I will attest that you most probably do not have anything to do with the organized judicial and police corruption, the sexual elements, the international prostitution ring or the drug trade taking part in this affair. My patience has limits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mmomtchev, this last comment appears threatening, is not constructive and is not welcome in the collaborative SWIG development community that we strive for. @ojwb is merely trying to help review and improve your (very welcome) contribution. I don't understand the comment as it appears out of context and I hope it has been posted on the wrong issue, in which case, please withdraw it. If not and if you have some non-technical issue about SWIG that you would like to discuss, please raise a separate discussion thread. Otherwise, I ask you to kindly keep discussions on topic to C++11 attributes. Thank-you for you understanding.

@erezgeva
Copy link
Contributor

This PR adds support for C++11/C23 [[attributes]] and allows to use the existing match$ in %rename.

The C++ specifications is deliberately very vague about where this attributes can be placed. I have tried to support all cases that I have seen. All types of attributes are parsed and attached to the AST, but only declaration attributes for variables and functions are available to the end user at the moment.

In order to safeguard against eventual unexpected placement of attributes, -ignoreattrs allows to revert to the old behavior - ignoring all attributes directly in the tokenizer.

This PR also switches most of the larger bison codeblocks to use named references in order to make further modifications and especially experimenting with the grammar easier.

The new main grammar item, attribute contains %empty and has to always precede storage_class which also has %empty.

The only exception is function argument attributes which use an enumeration without %empty in order to avoid shift/reduce conflicts with template parameters which use the same items and do not support attributes.

The only other behavior change is the %rename directive where

$match$attribute$deprecated

without =value matches every occurrence of deprecated which can be [[deprecated]] but also [[deprecated("reason")]].

Res: #2837

Pardon for asking.
I understand in general the need to support C/C++ attributes.
I would like to ask some clarification questions:

  1. Do you mean parsing C/C++ code that uses the attributes or add ones to the generated code?
  2. I understand C++ 11 is the default, why C23? It seems like a big gap. Perhaps I missed it?
  3. What about newer C++ standard attributes optionally? At least as for parsing?

Erez

@mmomtchev
Copy link
Contributor Author

Attributes exist in C++ starting with C++11 and C starting with C23. The attributes themselves are not interpreted in any way - they are simply made available to the end user for matching %feature. The first use case is libraries that depend on gnu::visibility to not export certain methods/symbols.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants