This article is reproduced from stackoverflow.com.
java parsing token

Are ">>"s in type parameters tokenized using a special rule?

Published on 2020-04-20 11:18:20

I'm confused by the Java spec about how this code should be tokenized:

ArrayList<ArrayList<Integer>> i;

The spec says:

The longest possible translation is used at each step, even if the result does not ultimately make a correct program while another lexical translation would.

As I understand it, applying the "longest match" rule would result in the tokens:

  • ArrayList
  • <
  • ArrayList
  • <
  • Integer
  • >>
  • i
  • ;

which would not parse. But of course this code is parsed just fine.
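To make the "longest possible translation" rule concrete, here is a minimal greedy tokenizer sketch (the class and method names are invented for illustration; this is not the real javac lexer). It always tries the longest `>`-family operator first, so `>>` comes out as a single token:

```java
import java.util.ArrayList;
import java.util.List;

public class GreedyLexer {
    // Operators starting with '>', listed longest first so maximal munch applies.
    private static final String[] GT_OPS = {">>>=", ">>>", ">>=", ">>", ">=", ">"};

    public static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            if (Character.isJavaIdentifierStart(c)) {
                // Consume a full identifier.
                int j = i;
                while (j < src.length() && Character.isJavaIdentifierPart(src.charAt(j))) j++;
                tokens.add(src.substring(i, j));
                i = j;
            } else if (c == '>') {
                // Longest possible translation: try ">>>=" before ">>>" before ">>" ...
                for (String op : GT_OPS) {
                    if (src.startsWith(op, i)) { tokens.add(op); i += op.length(); break; }
                }
            } else {
                // Treat any other character as a single-character token.
                tokens.add(String.valueOf(c));
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Produces: [ArrayList, <, ArrayList, <, Integer, >>, i, ;]
        System.out.println(tokenize("ArrayList<ArrayList<Integer>> i;"));
    }
}
```

Running it on the declaration above yields exactly the 8-token stream listed, with `>>` as one token rather than two `>` tokens.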

What is the correct specification for this case?

Does it mean that a correct lexer must be context-free? It doesn't seem possible with a regular lexer.

Questioner: Matt Fenwick
Viewed: 48

Answered by Matt Fenwick, 2013-05-29 10:03

Based on reading the code linked by @sm4, it looks like the strategy is:

  • tokenize the input normally. So A<B<C>> i; would be tokenized as A, <, B, <, C, >>, i, ; -- 8 tokens, not 9.

  • during hierarchical parsing, when the parser is working on generics and needs a >, it checks whether the next token merely starts with > -- one of >>, >>>, >=, >>=, or >>>=. If so, it knocks the leading > off and pushes the shortened token back onto the token stream. Example: when the parser reaches >>, i, ; while working on the typeArguments rule, it successfully parses typeArguments, and the remaining token stream becomes the slightly different >, i, ;, since the first > of >> was consumed to match typeArguments.

So although tokenization does happen normally, some re-tokenization occurs in the hierarchical parsing phase, if necessary.
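That splitting step can be sketched as follows (a simplified model, not javac's actual implementation; the token-stream representation and the method name `expectGt` are invented for illustration):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class TypeArgClose {
    // When the grammar needs a single '>', accept any token that starts with '>'
    // and push the shortened remainder back onto the front of the stream.
    static void expectGt(Deque<String> tokens) {
        String t = tokens.poll();
        if (t == null || !t.startsWith(">")) {
            throw new IllegalStateException("expected '>' but found " + t);
        }
        if (t.length() > 1) {
            // ">>" becomes ">", ">>>" becomes ">>", ">>>=" becomes ">>=", etc.
            tokens.addFirst(t.substring(1));
        }
    }

    public static void main(String[] args) {
        Deque<String> rest = new ArrayDeque<>(List.of(">>", "i", ";"));
        expectGt(rest); // closes the inner type argument; stream is now >, i, ;
        expectGt(rest); // closes the outer type argument; stream is now i, ;
        System.out.println(rest);
    }
}
```

With this trick the lexer stays an ordinary regular lexer; only the parser, which already has the grammatical context, ever needs to split a `>`-family token.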