This is quite a basic theoretical question. I have started learning the C language and came across the topic of tokens in C.
Quoting from geeksforgeeks.org,
A token is the smallest element of a program that is meaningful to the compiler. Tokens can be classified as follows:
- Keywords
- Identifiers
- Constants
- Strings
- Special Symbols
- Operators
Why are strings considered tokens while arrays aren’t?
I tried to search for this: I googled, used ChatGPT, etc., but none of it helped.
According to ChatGPT,
In C, strings are represented as arrays of characters, with a null terminator (‘\0’) at the end to indicate the end of the string. Each character in the string is a distinct element that can be individually accessed and manipulated, making them tokens. Additionally, C provides a number of built-in functions for working with strings, such as strlen(), strcpy(), and strcat(), which operate on these individual characters.
On the other hand, arrays in C are simply a contiguous block of memory that stores a collection of values of the same data type. Unlike strings, there is no special terminator or delimiter that separates one element from another, and there are no built-in functions specifically designed to operate on individual elements within an array.
I still can’t understand why a string is a token when an array isn’t.
Sorry for my poor English.
>Solution :
Geeksforgeeks is almost as bad a source for learning as ChatGPT.
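It is true that strings in C consist of null-terminated character arrays. But what the list means by "strings" is string literals, "these things". That is, a quoted constant in the source code, used to initialize a character array or used directly as a read-only string. A string literal is a piece of source text, so it can be a token; an array is an object in the running program, not a piece of source text, so it cannot.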
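As a rough illustration of the difference between the literal (a token in the source text) and the arrays it ends up in, here is a minimal sketch. The names `read_only`, `writable`, `literal_size`, `capitalized` and `read_only_length` are made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* "hello" below is one string-literal token, however many characters it holds. */
static const char *read_only = "hello"; /* points at the literal; must not be modified */
static char writable[] = "hello";       /* array object initialized from the literal */

/* Size of the literal's array: 5 characters plus the terminating '\0'. */
size_t literal_size(void)
{
    return sizeof "hello";
}

/* Modifying the array copy is fine; modifying *read_only would be undefined behavior. */
const char *capitalized(void)
{
    writable[0] = 'H';
    return writable;
}

/* Length of the string reached through the read-only pointer. */
size_t read_only_length(void)
{
    return strlen(read_only);
}
```

Calling these from your own `main` should show that the literal occupies 6 bytes while `strlen` reports 5, and that only the array copy can safely be changed.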
Similarly, "constants" does not refer to things like const int x=1; but rather just the number part, 1. This is what formal C means when it refers to an integer constant (sometimes also called an "integer literal", although that term is, strictly speaking, not correct).
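To make that concrete, here is a small sketch of how such a declaration splits into tokens, plus a few different spellings of an integer constant. The enum names and `sum_of_spellings` are hypothetical, chosen only for this example.

```c
/* Tokens in the next line: const  int  x  =  1  ;
 * keyword, keyword, identifier, punctuator, constant, punctuator. */
const int x = 1;

/* Three spellings of the same value, each a single integer-constant token. */
enum { DECIMAL = 10, OCTAL = 012, HEX = 0xA };

int sum_of_spellings(void)
{
    return DECIMAL + OCTAL + HEX;
}
```

Note that `x` itself is an identifier token; only the `1` is a constant in the sense of the grammar.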
Note that tokens are mostly a concept that matters when writing macros; it’s not something that beginners usually have to worry about. The formal grammar (C17 6.4, "Lexical elements") divides everything in C source code into these groups/sub-chapters:
- Keywords
- Identifiers
- Universal character names
- Constants
- String literals
- Punctuators
- Header names
- Preprocessing numbers
- Comments
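As for why tokens matter in macros: the preprocessor operators `#` and `##` work on whole tokens. Below is a minimal sketch; `STR`, `CAT`, `counter1` and the three functions are names invented for this example, not anything from the standard.

```c
/* STR turns its argument's tokens into one string-literal token. */
#define STR(x) #x
/* CAT glues two tokens into a single new token. */
#define CAT(a, b) a##b

const char *stringified(void)
{
    return STR(1 + 2);   /* expands to the string literal "1 + 2" */
}

int pasted(void)
{
    return CAT(12, 34);  /* 12 and 34 merge into the single constant 1234 */
}

int counter1 = 7;

int pasted_identifier(void)
{
    return CAT(counter, 1);  /* counter and 1 merge into the identifier counter1 */
}
```

In each case the preprocessor never looks at individual characters or at arrays; it only rearranges tokens, which is why the grammar has to define what a token is.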