I am coding a simple parser in C as an assignment, and I’m trying to reuse as much code as I can from the previous one. I had the following snippets:
Tokens:
typedef struct token {
    const char *lexeme; //the actual string value of the token
    Category category; //the category of the token
} Token;

Token create_token(const char *lexeme, Category cat) {
    Token token = {.lexeme = lexeme, .category = cat};
    return token;
}
Automatas:
typedef struct automata {
    //does not directly store token but is easy to retrieve
    Category token_type; //which type of token is the automata supposed to scan
    char *scanned; //char buffer to store read characters from a string. Not realloced because it's been given reasonable capacity
    int lexeme_capacity; //capacity of scanned buffer. Dynamic change is not implemented, all tokens should be at most this length
    ...
} Automata;
Automata create_automata(int num_states, char *accepted_chars,
                         int num_accepted_states, const int *accepted_states,
                         Category token_type) {
    ...
    //create and return an automata
    Automata automata = {
        .token_type = token_type,
        //TODO malloc returns an allocated pointer so it is not null but it must take into account possible overflows
        .scanned = malloc(DEFAULT_LEXEME_LENGTH * sizeof(char)),
        .lexeme_capacity = DEFAULT_LEXEME_LENGTH,
        ...
    };
    return automata;
}
Token get_token(Automata *automata) {
    // if automata is not in accepting state, it did not recognize the lexeme
    Category category = accept(automata) ? automata->token_type : CAT_NONRECOGNIZED;
    // easy to understand if written like this
    Token value = {
        .lexeme = automata->scanned,
        .category = category,
    };
    return value;
}
From what I understand, any token I create using the get_token function will contain a char * pointing to heap memory. This worked fine for my previous assignment.
As part of the current assignment, I want to be able to read tokens from a file. Since I can’t just have a token contain a pointer to data in a file, I’ve created the following functions to create a token directly from a string held in stack memory:
Token allocate_token(const char *lexeme, size_t lexeme_len, Category cat) {
    //allocate and copy into heap memory the contents of the lexeme string
    char *heap_allocated = calloc(lexeme_len + 1, sizeof(char));
    memcpy(heap_allocated, lexeme, lexeme_len * sizeof(char));
    //create and return a token
    return create_token(heap_allocated, cat);
}

void free_token(Token *token) {
    free(&token->lexeme);
}
However, CLion warns that the free_token function is "attempting to call free on non-heap object ‘lexeme’". I can’t see how lexeme would not be a heap allocated pointer.
I want to know if I’m missing or not understanding something or if I can just ignore this warning.
>Solution :
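The warning comes from the & in free(&token->lexeme): that expression is the address of the lexeme member inside the Token, not the pointer stored in it, so free is handed an address that calloc never returned. Pass the stored pointer instead. Once you are calling free(token->lexeme), the compiler should complain that you are calling free with a const-qualified pointer, because free expects a void * argument and converting a const char * to it discards the qualifier.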
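The difference between &token->lexeme and token->lexeme can be checked directly. The following is only a sketch: points_into_token is a hypothetical helper, the int category stands in for the Category enum (which the question does not show), and comparing addresses through uintptr_t assumes a conventional flat address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct token {
    const char *lexeme;
    int category; /* stand-in for the Category enum, not shown in the question */
} Token;

/* Hypothetical helper: returns true if p points inside the storage of the
   Token object itself, i.e. p is the address of one of its members. */
static bool points_into_token(const Token *token, const void *p) {
    uintptr_t start = (uintptr_t)token;
    uintptr_t addr = (uintptr_t)p;
    return addr >= start && addr < start + sizeof *token;
}
```

For a Token t whose lexeme came from malloc or calloc, points_into_token(&t, &t.lexeme) holds, while points_into_token(&t, t.lexeme) does not: only the second expression is the value the allocator returned, so only that one may be passed to free.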
Ensuring that you call free only on a pointer to allocated memory is your responsibility, but storing that pointer in a const char * makes it harder to track the object's lifetime and to tell pointers to heap objects apart from pointers to constant data such as string literals.
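One possible way to make the ownership explicit is sketched below, under some assumptions: the token always owns its lexeme, so the member becomes a plain char *; the Category enum shown is a stand-in (CAT_EXAMPLE is hypothetical, only CAT_NONRECOGNIZED appears in the question); and a failed allocation is reported by leaving lexeme as NULL. Tokens built by get_token would need to follow the same rule, for example by copying automata->scanned.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the Category enum, whose definition the question does not show. */
typedef enum { CAT_NONRECOGNIZED, CAT_EXAMPLE } Category;

typedef struct token {
    char *lexeme;      /* owned heap copy, released by free_token */
    Category category;
} Token;

/* Copies lexeme_len chars into a new NUL-terminated heap buffer.
   On allocation failure the returned token has lexeme == NULL. */
Token allocate_token(const char *lexeme, size_t lexeme_len, Category cat) {
    assert(lexeme != NULL);
    Token token = { .lexeme = NULL, .category = cat };
    char *copy = calloc(lexeme_len + 1, sizeof *copy);
    if (copy != NULL) {
        memcpy(copy, lexeme, lexeme_len);
        token.lexeme = copy;
    }
    return token;
}

/* Frees the pointer stored in the member, not the address of the member. */
void free_token(Token *token) {
    free(token->lexeme);   /* free(NULL) is a no-op */
    token->lexeme = NULL;  /* guards against an accidental double free */
}
```

With a non-const member, free(token->lexeme) compiles without a cast, and string literals can no longer be stored in a Token without the compiler noticing.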