Let’s say I have a C library that implements the following function:
// Returns the number of elements in the array,
// and a pointer to the first element.
// The memory pointed to has static lifetime.
size_t MyLib_GetValues(const int** outBasePtr);
And I use it like this:
const int* values = NULL;
size_t count = MyLib_GetValues(&values);
if ( values )
{
    for ( size_t index = 0; index < count; ++index )
    {
        DoSomething(values + index);
    }
}
I have a few questions about this approach, given there’s nothing to stop the user of the library iterating past the end of the array.
- Would this be classed as a security issue for MyLib? I don’t know enough about how systems handle libraries to know whether values[count] would be caught as an invalid memory access, or whether the user would just be able to read off the end of the array and obtain values from elsewhere in the library’s memory.
- If this would be considered a security issue, what would be a better approach? Should the library copy values out into a user-provided buffer? Additionally, would this mean that exposing strings directly as const char* would also be a security concern, given you can just read off the end of those as well if you want to?
My gut instinct is that whether this approach should be considered a security issue would depend on the intended use case for the library. Given the user of the library could easily look into the contents in a hex editor, there would be nothing to stop them doing that anyway. If you’re trying to hide things in memory then it’s game over in that case, and so the API design pattern doesn’t really make any difference. However, in more restricted cases where a user would only have code-based access to the library (e.g. by providing some program that links to the library at runtime), things might be different.
Is there best practice on this kind of thing from a C library design perspective?
>Solution :
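Yes, allowing the user of the library to iterate past the end of the array can be considered a security issue. Accessing values[count] is undefined behavior, which in practice can mean segmentation faults, leaked or corrupted data, and other errors that an attacker could potentially exploit. Whether such an access is caught depends on the platform and architecture: a library and its caller normally share one address space, so an out-of-bounds read typically faults only if it lands on an unmapped or protected page. Otherwise it silently returns whatever happens to sit next to the array in memory.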
A better approach would be to have the caller supply a buffer of known size, into which the library copies the array elements. The library can then guarantee that it stays within the bounds the caller stated, and the caller never holds a pointer into the library's internal memory, so any overrun touches only the caller's own buffer. Alternatively, the library can return a dynamically allocated buffer that the user is responsible for freeing after use. However, the latter approach requires more careful memory management on the part of the user and may be less efficient.
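As a minimal sketch of the caller-supplied buffer approach: the function name MyLib_CopyValues and the g_values table below are hypothetical stand-ins, not part of the original API. The function returns the total number of values available, so the caller can detect truncation or query the required size first.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical internal table; stands in for the library's static data. */
static const int g_values[] = { 10, 20, 30, 40 };
#define VALUE_COUNT (sizeof(g_values) / sizeof(g_values[0]))

/* Hypothetical copy-out variant of MyLib_GetValues.
 * Copies at most `capacity` elements into `outBuffer` and returns the
 * total number of values available. Passing NULL (or a capacity of 0)
 * copies nothing and only reports the count. */
size_t MyLib_CopyValues(int* outBuffer, size_t capacity)
{
    if (outBuffer != NULL && capacity > 0)
    {
        size_t toCopy = capacity < VALUE_COUNT ? capacity : VALUE_COUNT;
        memcpy(outBuffer, g_values, toCopy * sizeof(g_values[0]));
    }
    return VALUE_COUNT;
}
```

A caller would typically call MyLib_CopyValues(NULL, 0) to learn the count, allocate a buffer of that size, and call again. If the returned count is larger than the capacity passed in, the output was truncated.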
Regarding const char* strings, the same issue applies: the length is implied only by the null terminator, and nothing stops a caller from reading beyond it. It is generally recommended to copy string data into a caller-provided buffer or a dynamically allocated buffer, rather than exposing the internal memory directly.
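The same pattern can be sketched for strings. MyLib_CopyName and g_name here are hypothetical names used only for illustration. The sketch relies on the standard snprintf behavior of null-terminating the output (when the capacity is non-zero) and returning the length the full string would have needed.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical internal string; stands in for the library's static data. */
static const char g_name[] = "MyLib example";

/* Hypothetical copy-out string accessor.
 * Writes at most capacity - 1 characters plus a null terminator into
 * `outBuffer`, and returns the full length of the string (excluding the
 * terminator). A return value >= capacity means the copy was truncated. */
size_t MyLib_CopyName(char* outBuffer, size_t capacity)
{
    if (outBuffer == NULL)
    {
        capacity = 0; /* query-only mode: snprintf accepts NULL when size is 0 */
    }
    int written = snprintf(outBuffer, capacity, "%s", g_name);
    return written < 0 ? 0 : (size_t)written;
}
```

Returning the full length rather than the number of characters copied lets the caller size a second, larger buffer without a separate length query.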
You are correct that the degree of security risk depends on the intended use case of the library. If the library is used in an environment where the user has unrestricted access to the memory, there may be limited benefit to implementing safeguards against iterating past the end of the array or string. However, in other environments, such as when the library is used in a sandboxed or constrained environment, or when it is used as part of a larger system with security requirements, such safeguards can be critical.
Best practices for C library design typically include:
- Clearly defining the API contract, including function signatures, expected inputs and outputs, and behavior in edge cases.
- Avoiding exposing internal data structures or implementation details directly to the user, to maintain encapsulation and flexibility.
- Providing safe defaults and sensible error handling mechanisms to prevent unexpected behavior and facilitate debugging.
- Using appropriate memory management techniques, such as accepting caller-provided buffers or dynamically allocating memory, and clearly specifying ownership and responsibility for freeing memory.
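One way these points could combine is a bounds-checked accessor that never hands out a pointer into the library's memory. This is only a sketch: MyLib_GetValueCount, MyLib_GetValueAt, and g_values are hypothetical names, and the contract (return false on a bad index or NULL output, leaving the output untouched) is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical internal table; stands in for the library's static data. */
static const int g_values[] = { 10, 20, 30, 40 };
#define VALUE_COUNT (sizeof(g_values) / sizeof(g_values[0]))

/* Returns the number of values the library holds. */
size_t MyLib_GetValueCount(void)
{
    return VALUE_COUNT;
}

/* Copies the value at `index` into *outValue.
 * Returns false, leaving *outValue unmodified, if `index` is out of
 * range or `outValue` is NULL. */
bool MyLib_GetValueAt(size_t index, int* outValue)
{
    if (outValue == NULL || index >= VALUE_COUNT)
    {
        return false;
    }
    *outValue = g_values[index];
    return true;
}
```

The trade-off is one function call per element instead of direct pointer access, in exchange for an out-of-range index becoming a reportable error rather than undefined behavior.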