[RFC] Add the core.sentinel module #2123
Thanks for your pull request and interest in making D better, @marler8997! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please see CONTRIBUTING.md for more information. If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.
Bugzilla references
Your PR doesn't reference any Bugzilla issue. If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.
Aside from whether these are a good idea in general, that won't work for C bindings. The C bindings must match the C declarations. And we don't do wrappers for the C bindings in druntime.
C doesn't mangle their function names...so the bindings should just work :) Also I'm not proposing adding wrappers, the proposal is to use
extern(C) size_t strlen(const(char)* str); // Current
extern(C) size_t strlen(cstring str); // Proposed
// NOTE: cstring is an alias to SentinelPtr!(const(char))
I believe we could also make this work for
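Whether a binding like this "just works" depends on the platform's C calling convention passing a struct that wraps a single pointer the same way as a bare pointer. Below is a minimal sketch of that assumption. `PtrWrapper` is a hypothetical stand-in for illustration (not the PR's `SentinelPtr`), and the call is only expected to behave on ABIs, such as the common x86-64 ones, where the two are passed identically.

```d
// Hypothetical stand-in for SentinelPtr!(const(char)): a struct holding only a pointer.
struct PtrWrapper
{
    const(char)* ptr;
}

// The layout matches a raw pointer; this part is verified at compile time.
static assert(PtrWrapper.sizeof == (const(char)*).sizeof);
static assert(PtrWrapper.alignof == (const(char)*).alignof);

// Bind C's strlen with the wrapper type in place of const(char)*.
// ASSUMPTION: the target ABI passes PtrWrapper exactly like const(char)*.
extern(C) size_t strlen(PtrWrapper s) pure nothrow @nogc;

unittest
{
    // String literals are null-terminated, so .ptr points at a valid C string.
    assert(strlen(PtrWrapper("hello".ptr)) == 5);
}
```

The static asserts hold by construction; the by-value call is the part that relies on the calling convention, and that is the point the objection about matching the C declarations is raising.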
Also if this was integrated, at some point we would probably want to change the type of string literals from
Note this extra change to the "string literal" type may require a DIP; however, it can be done as an addition after adding the types to druntime first. However, I wouldn't go around changing all the types in druntime to use
Also note that if the string literal type was changed to
if the given array does not contain the sentinel value at `array.ptr[array.length]`.
*/
this(T[] array)
in { assert(array.ptr[array.length] == sentinelValue,
- Using a contract/assert means this is only enforced in debug builds. This does not improve the safety of the program in terms of the @safe attribute.
- You're going out of bounds here. You can't assume that array.ptr[array.length] is supposed to be a sentinel, even if it happens to have the right value.
Same throughout.
- Using a contract/assert means this is only enforced in debug builds. This does not improve the safety of the program in terms of the @safe attribute.
I'm not sure, you could either go with assert or enforce. It's synonymous with array bounds checking, you may only want it turned on in debug mode. Maybe this check should be "tied into" the same setting, only enabled when array bounds checking is enabled.
- You're going out of bounds here. You can't assume that array.ptr[array.length] is supposed to be a sentinel, even if it happens to have the right value.
Right, that's the point of this type. You can't go outside of the bounds of an array (or a "slice") unless you know the array also owns the elements you are checking. The point of SentinelArray is that you are explicitly declaring it "owns" the sentinel element at array.ptr[array.length].
The particular constructor you've commented on is "coercing" a normal array to a SentinelArray, and whenever you convert from one type to another, it is up to the developer to make sure the conversion is valid; there's nothing the compiler can do about that. However, if SentinelArray is builtin, meaning that string literals will already be of type SentinelArray, then this coercion will only be used as much as normal casting (and should only be allowed in @system/@trusted code).
Note: after your comment I've added @system to these constructors to indicate the operation is not @safe.
It's synonymous with array bounds checking, you may only want it turned on in debug mode.
Note that @safe needs bounds checking to work, which is why -release keeps bounds checking on in @safe code.
The point of SentinelArray is that you are explicitly declaring it "owns" the sentinel element at array.ptr[array.length].
[...]
it is up to the developer to make sure the conversion is valid
So getting anything @safe is not the plan here? It's just about adding sanity checks to @system code?
How about expecting the sentinel in arr[$ - 1] (or maybe anywhere in arr)? With that and with a run-time check that's always there, you could maybe make it impossible to construct an invalid SentinelArray. Then it could be @safe:
extern(C) cstring getenv(cstring name) @safe;
extern(C) int puts(cstring s) @safe;
/* No idea if the char*s are really the only thing keeping these functions from being @safe. */
void main() @safe
{
    cstring p = getenv("PATH\0".asSentinelPtr);
    puts(p);
}
(And if "PATH" were already a SentinelArray as you're planning, it wouldn't need the ugly explicit terminator.)
So getting anything @safe is not the plan here? It's just about adding sanity checks to @system code?
The grand plan would be to change the string literal type to SentinelString so you wouldn't have to "coerce" them. So it should all work in @safe code. The only code that would not be safe would be converting normal D arrays to SentinelArray, since there's no way to know if an array owns the value array[$]. Actually I also added the StringLiteral template, which is @trusted, so you could also use that in @safe code, i.e.
puts(StringLiteral!"hello"); // safe
How about expecting the sentinel in arr[$ - 1] (or maybe anywhere in arr)? With that and with a run-time check that's always there, you could maybe make it impossible to construct an invalid SentinelArray. Then it could be @safe:
Good idea. This is an additional use case that could be used in @safe. I'll see if I can find a good way to add this in.
I know I already answered this, but your getenv example could currently be written as:
cstring p = getenv(StringLiteral!"PATH");
Ok here it is:
/**
This function converts an array to a SentinelArray. It requires that the last element `array[$-1]`
be equal to the sentinel value. This differs from the function `asSentinelArray` which requires
the first value outside of the bounds of the array `array[$]` to be equal to the sentinel value.
This function does not require the array to "own" elements outside of its bounds.
*/
@property auto reduceToSentinelArray(T)(T[] array) @trusted
in {
    assert(array.length > 0);
    assert(array[$ - 1] == defaultSentinel!T);
} do
{
    return asSentinelArrayUnchecked(array[0 .. $-1]);
}
///
@safe unittest
{
    auto s = "abc\0".reduceToSentinelArray;
    assert(s.length == 3);
    () @trusted {
        assert(s.ptr[s.length] == '\0');
    }();
}
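For comparison, here is a minimal sketch of the "run-time check that's always there" variant suggested earlier in this thread. The names `CheckedSentinelString` and `checkedReduce` are hypothetical and not part of the PR. It assumes immutable input (`string`), so the terminator cannot be overwritten through another reference after the check, and it uses `assert(0)`, which is retained in `-release` builds, unlike an `in` contract.

```d
// Hypothetical stand-in for the PR's SentinelArray, restricted to immutable strings.
struct CheckedSentinelString
{
    // Excludes the terminator; data.ptr[data.length] is '\0' when built by checkedReduce.
    private string data;

    size_t length() const @safe pure nothrow @nogc { return data.length; }
    string array() const @safe pure nothrow @nogc { return data; }
    // Taking .ptr of a slice is not allowed in @safe code, hence @trusted.
    immutable(char)* ptr() const @trusted pure nothrow @nogc { return data.ptr; }
}

// Always-checked conversion: the terminator must be the last element of arr.
CheckedSentinelString checkedReduce(string arr) @safe pure nothrow @nogc
{
    // assert(0) is kept even with -release, so the check cannot be compiled out.
    if (arr.length == 0 || arr[$ - 1] != '\0')
        assert(0, "array must end with the sentinel value");
    return CheckedSentinelString(arr[0 .. $ - 1]);
}

@safe unittest
{
    auto s = checkedReduce("abc\0");
    assert(s.length == 3);
    assert(s.array == "abc");
    () @trusted {
        assert(s.ptr[s.length] == '\0');
    }();
}
```

A default-initialized `CheckedSentinelString` would still hold a null pointer, so this is a sketch of the idea rather than a complete safety argument.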
@@ -0,0 +1,109 @@
/*
Shouldn't this file be part of the benchmark suite?
I wasn't planning on keeping it in the finalized change. Where is the benchmark suite? Maybe it would make sense to add this to it.
(though that one is specifically about comparing different druntime versions, but I think just moving the file into this folder should be good enough as it will be a lot easier to find when it's checked in the source code.)
    ConstPtr asConst;
}
alias asConst this; // facilitates implicit conversion to const type
// alias ptr this; // NEED MULTIPLE ALIAS THIS!!!
dlang/dmd#3998 though it will require quite an effort to revive this.
src/core/sentinel.d
Outdated
Returns:
    the length of the array
*/
size_t findLength() const
In phobos this is called walkLength
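For context, a function like the `findLength` under review has to walk the pointer until it reaches the sentinel, which is why the Phobos name (an O(n) walk rather than a stored length) is a natural fit. A minimal sketch, where the helper `lengthBySentinel` and its default sentinel of zero are assumptions made for illustration, not the PR's code:

```d
// Hypothetical helper: counts the elements that precede the sentinel.
// The caller must guarantee that a sentinel exists, so this is @system.
size_t lengthBySentinel(T)(const(T)* p, const T sentinel = cast(T) 0)
    @system pure nothrow @nogc
{
    size_t n = 0;
    while (p[n] != sentinel)
        ++n;
    return n;
}

unittest
{
    // String literals are null-terminated, so walking from .ptr is valid.
    assert(lengthBySentinel!char("abc".ptr) == 3);

    immutable int[] xs = [7, 8, 0];
    assert(lengthBySentinel!int(xs.ptr) == 2);
}
```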
REQUEST FOR COMMENT
In D we have APIs that accept both D arrays and pointers with sentinel values (https://en.wikipedia.org/wiki/Sentinel_value). The classic example is the string, where some functions accept D's string type and others accept null-terminated strings (i.e. the C standard library functions). This change proposes the addition of the SentinelPtr and SentinelArray types.
A SentinelPtr is a pointer to an array that contains a sentinel value to terminate its elements (i.e. a C string). A SentinelArray is a D array whose pointer is a SentinelPtr. D's "string literal" is an example of a SentinelArray since it contains both a length and guarantees null-termination. Note that the new cstring type (an alias to SentinelPtr!(const(char))) is an exact definition of "c-style" strings. This allows the semantics of c-strings to exist within the D type system, which comes with a lot of advantages (see below).
Adding a SentinelPtr and SentinelArray type would allow an API to explicitly declare when a sentinel value is required. This allows the type system to enforce this requirement rather than relying on documentation/trust that the user only passes sentinel pointers when they are required, i.e.
Expanding on the example a bit further, the correct thing to do there would have been:
This pattern is seen all over phobos. The problem here is that if bar had been called with a string literal in the first place, then the copy in tempCString would have been completely unnecessary, but the bar function has no way of knowing whether or not s already has a sentinel value. This is why SentinelArray comes in handy because now bar can know this, i.e.
It's important to note that SentinelArray allows the type to indicate that the array not only "owns" all the elements in array[0..$], it also owns the sentinel element array[$]. The array type itself has no way to indicate this.
One reason for adding this to druntime (instead of phobos) is so that it could take advantage of these types in its C bindings. In cases where functions currently use pointers to accept sentinel arrays, they can be replaced with SentinelPtr to explicitly declare this requirement.
The overall impact should be an increase in safety, better template introspection, and the opportunity to avoid unnecessary copies/allocations of arrays in certain situations. Safety would be increased because functions can declare when sentinel values are required, allowing compile-time and runtime checks to enforce this. This also allows templates to take advantage of a common type to know when an array has a sentinel value. This can result in selecting more optimized algorithms when they apply (see https://www.youtube.com/watch?v=AxnotgLql0k&t=14m41s). It also allows functions to omit an extra copy/allocation when a sentinel array is required. The classic example of this is usage of the tempCString function, which becomes a no-op for SentinelArray!char.
Note that with this change the string literal type could be redefined as SentinelArray!(immutable(char)) (whose alias is SentinelString), which would explicitly guarantee it is null-terminated, allowing functions to know this as well. This prevents the need to copy the string to a temporary buffer when it needs to be passed to a function accepting "c-style" strings.