Add departure_occupancy_status to TripUpdate#260
Conversation
The idea behind this proposal is that for the user, the occupancy of a vehicle several stops away (at its VehiclePosition) may be difficult to act upon, because occupancy will change by the time the vehicle arrives at the stop where the user intends to board. This new field allows agencies to provide expected occupancy for future stops. It is left open for agencies to decide how to calculate these occupancies, they may use any model provided it uses realtime data. There is no prohibition on providing data for past stops or for the last stop, although such data is not useful for the use case described. The natural place in GTFS for these occupancies is in TripUpdate.StopTimeUpdate - they are analogous to StopTimeEvents. One drawback is that the name StopTimeUpdate is potentially then slightly misleading because its data is not just about timing. If google#248 is adopted, that default would apply here as well.
|
When I read the topic I thought this ment: the crowdedness at the stop. I think it would be good to change the wording to prevent this ambiguity. |
|
I used StopOccupancy, since it's analogous to StopTimeUpdate - except there might not be a previous value so it isn't an update. |
|
How will rider know if the prediction for the occupancy at each trip stop is arrival or departure based? Shouldn't there be a separate occupancy prediction for both arrival and departure StopTimeEvent (or put the occupancy prediction inside StopTimeEvent). |
barbeau
left a comment
There was a problem hiding this comment.
Thanks @jakehoare! Some of my thoughts are in-line below.
| optional VehiclePosition.OccupancyStatus occupancy_status = 3; | ||
| } | ||
|
|
||
| optional StopOccupancy stop_occupancy = 7; |
There was a problem hiding this comment.
However then it's necessary to specify time updates in order to specify occupancy updates - this proposal makes times and occupancies independent.
There is precedent for specifying other stop attributes within a StopTimeUpdate without specifying predictions. #219 introduced dynamic stop assignments, which says:
"To assign a stop without providing any real-time arrival or departure predictions, populate stop_time_properties.assigned_stop_id and set StopTimeUpdate.schedule_relationship = NO_DATA."
We could do something similar here if we want to include the new data within StopTimeUpdate.
I'd be more inclined to bundle the data within StopTimeUpdate and not have a separate array here. Having another array of values that have to be matched to stops and can be keyed on stop_id or stop_sequence seems like it's inviting trouble, as that's already a point of error in feeds. If this stays as a separate array, I'd strongly recommend that it be keyed only on stop_sequence and omit stop_id.
There was a problem hiding this comment.
Also, this would need to change from optional to repeated if it stays as a separate array.
There was a problem hiding this comment.
Thanks. That makes sense to me. I suppose the negatives are having to provide empty StopTimeUpdates in order to provide occupancy predictions (which is probably a niche case), and the name StopTimeUpdate possibly implying that it's about time and not occupancy.
They are not show-stoppers to me. Particularly if there is precedent, then a separate repeated looks good. I'll propose some changes after others have had a chance to comment on this and the other points under discussion.
| } | ||
| optional TripProperties trip_properties = 6; | ||
|
|
||
| // Realtime occupancy immediately after departure for a given stop on a |
There was a problem hiding this comment.
Is there a reason to allow data for past events?
I'd be inclined to have stronger language that requires this to be a predicted future value immediately after departure for the given stop. One sticking point for predictions currently, especially surrounding TripUpdate.delay, is that it's not clear if the value is observed fact or a prediction about the future. I'd prefer to have stricter guidance here to know exactly what we're seeing in feeds. It seems to me if you just want real-time or past info, you should be looking at VehiclePosition.occupancy_status. This new field should include some predictive intelligence that presumably adds value over the data in VehiclePosition.
There was a problem hiding this comment.
The reason was only consistency with other existing fields of TripUpdate. But it would be fine to change to "future only", as you mention.
That would imply consistency with VehiclePosition.current_stop_sequence, or allow current_stop_sequence to be inferred if it isn't set.
| optional string stop_id = 2; | ||
|
|
||
| // Expected occupancy after departure from the given stop. | ||
| optional VehiclePosition.OccupancyStatus occupancy_status = 3; |
There was a problem hiding this comment.
tl;dr - There is a lot of discussion if OccupancyStatus is the right way forward for occupancy. See #223 for discussion vs occupancy_percentage and an older proposal to revise OccupancyStatus at #212. There is also a newer draft proposal at TransitApp#3 to define a fixed linear scale for occupancy (CrowdLevel), which could deprecate OccupancyStatus.
All or some of these (OccupancyStatus, occupancy_percentage, CrowdLevel) could be used here.
There was a problem hiding this comment.
That's fine, the intention here is to use the standard unit of occupancy that applies to the rest of GTFS, whatever that may change to in the future.
If occupancy on departure from stop n is equal to occupancy on arrival at stop n+1 (i.e. assume nobody get on or off between stops) then clients of the data can use occupancy on departure as occupancy on arrival at the next stop. So 2 fields would repeat the same data, except shifted by 1 stop. |
Amend to put departure_occupancy_status inside StopTimeUpdate
Revise spec for departure_occupancy_status inside StopTimeEvent instead of separate message.
|
I have revised this in line with comments above - mainly to put the new occupancy inside StopTimeUpdate (i.e. using the same stop_sequence or stop_id), instead of a separate message. Note that ScheduleRelationship.NO_DATA could be misleading in that times cannot be provided but occupancies could. |
|
Sorry I'm a bit late to the discussion as Jake has called for a vote on this PR. But there is a seeming discrepancy. The departure_occupancy_status is marked as optional in the proto file but as Required in the documentation. Is the documentation incorrect? |
barbeau
left a comment
There was a problem hiding this comment.
@jakehoare Some comments in-line.
For those not following the Google Group, the vote was called at https://groups.google.com/forum/#!topic/gtfs-realtime/GQFcNL8Mlds (in the future please call it here too). There are going to need to be some updates to the PR to clarify exactly what's being proposed, so I'd suggest re-starting the vote after everything is set.
Also, please clarify if this is a vote for adoption as an RT experimental field (i.e., we don't currently have any producers or consumers) or official field (requires producers and consumers). See https://github.com/google/transit/blob/master/gtfs-realtime/CHANGES.md#experimental-fields.
gtfs-realtime/spec/en/reference.md
Outdated
| | **stop_id** | [string](https://developers.google.com/protocol-buffers/docs/proto#scalar) | Conditionally required | One | Must be the same as in stops.txt in the corresponding GTFS feed. Either stop_sequence or stop_id must be provided within a StopTimeUpdate - both fields cannot be empty. If `StopTimeProperties.assigned_stop_id` is populated, it is preferred to omit `stop_id` and use only `stop_sequence`. If `StopTimeProperties.assigned_stop_id` and `stop_id` are populated, `stop_id` must match `assigned_stop_id`. | | ||
| | **arrival** | [StopTimeEvent](#message-stoptimeevent) | Conditionally required | One | If schedule_relationship is empty or SCHEDULED, either arrival or departure must be provided within a StopTimeUpdate - both fields cannot be empty. arrival and departure may both be empty when schedule_relationship is SKIPPED. If schedule_relationship is NO_DATA, arrival and departure must be empty. | | ||
| | **departure** | [StopTimeEvent](#message-stoptimeevent) | Conditionally required | One | If schedule_relationship is empty or SCHEDULED, either arrival or departure must be provided within a StopTimeUpdate - both fields cannot be empty. arrival and departure may both be empty when schedule_relationship is SKIPPED. If schedule_relationship is NO_DATA, arrival and departure must be empty. | | ||
| | **departure_occupancy_status** | [OccupancyStatus](#enum-occupancystatus) | Required | One | The state of passenger occupancy for the vehicle immediately after departure from the given stop. | |
There was a problem hiding this comment.
This can't be Required in the spec docs because of backwards compatibility (it would declare existing implementations without it non-compliant). It should be Optional. It also needs to be labeled as an experimental field (see other experimental fields).
There was a problem hiding this comment.
I would also add that if this field is provided, stop_sequence must be populated, and add "To provide occupancy without providing any real-time arrival or departure predictions, populate this field and set StopTimeUpdate.schedule_relationship = NO_DATA."
There was a problem hiding this comment.
Should this be "The predicted state of passenger..."?
There was a problem hiding this comment.
IMHO, CrowdLevel as proposed in #261 is a better solution than OccupancyStatus. We already know there are problems interpreting OccupancyStatus across multiple producers, and IMHO rather than proliferating those we should switch to CrowdLevel. I'd like to hear from more producers, and consumers, on this topic though.
There was a problem hiding this comment.
I see CrowdLevel vs OccupancyStatus as a separate issue that should be resolved by #261. I think we can move ahead with predicted occupancy using OccupancyStatus at this time and then add CrowdLevel once #261 is hashed out. I don't see us removing OccupancyStatus as part of #261 so I think it would be acceptable to also have two values for predicted occupancy. At that time I would also like to see occupancy_percentage be discussed as a possible part of predicted occupancy since the enumerated types are likely too limited to be of universal use.
There was a problem hiding this comment.
Amended for the first 3 comments above.
For CrowdLevel vs OccupancyStatus, I'd prefer to separate that out from this and use OccupancyStatus here since that's currently used in VehiclePosition, then update this when/if VP changes.
|
+1 I think the changes to the documentation are good and that other issues. like departure_crowd_level, should be dealt with separately. |
|
This pull request has been open for more than one week, so per the Official Process I'm calling for a vote. Vote will be closed on Friday March 26th at 23:59:59 UTC. You can vote with a +1 (in favor) or -1 (against) in the comments here. Thanks |
|
+1 |
|
+1 With departure_crowd_level being added eventually. |
|
+1 from Transit 👏 |
|
+1 from Trafiklab/Samtrafiken |
|
+1 from IBI Group And we agree with the comment above from @jakehoare that “the intention here is to use the standard unit of occupancy that applies to the rest of GTFS, whatever that may change to in the future” |
gtfs-realtime/spec/en/reference.md
Outdated
| | **SCHEDULED** | The vehicle is proceeding in accordance with its static schedule of stops, although not necessarily according to the times of the schedule. This is the **default** behavior. At least one of arrival and departure must be provided. Frequency-based trips (GTFS frequencies.txt with exact_times = 0) should not have a SCHEDULED value and should use UNSCHEDULED instead. | | ||
| | **SKIPPED** | The stop is skipped, i.e., the vehicle will not stop at this stop. Arrival and departure are optional. When set `SKIPPED` is not propagated to subsequent stops in the same trip (i.e., the vehicle will stop at subsequent stops in the trip unless those stops also have a `stop_time_update` with `schedule_relationship: SKIPPED`). Delay from a previous stop in the trip *does* propagate over the `SKIPPED` stop. In other words, if a `stop_time_update` with an `arrival` or `departure` prediction is not set for a stop after the `SKIPPED` stop, the prediction upstream of the `SKIPPED` stop will be propagated to the stop after the `SKIPPED` stop and subsequent stops in the trip until a `stop_time_update` for a subsequent stop is provided. | | ||
| | **NO_DATA** | No data is given for this stop. It indicates that there is no realtime information available. When set NO_DATA is propagated through subsequent stops so this is the recommended way of specifying from which stop you do not have realtime information. When NO_DATA is set neither arrival nor departure should be supplied. | | ||
| | **NO_DATA** | No data is given for this stop. It indicates that there is no realtime information available. When set NO_DATA is propagated through subsequent stops so this is the recommended way of specifying from which stop you do not have realtime timing information. When NO_DATA is set neither arrival nor departure should be supplied. | |
There was a problem hiding this comment.
Should the first "realtime" here be "realtime timing"?
There was a problem hiding this comment.
Yes true, amended.
|
Thank you all for participating! The vote ended on Friday March 26th at 23:59:59 UTC with a unanimous consensus of 6 yes votes from OpenGeo (@skinkie), Swiftly (@mike-swiftly), Transit (@juanborre), RTL (@ As per the specification amendment process, this proposal is now accepted! |
|
@jakehoare Looks like there are conflicts with the master branch - could you please resolve those, and then I can merge? |
|
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. ℹ️ Googlers: Go here for more info. |
|
@googlebot I fixed it. |
|
@barbeau Ah yes, resolved conflicts now. |
The idea behind this proposal is that for the user, the occupancy of a vehicle several stops away (supplied in VehiclePosition) may be difficult to act upon, because occupancy will change by the time the vehicle arrives at the stop where the user intends to board.
This new field allows agencies to provide expected occupancy for future stops.
It is left open for agencies to decide how to calculate these occupancies, they may use any model provided it uses realtime data.
The natural place in GTFS for these occupancies is in TripUpdate.StopTimeUpdate - they are analogous to StopTimeEvents.
If #248 is adopted, that default would apply here as well.
Thoughts, comments?