Skip to content

Add departure_occupancy_status to TripUpdate#260

Merged
barbeau merged 10 commits intogoogle:masterfrom
jakehoare:patch-1
Apr 14, 2021
Merged

Add departure_occupancy_status to TripUpdate#260
barbeau merged 10 commits intogoogle:masterfrom
jakehoare:patch-1

Conversation

@jakehoare
Copy link
Contributor

@jakehoare jakehoare commented Feb 9, 2021

The idea behind this proposal is that for the user, the occupancy of a vehicle several stops away (supplied in VehiclePosition) may be difficult to act upon, because occupancy will change by the time the vehicle arrives at the stop where the user intends to board.

This new field allows agencies to provide expected occupancy for future stops.
It is left open for agencies to decide how to calculate these occupancies, they may use any model provided it uses realtime data.

The natural place in GTFS for these occupancies is in TripUpdate.StopTimeUpdate - they are analogous to StopTimeEvents.

If #248 is adopted, that default would apply here as well.

Thoughts, comments?

The idea behind this proposal is that for the user, the occupancy of a vehicle several stops away (at its VehiclePosition) may be difficult to act upon, because occupancy will change by the time the vehicle arrives at the stop where the user intends to board.

This new field allows agencies to provide expected occupancy for future stops.
It is left open for agencies to decide how to calculate these occupancies, they may use any model provided it uses realtime data.

There is no prohibition on providing data for past stops or for the last stop, although such data is not useful for the use case described.

The natural place in GTFS for these occupancies is in TripUpdate.StopTimeUpdate - they are analogous to StopTimeEvents. One drawback is that the name StopTimeUpdate is potentially then slightly misleading because its data is not just about timing.

If google#248 is adopted, that default would apply here as well.
@google-cla google-cla bot added the cla: yes label Feb 9, 2021
@skinkie
Copy link
Contributor

skinkie commented Feb 9, 2021

When I read the topic I thought this ment: the crowdedness at the stop. I think it would be good to change the wording to prevent this ambiguity.

@jakehoare
Copy link
Contributor Author

I used StopOccupancy, since it's analogous to StopTimeUpdate - except there might not be a previous value so it isn't an update.
But a clarification would be fine.
e.g.. TripStopOccupancy? or OccupancyOnTrip?

@mpaine-act
Copy link

mpaine-act commented Feb 9, 2021

How will rider know if the prediction for the occupancy at each trip stop is arrival or departure based? Shouldn't there be a separate occupancy prediction for both arrival and departure StopTimeEvent (or put the occupancy prediction inside StopTimeEvent).

Copy link
Contributor

@barbeau barbeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jakehoare! Some of my thoughts are in-line below.

optional VehiclePosition.OccupancyStatus occupancy_status = 3;
}

optional StopOccupancy stop_occupancy = 7;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However then it's necessary to specify time updates in order to specify occupancy updates - this proposal makes times and occupancies independent.

There is precedent for specifying other stop attributes within a StopTimeUpdate without specifying predictions. #219 introduced dynamic stop assignments, which says:

"To assign a stop without providing any real-time arrival or departure predictions, populate stop_time_properties.assigned_stop_id and set StopTimeUpdate.schedule_relationship = NO_DATA."

We could do something similar here if we want to include the new data within StopTimeUpdate.

I'd be more inclined to bundle the data within StopTimeUpdate and not have a separate array here. Having another array of values that have to be matched to stops and can be keyed on stop_id or stop_sequence seems like it's inviting trouble, as that's already a point of error in feeds. If this stays as a separate array, I'd strongly recommend that it be keyed only on stop_sequence and omit stop_id.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this would need to change from optional to repeated if it stays as a separate array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. That makes sense to me. I suppose the negatives are having to provide empty StopTimeUpdates in order to provide occupancy predictions (which is probably a niche case), and the name StopTimeUpdate possibly implying that it's about time and not occupancy.
They are not show-stoppers to me. Particularly if there is precedent, then a separate repeated looks good. I'll propose some changes after others have had a chance to comment on this and the other points under discussion.

}
optional TripProperties trip_properties = 6;

// Realtime occupancy immediately after departure for a given stop on a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to allow data for past events?

I'd be inclined to have stronger language that requires this to be a predicted future value immediately after departure for the given stop. One sticking point for predictions currently, especially surrounding TripUpdate.delay, is that it's not clear if the value is observed fact or a prediction about the future. I'd prefer to have stricter guidance here to know exactly what we're seeing in feeds. It seems to me if you just want real-time or past info, you should be looking at VehiclePosition.occupancy_status. This new field should include some predictive intelligence that presumably adds value over the data in VehiclePosition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason was only consistency with other existing fields of TripUpdate. But it would be fine to change to "future only", as you mention.
That would imply consistency with VehiclePosition.current_stop_sequence, or allow current_stop_sequence to be inferred if it isn't set.

optional string stop_id = 2;

// Expected occupancy after departure from the given stop.
optional VehiclePosition.OccupancyStatus occupancy_status = 3;
Copy link
Contributor

@barbeau barbeau Feb 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tl;dr - There is a lot of discussion if OccupancyStatus is the right way forward for occupancy. See #223 for discussion vs occupancy_percentage and an older proposal to revise OccupancyStatus at #212. There is also a newer draft proposal at TransitApp#3 to define a fixed linear scale for occupancy (CrowdLevel), which could deprecate OccupancyStatus.

All or some of these (OccupancyStatus, occupancy_percentage, CrowdLevel) could be used here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine, the intention here is to use the standard unit of occupancy that applies to the rest of GTFS, whatever that may change to in the future.

@jakehoare
Copy link
Contributor Author

How will rider know if the prediction for the occupancy at each trip stop is arrival or departure based? Shouldn't there be a separate occupancy prediction for both arrival and departure StopTimeEvent (or put the occupancy prediction inside StopTimeEvent).

If occupancy on departure from stop n is equal to occupancy on arrival at stop n+1 (i.e. assume nobody get on or off between stops) then clients of the data can use occupancy on departure as occupancy on arrival at the next stop. So 2 fields would repeat the same data, except shifted by 1 stop.
It's up to clients of the data to convey to users if they are showing departure or arrival occupancy.

Amend to put departure_occupancy_status inside StopTimeUpdate
@jakehoare jakehoare changed the title Add StopOccupancy to TripUpdate Add departure_occupancy_status to TripUpdate Mar 11, 2021
Revise spec for departure_occupancy_status inside StopTimeEvent instead of separate message.
@jakehoare
Copy link
Contributor Author

I have revised this in line with comments above - mainly to put the new occupancy inside StopTimeUpdate (i.e. using the same stop_sequence or stop_id), instead of a separate message.

Note that ScheduleRelationship.NO_DATA could be misleading in that times cannot be provided but occupancies could.

@mike-swiftly
Copy link

Sorry I'm a bit late to the discussion as Jake has called for a vote on this PR. But there is a seeming discrepancy. The departure_occupancy_status is marked as optional in the proto file but as Required in the documentation. Is the documentation incorrect?

Copy link
Contributor

@barbeau barbeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakehoare Some comments in-line.

For those not following the Google Group, the vote was called at https://groups.google.com/forum/#!topic/gtfs-realtime/GQFcNL8Mlds (in the future please call it here too). There are going to need to be some updates to the PR to clarify exactly what's being proposed, so I'd suggest re-starting the vote after everything is set.

Also, please clarify if this is a vote for adoption as an RT experimental field (i.e., we don't currently have any producers or consumers) or official field (requires producers and consumers). See https://github.com/google/transit/blob/master/gtfs-realtime/CHANGES.md#experimental-fields.

| **stop_id** | [string](https://developers.google.com/protocol-buffers/docs/proto#scalar) | Conditionally required | One | Must be the same as in stops.txt in the corresponding GTFS feed. Either stop_sequence or stop_id must be provided within a StopTimeUpdate - both fields cannot be empty. If `StopTimeProperties.assigned_stop_id` is populated, it is preferred to omit `stop_id` and use only `stop_sequence`. If `StopTimeProperties.assigned_stop_id` and `stop_id` are populated, `stop_id` must match `assigned_stop_id`. |
| **arrival** | [StopTimeEvent](#message-stoptimeevent) | Conditionally required | One | If schedule_relationship is empty or SCHEDULED, either arrival or departure must be provided within a StopTimeUpdate - both fields cannot be empty. arrival and departure may both be empty when schedule_relationship is SKIPPED. If schedule_relationship is NO_DATA, arrival and departure must be empty. |
| **departure** | [StopTimeEvent](#message-stoptimeevent) | Conditionally required | One | If schedule_relationship is empty or SCHEDULED, either arrival or departure must be provided within a StopTimeUpdate - both fields cannot be empty. arrival and departure may both be empty when schedule_relationship is SKIPPED. If schedule_relationship is NO_DATA, arrival and departure must be empty. |
| **departure_occupancy_status** | [OccupancyStatus](#enum-occupancystatus) | Required | One | The state of passenger occupancy for the vehicle immediately after departure from the given stop. |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can't be Required in the spec docs because of backwards compatibility (it would declare existing implementations without it non-compliant). It should be Optional. It also needs to be labeled as an experimental field (see other experimental fields).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also add that if this field is provided, stop_sequence must be populated, and add "To provide occupancy without providing any real-time arrival or departure predictions, populate this field and set StopTimeUpdate.schedule_relationship = NO_DATA."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be "The predicted state of passenger..."?

Copy link
Contributor

@barbeau barbeau Mar 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, CrowdLevel as proposed in #261 is a better solution than OccupancyStatus. We already know there are problems interpreting OccupancyStatus across multiple producers, and IMHO rather than proliferating those we should switch to CrowdLevel. I'd like to hear from more producers, and consumers, on this topic though.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see CrowdLevel vs OccupancyStatus as a separate issue that should be resolved by #261. I think we can move ahead with predicted occupancy using OccupancyStatus at this time and then add CrowdLevel once #261 is hashed out. I don't see us removing OccupancyStatus as part of #261 so I think it would be acceptable to also have two values for predicted occupancy. At that time I would also like to see occupancy_percentage be discussed as a possible part of predicted occupancy since the enumerated types are likely too limited to be of universal use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amended for the first 3 comments above.

For CrowdLevel vs OccupancyStatus, I'd prefer to separate that out from this and use OccupancyStatus here since that's currently used in VehiclePosition, then update this when/if VP changes.

@mike-swiftly
Copy link

+1

I think the changes to the documentation are good and that other issues. like departure_crowd_level, should be dealt with separately.

@jakehoare
Copy link
Contributor Author

This pull request has been open for more than one week, so per the Official Process I'm calling for a vote.
Also note that this is for adoption as an experimental field.

Vote will be closed on Friday March 26th at 23:59:59 UTC.

You can vote with a +1 (in favor) or -1 (against) in the comments here.

Thanks
Jake

@skinkie
Copy link
Contributor

skinkie commented Mar 17, 2021

+1

@RTL-Gilles-Ferland
Copy link

+1

With departure_crowd_level being added eventually.

@juanborre
Copy link
Contributor

+1 from Transit 👏

@Bertware
Copy link

+1 from Trafiklab/Samtrafiken

@sam-hickey-arcadis
Copy link
Contributor

+1 from IBI Group

And we agree with the comment above from @jakehoare that “the intention here is to use the standard unit of occupancy that applies to the rest of GTFS, whatever that may change to in the future”

| **SCHEDULED** | The vehicle is proceeding in accordance with its static schedule of stops, although not necessarily according to the times of the schedule. This is the **default** behavior. At least one of arrival and departure must be provided. Frequency-based trips (GTFS frequencies.txt with exact_times = 0) should not have a SCHEDULED value and should use UNSCHEDULED instead. |
| **SKIPPED** | The stop is skipped, i.e., the vehicle will not stop at this stop. Arrival and departure are optional. When set `SKIPPED` is not propagated to subsequent stops in the same trip (i.e., the vehicle will stop at subsequent stops in the trip unless those stops also have a `stop_time_update` with `schedule_relationship: SKIPPED`). Delay from a previous stop in the trip *does* propagate over the `SKIPPED` stop. In other words, if a `stop_time_update` with an `arrival` or `departure` prediction is not set for a stop after the `SKIPPED` stop, the prediction upstream of the `SKIPPED` stop will be propagated to the stop after the `SKIPPED` stop and subsequent stops in the trip until a `stop_time_update` for a subsequent stop is provided. |
| **NO_DATA** | No data is given for this stop. It indicates that there is no realtime information available. When set NO_DATA is propagated through subsequent stops so this is the recommended way of specifying from which stop you do not have realtime information. When NO_DATA is set neither arrival nor departure should be supplied. |
| **NO_DATA** | No data is given for this stop. It indicates that there is no realtime information available. When set NO_DATA is propagated through subsequent stops so this is the recommended way of specifying from which stop you do not have realtime timing information. When NO_DATA is set neither arrival nor departure should be supplied. |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the first "realtime" here be "realtime timing"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes true, amended.

@jakehoare
Copy link
Contributor Author

Thank you all for participating! The vote ended on Friday March 26th at 23:59:59 UTC with a unanimous consensus of 6 yes votes from OpenGeo (@skinkie), Swiftly (@mike-swiftly), Transit (@juanborre), RTL (@
RTL-Gilles-Ferland), Trafiklab/Samtrafiken (@Bertware), and IBI Group (@sam-hickey-ibigroup).

As per the specification amendment process, this proposal is now accepted!

@barbeau
Copy link
Contributor

barbeau commented Apr 13, 2021

@jakehoare Looks like there are conflicts with the master branch - could you please resolve those, and then I can merge?

@google-cla
Copy link

google-cla bot commented Apr 13, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added cla: no and removed cla: yes labels Apr 13, 2021
@jakehoare
Copy link
Contributor Author

@googlebot I fixed it.

@google-cla google-cla bot added cla: yes and removed cla: no labels Apr 13, 2021
@jakehoare
Copy link
Contributor Author

@barbeau Ah yes, resolved conflicts now.

@barbeau barbeau merged commit 3fe9939 into google:master Apr 14, 2021
@sccmcca sccmcca added proposal GTFS Realtime Issues and Pull Requests that focus on GTFS Realtime labels May 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

GTFS Realtime Issues and Pull Requests that focus on GTFS Realtime

Projects

None yet

Development

Successfully merging this pull request may close these issues.