Proposal: Support multi-car crowding in GTFS-RT#237

Merged
barbeau merged 15 commits into google:master from lfontolliet:multi-car-crowding
Sep 10, 2020


@lfontolliet lfontolliet commented Jul 27, 2020

Background

GTFS-RT provides a way to surface vehicle crowding data as part of the VehiclePosition message using occupancyStatus. An experimental field occupancy_percentage is also available.

These fields only provide crowding data at the vehicle level. When a vehicle is composed of several carriages (like a subway or train), there is no way to provide details per car. This proposal offers a simple way to provide multi-car crowding data in GTFS-RT. It was prepared with the help of LIRR (as a producer) and MobilityData. Lyft would be the consumer of this data.

Here is a screenshot of our ongoing designs:
[Screenshot: ss 2020-08-13 at 4 38 15 PM]

Proposal

To handle the case of vehicles that provide crowding data at carriage level, we would like to propose the addition of a new repeated multi_carriage_details field in the VehiclePosition message. This field would be based on a CarriageDetails message containing five fields: id, label, occupancy_status, occupancy_percentage and carriage_sequence.

CarriageDetails.occupancy_status would follow the values of the OccupancyStatus enum. CarriageDetails.occupancy_percentage would follow the same rules as those defined for the occupancy_percentage field (see the Special Cases section below for details about unavailable data).

The first occurrence of the repeated field would represent the first car of the vehicle, given the current direction of travel. The number of occurrences of the multi_carriage_details field represents the number of carriages of the vehicle, including the non-boardable ones (engines, carriages closed for maintenance, etc.).

In a given occurrence of the repeated multi_carriage_details field, the CarriageDetails.occupancy_status and CarriageDetails.occupancy_percentage fields are not mutually exclusive.
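As an illustration of the structure described above, here is a minimal Python sketch of the proposed message shape. It is not the protobuf definition itself: the class layout, the string enum values, the example ids and labels, and the choice to default both occupancy fields to their "no data" values are assumptions made for this sketch. Only the field and enum names come from the proposal.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class OccupancyStatus(Enum):
    """Subset of the enum, for illustration only (no wire values implied)."""
    MANY_SEATS_AVAILABLE = "MANY_SEATS_AVAILABLE"
    STANDING_ROOM_ONLY = "STANDING_ROOM_ONLY"
    NOT_BOARDABLE = "NOT_BOARDABLE"            # value proposed in this PR
    DATA_NOT_AVAILABLE = "DATA_NOT_AVAILABLE"  # value proposed in this PR


@dataclass
class CarriageDetails:
    carriage_sequence: int                     # 1 = first carriage in direction of travel
    id: Optional[str] = None
    label: Optional[str] = None
    # Assumption: both occupancy fields default to their "no data" values.
    occupancy_status: OccupancyStatus = OccupancyStatus.DATA_NOT_AVAILABLE
    occupancy_percentage: int = -1


@dataclass
class VehiclePosition:
    # One entry per carriage, including non-boardable ones such as engines.
    multi_carriage_details: List[CarriageDetails] = field(default_factory=list)


# Hypothetical three-carriage train: engine first, then two passenger carriages.
train = VehiclePosition(multi_carriage_details=[
    CarriageDetails(1, id="loco-7", label="Engine",
                    occupancy_status=OccupancyStatus.NOT_BOARDABLE),
    # Status and percentage are not mutually exclusive, so both may be set.
    CarriageDetails(2, id="car-12", label="A",
                    occupancy_status=OccupancyStatus.STANDING_ROOM_ONLY,
                    occupancy_percentage=85),
    CarriageDetails(3, id="car-13", label="B"),  # no crowding data available
])
print(len(train.multi_carriage_details))
```

The length of the list carries the carriage count, which is why the engine and the carriage without data are still listed.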

Special Cases

Case when vehicle contains carriages that can never be boarded (like an Engine)
A new value NOT_BOARDABLE has been defined in the OccupancyStatus enum. This data is useful for users to determine where they need to stand on a platform, or that a given carriage cannot be boarded.

Case when data is unavailable for some carriages of the vehicle
The structure also needs to accurately handle cases where the data is not available for all the carriages of the vehicle. For CarriageDetails.occupancy_status, we propose to use a new DATA_NOT_AVAILABLE value in the OccupancyStatus enum. For CarriageDetails.occupancy_percentage, as we are using an int32, we propose to use -1 to indicate that data is not available for this specific car.

Example for CarriageDetails.occupancy_status: The vehicle is composed of three carriages. Data is available only for the first and third carriages. To handle the "padding" correctly, the repetition of CarriageDetails.occupancy_status would be the following: STANDING_ROOM_ONLY, DATA_NOT_AVAILABLE, MANY_SEATS_AVAILABLE.

Example for CarriageDetails.occupancy_percentage: The vehicle is composed of three cars. Data is available only for the first and third cars. To handle the "padding" correctly, the repetition of CarriageDetails.occupancy_percentage would be the following: 15, -1, 54.

Case when data is unavailable for the last carriages of the vehicle
The repeated multi_carriage_details field should contain a value for all the carriages of the current vehicle.
Example: A vehicle is composed of 4 carriages. The last two carriages don't have crowding data available, due to faulty sensors.
The repetition of CarriageDetails.occupancy_status would be the following: STANDING_ROOM_ONLY, MANY_SEATS_AVAILABLE, DATA_NOT_AVAILABLE, DATA_NOT_AVAILABLE.
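The padding rule in the special cases above can be sketched from a producer's point of view. This is a hedged illustration only: the helper names `pad_statuses` and `pad_percentages`, and the idea of keying sensor readings by 1-based carriage position, are assumptions of this sketch and not part of the proposal.

```python
from typing import Dict, List

NO_DATA_STATUS = "DATA_NOT_AVAILABLE"  # enum value proposed in this PR
NO_DATA_PERCENTAGE = -1                # sentinel proposed in this PR


def pad_statuses(carriage_count: int, readings: Dict[int, str]) -> List[str]:
    """Return one status per carriage, filling gaps with the no-data value.

    `readings` maps a 1-based carriage position to a reported status.
    """
    return [readings.get(pos, NO_DATA_STATUS)
            for pos in range(1, carriage_count + 1)]


def pad_percentages(carriage_count: int, readings: Dict[int, int]) -> List[int]:
    """Return one percentage per carriage, filling gaps with -1."""
    return [readings.get(pos, NO_DATA_PERCENTAGE)
            for pos in range(1, carriage_count + 1)]


# Three carriages, data missing for the middle one.
middle_gap = pad_statuses(3, {1: "STANDING_ROOM_ONLY", 3: "MANY_SEATS_AVAILABLE"})
percent_gap = pad_percentages(3, {1: 15, 3: 54})

# Four carriages, faulty sensors on the last two.
trailing_gap = pad_statuses(4, {1: "STANDING_ROOM_ONLY", 2: "MANY_SEATS_AVAILABLE"})

print(middle_gap)
print(percent_gap)
print(trailing_gap)
```

Padding trailing carriages the same way as middle ones keeps the number of entries equal to the number of carriages, which is what the last special case asks for.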

@google-cla google-cla bot added the cla: yes label Jul 27, 2020

skinkie commented Jul 27, 2020

Shall we start by changing Car to Coach?


gcamp commented Jul 28, 2020

@lfontolliet thanks for the proposition.

A few questions :

@wafisher

We (MTA LIRR) could produce using this as soon as it's ratified. Not sure about the consumer, though.


lfontolliet commented Jul 28, 2020

Thanks for the quick comments.

Shall we start by changing Car to Coach?

@skinkie Yes that's a possibility. Could you describe the advantages you see of using Coach instead of Car?

  • Are you aware of the GTFS-vehicles extension (GTFS-Vehicles (core version) #200)? Do you see any consequences, or things that would be different, if that proposal were accepted?

@gcamp I wasn't aware of it, thanks for pointing it out. From what I understand, the GTFS-vehicles proposal is more related to static vehicle data and would make it possible to surface vehicle amenities. The CarDetails message in this current proposal is useful for data which is specific to Real-Time, like the crowding details (as they vary during a given trip), and wouldn't be meant to convey data such as amenities. We will review the GTFS-vehicles extension in more detail today and provide additional details if we conclude that it would change anything in our proposal.

@gcamp We (Lyft) do not have a producer yet, so we can indeed follow the experimental path.

We (MTA LIRR) could produce using this as soon as it's ratified. Not sure about the consumer, though.

@wafisher We'd be happy to collaborate with you to discuss usage of this field!


skinkie commented Jul 28, 2020

Shall we start by changing Car to Coach?

@skinkie Yes that's a possibility. Could you describe the advantages you see of using Coach instead of Car?

Not inventing terms, keeping it in line with common terms from the UIC.
https://en.wikipedia.org/wiki/UIC_passenger_coach_types


wafisher commented Jul 29, 2020

We're currently using this extension to produce our GTFS-RT, hosted here, in test. This feed is in beta.

Happy to work together to come up with a more official standard that we can use.

@sccmcca sccmcca mentioned this pull request Aug 10, 2020

lfontolliet commented Aug 13, 2020

Hey everyone,

We have been working with LIRR and MobilityData to update the original proposal. LIRR and MobilityData are aligned with the changes being proposed here. The original description of the PR has been modified to reflect the latest changes.

Here are the main takeaways of these new changes:

  • We would use the "carriage" terminology instead of "cars".
  • The carriage_status repeated field now contains two additional optional fields: label and carriage_sequence.
  • A new value NOT_BOARDABLE has been introduced in OccupancyStatus to reflect vehicles or carriages that are not boardable (like an Engine, or a maintenance car for instance).
  • occupancy_percentage and occupancy_status provide default values to avoid values automatically added by protobuf that do not reflect the ground truth.

LIRR would be the producer of this proposal and Lyft would be the consumer.
Thanks to @barbeau and @wafisher for contributing to the changes.

@davidlewis-ito

@lfontolliet this looks like a really valuable addition. I have two minor questions :

i) carriage_sequence: The description suggests that there is no restriction on the values used, i.e. entries of 4, 10 and 105 are valid. I would suggest that using 1, 2 and 3 instead provides a degree of validation (i.e. "1,2,4,5" is missing carriage 3).

ii) Could you clarify the intent of the OccupancyStatus of NOT_ACCEPTING_PASSENGERS? Would this be a carriage that is out of service?

@wafisher

I can answer the second: an example of NOT_ACCEPTING_PASSENGERS is when certain cars which have seats are locked out i.e. deadheading. Crews on our system (and others) can do that when the load is very light and they want to consolidate passengers into fewer cars. In this case, they will lock out a couple cars so pax can't board those cars.

One reason they're allowed to do this is because we have plenty of short platforms in our system. If we run 8-car trains mid-day along certain routes, they'll come along some 6-car platforms. If the crew can lock out those cars, they can guarantee that nobody is in the 2 cars that won't platform. Obviously, though, when it's busier (or at all during COVID), they'll want to keep the entire consist open and make sure pax are in the right car to alight.


barbeau commented Aug 17, 2020

@lfontolliet Thanks for pulling this together!

I believe there are two more points to address to complete the proposal (including the addition of the fields to the reference.md document):

  1. What fields are required/optional - due to protocol buffer quirks, new experimental fields in gtfs-realtime.proto will always be labeled as optional. We need to define field requirements separately in reference.md that are semantic GTFS-realtime spec. requirements. I would expect carriage_sequence to be required, and the rest of the fields to be optional. Any additional thoughts on this?
  2. Update VehiclePosition.OccupancyStatus and VehiclePosition.occupancy_percentage definitions - these fields both need to be updated to reflect how they interact with per-carriage occupancy info. I propose this language addition to both fields, respectively:

If multi_carriage_status is populated with per-carriage occupancy_percentage/OccupancyStatus, then this field should describe the entire vehicle with all carriages accepting passengers considered.
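To make the "all carriages accepting passengers considered" wording concrete, here is one possible consumer-side or producer-side sketch of deriving a vehicle-level percentage from per-carriage values. The thread does not define an aggregation method, so everything here is an assumption: the function name, the equal-capacity simple average, the decision to skip carriages reporting -1, and the choice of which statuses count as "not accepting passengers".

```python
from typing import List, Tuple

# Assumption: carriages with these statuses are left out of the vehicle total.
EXCLUDED_STATUSES = {"NOT_BOARDABLE", "NOT_ACCEPTING_PASSENGERS"}


def vehicle_occupancy_percentage(carriages: List[Tuple[str, int]]) -> int:
    """Average the per-carriage percentages of carriages accepting passengers.

    Each carriage is a (occupancy_status, occupancy_percentage) pair.
    Carriages reporting -1 (no data) are skipped. Returns -1 when nothing
    usable remains. Assumes every carriage has the same capacity.
    """
    usable = [pct for status, pct in carriages
              if status not in EXCLUDED_STATUSES and pct >= 0]
    if not usable:
        return -1
    return round(sum(usable) / len(usable))


example = [
    ("NOT_BOARDABLE", -1),         # engine, ignored
    ("STANDING_ROOM_ONLY", 80),
    ("MANY_SEATS_AVAILABLE", 20),
    ("DATA_NOT_AVAILABLE", -1),    # faulty sensor, ignored
]
print(vehicle_occupancy_percentage(example))
```

A producer that knows real per-carriage capacities would more likely weight by capacity; the simple average is only used to keep the sketch short.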


lfontolliet commented Aug 18, 2020

i) carriage_sequence: The description suggests that there is no restriction on the values used, i.e. entries of 4, 10 and 105 are valid. I would suggest that using 1, 2 and 3 instead provides a degree of validation (i.e. "1,2,4,5" is missing carriage 3).

@davidlewis-ito Yes that's a good point, I have modified the description of the field to start carriages at 1, with an increment of one for each carriage.

@barbeau I have modified the reference.md file and added your comments. Let me know how that looks.


barbeau commented Aug 18, 2020

@lfontolliet Thanks, looks good! I left a few suggestions for formatting, and the experimental message/field labels are needed on everything new. Other than that, though, 👍.

// If the second carriage in the direction of travel has a value of 3,
// consumers will detect an issue with the sequencing.
// Even if protobuf keeps sequencing, it will help with feed validation
optional uint32 carriage_sequence = 4;

How is switch-back / zig-zag / reversal / changing ends / turnround / turning / to horse over envisioned within a single trip?


@skinkie My interpretation is that the sequence and label combinations should match whatever is shown to riders (i.e., the ground truth) for the given carriage configuration at the VehiclePosition.timestamp. So between two VehiclePosition updates, the sequence and label combinations could change, which indicates a change in configuration.

@skinkie skinkie Aug 25, 2020

@barbeau this is not sustainable. This means that every time the train changes direction, it is either forced to become a new trip (which is not enforced in GTFS-static) or all the values in RT arbitrarily need to be reversed. Not to mention the effect on combining vehicles.

@jxeeno jxeeno Aug 26, 2020

Keep in mind a label is something that's customer facing. If carriages have digital outward facing signage, the label may change at the time of train divide / amalgamate / switch-backs.

Could this be modelled by the id field? You'll still need to reverse the carriage sequence at the time of the switch-back, but the consumer will be able to track the switch-back through an idempotent reference to the physical carriage. Consumers can also track carriages for divide and amalgamate operations.

It doesn't solve conveying these operations ahead of time, but at least it can be modelled and you'll know after it happens.


@jxeeno Yes, if id persists across sequence changes, you should be able to detect these types of changes in real-time. So adding the id field seems like a reasonable approach to me.
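The id-based tracking discussed here could look roughly like the following consumer-side sketch. It assumes that `id` stays stable for a physical carriage across VehiclePosition updates, which is the premise of the suggestion above. The function name and the returned labels are hypothetical, not part of the proposal.

```python
from typing import List


def classify_change(previous_ids: List[str], current_ids: List[str]) -> str:
    """Compare carriage ids (in carriage_sequence order) from two updates."""
    if current_ids == previous_ids:
        return "unchanged"
    if current_ids == previous_ids[::-1]:
        return "reversed"      # same carriages, opposite order (switch-back)
    if set(current_ids) < set(previous_ids):
        return "divided"       # some carriages left the consist
    if set(current_ids) > set(previous_ids):
        return "amalgamated"   # carriages were added to the consist
    return "reconfigured"      # anything else, e.g. carriages swapped out


print(classify_change(["a", "b", "c"], ["c", "b", "a"]))
print(classify_change(["a", "b", "c", "d"], ["a", "b"]))
print(classify_change(["a", "b"], ["a", "b", "c"]))
```

As noted in the comment above, this only detects a change after it has happened; it does not convey an upcoming divide, amalgamate, or reversal.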




We need to consider and handle the following case.
Car ordering (car position) is often fixed regardless of the direction of travel.
For many trains and train operators, the train consist generally remains the same and stays in the same order, with the car adjacent to the locomotive always termed Car #1, the next Car #2, then Car #3, and so on.
So the reporting of Car #1 may need to remain the same, particularly for the AMTRAK Capitol Corridor (CCJPA). In some other transit systems, this may not hold.
Even though the train may travel in either direction, the locomotive is often always on the same end, either pulling the coaches or pushing them from behind.
Car 1 is always adjacent to the locomotive, regardless of the direction of travel.

For multi-car occupancy, “the order of car numbering may not be flipped every time the train's direction of travel changes”.
The carriage sequence is always going to be 1, 2, 3, 4, 5.
The statement that the first occurrence of the repeated field represents the first car of the vehicle, given the direction of travel, is not always true in rail operations and the passenger rail industry, especially in the case of AMTRAK Capitol Corridor. This needs to be accounted for in order to preserve the operator's view.
There needs to be a special case for this consideration.


@harshverma2000 I think you're referring to how the operator numbers carriages internally? The internal numbering doesn't need to match the "external" number following the rules defined here in GTFS-realtime - the operator would just need to keep a mapping between the two internally to ensure the right numbers are published to GTFS-realtime. There are certainly cases that the current GTFS-realtime model doesn't address, but IMHO it seems adequate for the purpose of showing occupancy per-carriage to the end user. @lfontolliet and @wafisher (and others) can certainly weigh in if they see any obstacles so far in implementation.


Agreed with @barbeau. The operator would need to know the sequence of carriages for a given segment of a trip, but keeping a mapping between the internal identifiers and their external-facing sequence seems easier to handle in the operator's systems than surfacing complex logic in the standard.


Thanks @barbeau and @lfontolliet for your thoughts and comments.
The idea of keeping a mapping between the internal identifier and their external facing sequence to handle the operator's systems is excellent.
I believe operators could do that. What could be important is that, to understand occupancy by carriage position, end users would need to know the direction of travel in order to understand the mapping. This point has to be documented so that operators and end users share the same understanding. A direction-of-travel field could help end users work out this mapping...
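The internal-to-external mapping discussed in this thread can be sketched as follows for an operator whose car numbers are fixed relative to the locomotive. This is an illustration under stated assumptions: the function name, the `locomotive_leading` flag, and the input convention (labels listed starting from the locomotive end) are hypothetical and would live in the operator's own systems, not in the feed.

```python
from typing import List, Tuple


def external_sequence(consist_from_locomotive: List[str],
                      locomotive_leading: bool) -> List[Tuple[int, str]]:
    """Map operator labels to carriage_sequence values for publication.

    `consist_from_locomotive` lists labels starting at the locomotive end,
    which is how the operator numbers cars internally. The result pairs each
    label with a carriage_sequence that starts at 1 for the first carriage
    in the current direction of travel.
    """
    ordered = (list(consist_from_locomotive) if locomotive_leading
               else list(reversed(consist_from_locomotive)))
    return list(enumerate(ordered, start=1))


consist = ["Locomotive", "Car 1", "Car 2", "Car 3"]

pulling = external_sequence(consist, locomotive_leading=True)
pushing = external_sequence(consist, locomotive_leading=False)
print(pulling)
print(pushing)
```

The rider-facing label ("Car 1") stays attached to the same physical car in both directions; only the published carriage_sequence flips.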

Co-authored-by: Sean Barbeau <sjbarbeau@gmail.com>
@google-cla google-cla bot added the cla: yes label Aug 25, 2020

barbeau commented Aug 27, 2020

@lfontolliet Is it possible to resolve the remaining changes discussed above soon so we can call a vote for adding as an experimental field early next week?

lfontolliet and others added 3 commits August 27, 2020 11:12
Co-authored-by: Sean Barbeau <sjbarbeau@gmail.com>
Co-authored-by: Sean Barbeau <sjbarbeau@gmail.com>
Co-authored-by: Sean Barbeau <sjbarbeau@gmail.com>

lfontolliet commented Aug 27, 2020

@lfontolliet Is it possible to resolve the remaining changes discussed above soon so we can call a vote for adding as an experimental field early next week?

@barbeau Yes, reviewing the current comments/proposals right now and will reply before EOD.

@lfontolliet

Summary of latest changes:

  • Renamed the protobuf message into CarriageDetails. Renamed the field into multi_carriage_details.
  • Added an optional id field in the CarriageDetails message.

@lfontolliet

Hello all,

This pull request has been open for several weeks, so per the Official Process I'm calling for a vote.
Vote will be closed on Monday September 7th 2020 at 23:59:59 UTC.

Thanks,
Loïc

@wafisher

+1 from the LIRR


skinkie commented Aug 31, 2020

-1 I still see too many issues regarding normal operations such as reversing direction or splitting which have not been addressed. I think this producer (alone) is not a suitable candidate. I would suggest asking Deutsche Bahn for a review here.


jxeeno commented Aug 31, 2020

-1 I still see too many issues regarding normal operations such as reversing direction or splitting which have not been addressed. I think this producer (alone) is not a suitable candidate. I would suggest asking Deutsche Bahn for a review here.

I agree it doesn't fully handle all cases of changes to carriages, but I don't think it should have to. This is the vehicle position feed, so should be describing the current state of the vehicle. Current vehicle state shouldn't care about a future vehicle state as a result of an operation like divide, amalgamate or reversals.

imo, what @Dave-TfNSW suggested in #240 (comment) + this PR could be a way for operators to convey divide, amalgamate and reversals by providing departure and arrival occupancy at a per stop time level.


jxeeno commented Aug 31, 2020

+1 from AnyTrip (consumer)


skinkie commented Aug 31, 2020

I agree it doesn't fully handle all cases of changes to carriages, but I don't think it should have to. This is the vehicle position feed, so should be describing the current state of the vehicle. Current vehicle state shouldn't care about a future vehicle state as a result of an operation like divide, amalgamate or reversals.

I respectfully disagree. Since this will eventually be used to track the occupancy of trains, and conveniently people are going to use the sequence which at some point does not make sense anymore. I recall that CityMapper used sequence to suggest where to board for the best place to transfer later. Also for this behavior you need proper scenarios to follow up historic state. The main problem with trains is that it is not a vehicle but a composition of formations. Pushing it back to some trivial examples that work and standardise that is not in our best interest, hence a proper review from DB, SNCF aka any serious European Rail Agency would probably benefit us all, to prevent reinventing the wheel. (Yes, this is what happens a lot lately here...)


stotala commented Sep 1, 2020

+1 from Google

@davidlewis-ito

+1 from Ito World

@juanborre juanborre left a comment

+1 for Transit.

I added a comment to clarify some behaviour. Hopefully it is so small that it does not require restarting the vote 😅

Comment on lines +442 to +444
// For example, the first carriage in the direction of travel has a value of 1.
// If the second carriage in the direction of travel has a value of 3,
// consumers will detect an issue with the sequencing.

This behaviour could be more clear.

Does this mean that if a sequence is missing the consumer must drop the CarriageDetails, the vehiclePosition or the whole GTFS-rt?
Can it be that if a sequence is missing we should assume NO_DATA_AVAILABLE for that carriage?


Good catch. I think an invalid sequence should result in dropping the multi-carriage data as it indicates corruption at some level. So for example:

// If the second carriage in the direction of travel has a value of 3,
// consumers will discard data for all carriages (i.e., the multi_carriage_details field).


That makes sense: since we are enforcing the inclusion of all carriages, a missing sequence must mean an error somewhere 👍


And, it's probably worth adding:

// Carriages without data must be represented with a valid carriage_sequence number and the fields without data should be omitted (alternately, those fields could also be included and set to the "no data" values).
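The consumer behaviour agreed in this review thread (discard all per-carriage data when the sequence is invalid, but keep carriages that merely lack data) might be sketched like this. The function names and the use of plain dicts are assumptions of this sketch. It also assumes the entries arrive in direction-of-travel order, so a valid list is exactly 1, 2, ..., n.

```python
from typing import Dict, List


def sequence_is_valid(sequences: List[object]) -> bool:
    """True when the values are exactly 1, 2, ..., n in feed order."""
    return sequences == list(range(1, len(sequences) + 1))


def accept_multi_carriage_details(details: List[Dict]) -> List[Dict]:
    """Return the carriage list unchanged, or an empty list if it is invalid.

    An invalid or missing carriage_sequence is treated as a sign of
    corruption, so data for all carriages is discarded, not just one entry.
    """
    sequences = [d.get("carriage_sequence") for d in details]
    return details if sequence_is_valid(sequences) else []


good = [
    {"carriage_sequence": 1, "occupancy_percentage": 15},
    {"carriage_sequence": 2},  # no data, but still listed with a valid sequence
    {"carriage_sequence": 3, "occupancy_percentage": 54},
]
bad = [
    {"carriage_sequence": 1, "occupancy_percentage": 15},
    {"carriage_sequence": 3, "occupancy_percentage": 54},  # carriage 2 is missing
]

print(len(accept_multi_carriage_details(good)))
print(len(accept_multi_carriage_details(bad)))
```

Only the multi-carriage data is dropped in this sketch; the rest of the VehiclePosition is left for the consumer to use as before.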

@lfontolliet

The vote is now closed. Here is the tally.

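Yes (+1)

  • LIRR (@wafisher)
  • AnyTrip (@jxeeno)
  • Google (@stotala)
  • Ito World (@davidlewis-ito)
  • Transit (@juanborre)

No (-1)

  • @skinkie

Result: 83.3% Yes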

We need at least 80% yes for experimental GTFS-RT features per the official process, so this proposal passes.

Note: I will modify the comment describing the carriage_sequence field as suggested by @juanborre and @barbeau in this GitHub comment, unless someone wants to change their vote if this change in the comment is made.

@harshverma2000

So, I was away for a while and just checking it out. The CarriageDetails field could include a DirectionOfTravel and this would address the requirements for operators like AMTRAK/ Capitol Corridor and some other transit agencies as well!
