GTFS-Vehicles (core version)#200

Closed
timMillet wants to merge 1 commit into google:master from MobilityData:gtfs-vehicles
Conversation

@timMillet
Contributor

Context

Currently, the GTFS specification does not provide any vehicle information besides routes.route_type, trips.wheelchair_accessible, and trips.bike_allowed. However, most transit agencies have data on vehicles they operate, which could enhance passenger information provided in GTFS format.

The following needs could be addressed with vehicle data:

  • Suggesting to somebody in a wheelchair where to board the vehicle. Refining accessibility information per carriage or door is the last key piece of information needed to generate exhaustive step-free trip plans (pathways.txt already provides step-free trip plans through stations only).
  • Suggesting to anybody where to board the vehicle according to vehicle features or station components at destination. Looking for optimal vehicle boarding is a common behavior of subway and train riders. Optimal vehicle boarding is sometimes displayed on station posters or screens, so GTFS data should represent the same information.
  • Informing (more precisely) about vehicle crowdedness. The GTFS real-time specification supports crowdedness, but the field OccupancyStatus in VehiclePosition is still experimental, and the scale of crowdedness is relatively subjective.
  • Informing about vehicle features and amenities. Wheelchair spots, bike racks, AC, wifi, carriage class, dining availability, and other information may be useful in certain circumstances for making travel decisions.

Pull Request

This PR is a core version of a broader GTFS-Vehicles extension proposal.
The goal of the current PR is:

  • Describing vehicles and their composition (e.g: a subway train composed of 7 carriages) with vehicle_categories.txt and vehicle_couplings.txt;
  • Assigning one or more vehicles to routes or trips, using routes.txt, trips.txt, and/or vehicle_allocations.txt;
  • Providing information on optimal vehicle boarding, within vehicle_boardings.txt.
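
As a rough illustration of the first goal, the composition of a vehicle could be read as rows that link a parent category to its child categories. This is only a sketch and not the proposed schema: the column names `parent_id`, `child_id`, and `child_count`, the category identifiers, and the helper `load_couplings` are all assumptions made for this example.

```python
import csv
import io

# Hypothetical contents of vehicle_couplings.txt. The column names and the
# category identifiers are assumptions, not part of the proposal.
COUPLINGS_TXT = """parent_id,child_id,child_count
subway_train,subway_carriage,7
"""


def load_couplings(text):
    """Return {parent_id: [(child_id, count), ...]} parsed from CSV text."""
    couplings = {}
    for row in csv.DictReader(io.StringIO(text)):
        couplings.setdefault(row["parent_id"], []).append(
            (row["child_id"], int(row["child_count"]))
        )
    return couplings


couplings = load_couplings(COUPLINGS_TXT)
```

Here `couplings` maps the subway train from the example above to its 7 carriages.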

This PR suggests the extension or the addition of:

  • routes.txt (1 new field);
  • trips.txt (1 new field);
  • vehicle_categories.txt (new file);
  • vehicle_allocations.txt (new file);
  • vehicle_couplings.txt (new file);
  • vehicle_boardings.txt (new file).

Compared to GTFS-Vehicles full version (still at the discussion stage), this PR does not include, mainly:

  • Vehicle features and amenities in vehicle_categories.txt;
  • Door information in vehicle_doors.txt and vehicle_boardings.txt;
  • Vehicle allocation and orientation in stop_times.txt;
  • Crowdedness information in message OccupancyDescriptor.

Links

GTFS-Vehicles (Core version) - this PR: http://bit.ly/gtfs-vehicles-core
GTFS-Vehicles (Full version): http://bit.ly/gtfs-vehicles

2 extended files and 4 new files
@skinkie
Contributor

skinkie commented Feb 11, 2020

-1 overly complex
-1 travel information and accessibility is currently being facilitated via denormalised fields
-1 this is not about vehicles (a thing with a license plate as used in GTFS-RT), but about vehicle_types, why use non-standard wording?
-1 wording is ambiguous, and not following any common practice, for example what is a "child vehicle"?

@mgilligan

-1

Similar to @skinkie, I find the terminology in the pull request bizarre and was surprised this didn't include an actual vehicle inventory. Typically a coupling of rail cars/carriages is referred to as a consist, not parent/child.

@timMillet
Contributor Author

Thank you for your feedback! In order to ease the production and consumption of GTFS-Vehicles data, we are currently working on Best Practices and dataset examples that will be available soon. Some answers to the concerns that were expressed:

Different files

This proposal differentiates the data producers’ needs for vehicle description into 4 distinct files, so that every data producer can provide only the files that suit its needs:

  • To describe vehicle categories only, use vehicle_categories.txt.
  • To describe vehicle categories and their composition (e.g: a coupled train composed of 2 trains, like this SG3 from Rotterdam Metro), use vehicle_categories.txt along with vehicle_couplings.txt.
  • To describe vehicle categories and their assignment (e.g., SG2, SG2/1, SG3, and HSG3 may be operated on Rotterdam Metro’s line B), use vehicle_categories.txt along with:
    • For route assignments, the extended routes.txt (or vehicle_allocations.txt if two or more vehicle categories are assigned to a single route);
    • For trip assignments, the extended trips.txt.
  • To describe vehicle boarding, linking vehicle categories with boarding areas, use vehicle_categories.txt along with vehicle_boardings.txt.

Vehicle category

The reasons behind the term “vehicle category”, with a vehicle_category_id, were:

  • A vehicle_id would have been too similar to the id field from the message VehicleDescriptor in the GTFS-rt specification.
  • We thought “vehicle type” was heavily related to routes.route_type: a “vehicle type” would be a streetcar, a metro, a train, etc. This definition would be too restricted for the needs of this proposal.
  • We needed a term that simultaneously represents “vehicle types” (e.g: streetcar, metro, train), vehicles according to their nesting level (e.g: coupled train, train, carriage), and vehicles depending on their different generations (e.g: SG2, SG2/1, SG3, HSG3). The term “rolling stock” could have been interesting, but can we say “rolling” for ferries or aerial lifts?

Parent/Child

Different terms were needed to describe vehicle categories according to their nesting levels. We chose not to refer to the train terminology and use generic terms instead so that, if needed, all vehicle categories can be coupled. We are cautious, mostly because the GTFS spec must be backward compatible.

The definition of parent and child vehicles could be added within the specification of vehicle_couplings.txt.

Vehicle inventory

This PR is a core version of GTFS-Vehicles (full version), which includes vehicle features and amenities. This means that the PR is compatible with a vehicle inventory, which would be provided within vehicle_categories.txt.

As you may see in the full version, discussions are still ongoing about which features and amenities to include. However, the sections on vehicle coupling and boarding were less of an issue, hence this PR. The same thing happened with GTFS-Pathways: a subset was adopted, and the full proposal is still on the table.

@skinkie
Contributor

skinkie commented Feb 13, 2020

We thought “vehicle type” was heavily related to routes.route_type: a “vehicle type” would be a streetcar, a metro, a train, etc.

I don't know who is "we" in this case. But if you could inform them that when mentioning a streetcar, metro, train they are talking about modes of transport, which we have standardised in GTFS as route_type. A vehicle type in a professional transit environment refers to the pink metro and the purple bus, or more specifically could aggregate trains consisting of multiple vehicle types in sequence.

We chose not to refer to the train terminology

Indeed, your thinktank tries to reinvent the wheel, introducing new terminology and new data models, while not improving or implementing best practices on the same topic. http://www.transmodel-cen.eu/category/tutorials/

same thing happened with GTFS-Pathways

Unrelated, pathways are directly affecting journey planning. Vehicle Types are merely properties.

@timMillet
Contributor Author

timMillet commented Feb 26, 2020

Hi everybody,

I see that there are some concerns around the terminology used in this PR. To address these concerns, I would be thankful if you could offer your constructive feedback on the 3 questions below. It would be great to have different points of view from several data producers/consumers. Thank you!

“Vehicles” versus “vehicle types” versus “vehicle categories”

In this PR, we do not want to describe vehicles one by one, but we would like to group them into types/categories based on their attributes.

Currently in the GTFS spec:

  • Vehicles: vehicle is already defined in GTFS-realtime, within the VehicleDescriptor message. It means a specific vehicle.
  • Vehicle types: _type fields already exist, with routes.route_type and stops.location_type. Both fields are enums. No vehicle_type fields are defined in the official specification.
  • Vehicle categories: _category fields do not exist. Consequently, no vehicle_category fields are defined in the official specification.

Question 1: Should we use the term vehicle type or vehicle category?

Option 1A:

  • _type fields are dedicated to enums: predefined typologies, with restricted possibilities (4 choices for location_type; 10 for route_type, and stakeholders agreed there shouldn’t be many route types). The data producers cannot freely define the types.
  • _category fields are dedicated to fields for which the data producers can establish their own typology, with unlimited possibilities/categories. The data producers can freely define the different categories. Other _category fields could exist in the future, following this logic.

In this proposal, the vehicle description is not based on an enum: the data producers can specify as many categories as required (trams, train sets, coupled trains, carriages, buses, boats, aerial lifts, etc). Following this choice, the term of vehicle category will remain in this PR.

Option 1B:

  • The _type fields are both for restricted and free typologies. It combines enums with restricted possibilities and other typologies with unlimited possibilities/categories. The data producers can sometimes define their own categories, sometimes not.

Following this choice, the term of vehicle category in this PR will be replaced by vehicle type.

"Train/carriage" versus "parent/child"

In this PR, we would like to describe that several vehicle types/categories may be coupled to form a composed vehicle type/category.

Currently in the GTFS spec:

  • Train is not specifically defined in the spec, but the routes.route_type=2 Rail can be considered as a synonym. Tram, Streetcar, Light rail (routes.route_type=0), Subway, Metro (routes.route_type=1), Cable tram (routes.route_type=5), and Monorail (routes.route_type=12), can be considered at different levels as related to trains (they are different rail technologies).
  • Carriage is not defined in the spec.
  • Parent is a concept already present in the spec, with the field stops.parent_station that defines hierarchy between different locations.
  • Child is a concept already present in the spec, without any specific field: it is understood as stops/platforms with a parent station defined in stops.parent_station. The term Child is explicitly mentioned within the descriptions of stops.wheelchair_boarding, transfers.from_stop_id, and transfers.to_stop_id.

Currently in operation:

  • Vehicle couplings, with vehicles other than trains or related:
    • Bus (routes.route_type=3) with trailer operated by MVG in Munich, DE (Wikipedia: München).
    • Aerial lift (routes.route_type=6) with 4 to 5 modules in a row, in Grenoble, FR (Wikipedia: Grenoble).
  • Vehicle coupling, with 3 levels of vehicle nesting:
    • Coupled TGV train (routes.route_type=2) in France, a coupled train composed of 2 train sets, each train set composed of 8 passenger carriages and 2 locomotives (Wikipedia: TGV).

Question 2: Should we use the terms train/carriage or parent/child?

Option 2A:

  • The concepts of parent and child are dedicated to define hierarchy (between physical objects) throughout the GTFS spec, like the field stops.parent_station is already doing.

Following this choice, the terms of parent and child will remain in this PR.

Option 2B:

  • The concepts of parent and child are dedicated to define hierarchy for stops, and maybe other things in the future.
  • To specifically define hierarchy between vehicle types/categories, the terms train and carriage are used.

Following this choice, the terms of parent and child in this PR will be respectively replaced by train and carriage. The bus with trailer operated by MVG would be considered as a train (bus train?) composed of 2 carriages (1 standard bus, 1 bus trailer). The aerial lift in Grenoble would be considered as a train (aerial train?) composed of 4 to 5 carriages (4 to 5 gondolas).

Question 3: Should we allow 3 nesting levels for vehicle coupling?

Option 3A:

  • 3 nesting levels are allowed, and:
    • If option 2A: the terms grandparent/grandchild are also used to describe this hierarchy (full hierarchy: child/parent/grandparent or grandchild/child/parent)
    • If option 2B: the term coupled train is also used to describe this hierarchy (full hierarchy: carriage/train/coupled train)

Following this choice, the 3 nesting levels as proposed in this PR will remain. The coupled TGV train would be defined in the spec as a grandparent vehicle (or a coupled train) composed of 2 parent vehicles (or 2 trains), each parent vehicle (or each train) composed of 8 child vehicles (or 8 carriages).

Option 3B:

  • Only 2 nesting levels are allowed. Their naming follows option 2A or 2B.

Following this choice, the 3 nesting levels in this PR will be replaced by 2 nesting levels. In a dataset, the coupled TGV train would be defined separately from the regular TGV train set: the coupled TGV train would be a parent vehicle (or a train) composed of 2 child vehicles (or 2 carriages); and the regular TGV train set would be a parent vehicle (or a train) composed of 8 child vehicles (or 8 carriages).
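
To make the difference between options 3A and 3B concrete, here is a minimal sketch that counts the units of the coupled TGV when the hierarchy is followed for two levels (3A) or only one (3B). The identifiers `coupled_tgv`, `tgv_set`, and `carriage`, the `COUPLINGS` table, and the `expand` helper are hypothetical names for illustration; the counts follow the 2 x 8 description above and leave the locomotives out.

```python
# Hypothetical coupling table: parent -> [(child, count), ...].
COUPLINGS = {
    "coupled_tgv": [("tgv_set", 2)],
    "tgv_set": [("carriage", 8)],
}


def expand(couplings, vehicle, levels):
    """Count the units reached after descending at most `levels` couplings."""
    children = couplings.get(vehicle)
    if levels == 0 or not children:
        # Either the allowed depth is exhausted or this vehicle has no children.
        return 1
    return sum(
        count * expand(couplings, child, levels - 1)
        for child, count in children
    )


# Option 3A: the hierarchy is followed through both couplings.
carriages_3a = expand(COUPLINGS, "coupled_tgv", 2)

# Option 3B: each vehicle is expanded one level only.
units_3b_coupled = expand(COUPLINGS, "coupled_tgv", 1)
units_3b_set = expand(COUPLINGS, "tgv_set", 1)
```

With three levels the coupled train resolves to 16 carriages; with two levels it resolves to 2 child vehicles, and the train set is described separately with its 8 carriages.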

@skinkie
Contributor

skinkie commented Feb 26, 2020

@timMillet @LeoFrachet

First of all, to me it feels that your organisation is more and more acting as some sort of lobbying organisation. Lobbying has no place in a standardisation process or ecosystem. It is counterproductive and it does not steer the group in a good direction. Especially since several people have already mentioned that what you are trying to standardise has been solved in the field with different wording. Tim's reply again suggests that there is some sort of silent agreement that this should be ignored.

  • Vehicle types: _type fields already exist, with routes.route_type and stops.location_type. Both fields are enums. No vehicle_type fields are defined in the official specification.

Nowhere in the specification is it mentioned that a type must be an enumeration. Assuming this is a fallacy. I can give a clear example where direction_id is not an id type, but a 0/1. In the same fashion monday-friday have been retyped to enum, while in my perspective this always has been a boolean field. Hence there are no strict definitions of suffixes, and they are neither consistently used nor enforced.

Question 1: Should we use the term vehicle type or vehicle category?

Option 1A:

The comments of existing fields are unrelated to a new field. We have not agreed that the _type suffix is limited to an enumeration, neither that an enumeration must have a _type suffix.

  • _category fields are dedicated to fields for which the data producers can establish their own typology, with unlimited possibilities/categories. The data producers can freely define the different categories. Other _category fields could exist in the future, following this logic.

I cannot find any specification in which the suffix _category has been defined in such a broad way. Nor is _category mentioned anywhere in the specification https://gtfs.org/reference/static

Following this choice, the terms of parent and child in this PR will be respectively replaced by train and carriage. The bus with trailer operated by MVG would be considered as a train (bus train?) composed of 2 carriages (1 standard bus, 1 bus trailer). The aerial lift in Grenoble would be considered as a train (aerial train?) composed of 4 to 5 carriages (4 to 5 gondolas).

Again, this is an invented wording that does not match the suggestions and common practices. A coupled bus, tram or aerial lift is not a train.

Question 3: Should we allow 3 nesting levels for vehicle coupling?

Let's first define what exactly we want to model. Is it:
A: the physical appearance of a vehicle, as in a subtype of a vehicle type, so a traveler can recognise the vehicle.
B: a facility that would allow modelling operational processes such as portion working, and their relationship within GTFS as passenger information.

@LeoFrachet
Contributor

If I understand correctly, the two main points being discussed on this pull request are about naming (vehicle_category vs vehicle_type and parent & child vs train & carriage).

I see it as a good sign that the two remaining issues in the proposal are about naming. It shows that there is a consensus on the modeling and on the definitions.

If the naming that MobilityData chose (namely me, @LeoFrachet, since you were asking) has flaws which can be fixed, let's do it. As you know I'm not a native anglophone and my English isn't perfect.

As Tim explained:

  • I chose "categories" since they are user defined (aka not an enum defined in the spec), and I was afraid calling them vehicle_type was going to be hinting toward spec-defined categories.
  • I chose "parent" and "child" instead of "carriage" and "train" to be generic and to avoid confusing producers by calling buses "trains".

We are paid (by our members) to draft specifications, to have opinions and to defend them, but not to vote. So let's ask the community what is clearer for them. Our goal is to make the spec as clear as possible, and if vehicle_type is clearer for the community, then it's the right choice and we'll be happy to embrace it.

So, community, let's do an informal vote, to help us choose the clearer option for those two questions.

@LeoFrachet
Contributor

LeoFrachet commented Mar 9, 2020

GTFS Community, please give us your preferred option on the two following questions (it’s an informal vote, outside of The Official Process):

Question A: To describe groups of vehicles that share the same features (e.g. number of doors, seats, bike spots), and which are defined by each GTFS producer, would you use the term:

  • A1: “vehicle categories”; or
  • A2: “vehicle types”?

MobilityData opinion: A1, since “..._type” fields have been enums in the past in GTFS, so “..._categories” may make it clearer that they depend on the producer, but we can also extend the concept of “..._type” if you do not see it as confusing.

Question B: To describe vehicle coupling (e.g. 1 train made of 7 carriages; 1 bus with trailer made of 1 standard bus and 1 separated trailer; 1 gondola set made of 5 gondolas in a row), would you:

  • B1: name the whole thing “parent vehicle” and each unit composing it “child vehicle”; or
  • B2: name the whole thing “train” and each unit composing it “carriage”?

MobilityData opinion: B1, since parent/child is more generic and is less confusing when the vehicle isn’t on a track (articulated buses referred to as “train” could be confusing), but if train/carriage is clearer for you and less confusing, let’s do it.

Pinging:

@skinkie
Contributor

skinkie commented Mar 9, 2020

We are paid (by our members) to draft specifications, to have opinions and to defend them, but not to vote. So let's ask the community what is clearer for them.

For me this is exactly the problem @LeoFrachet. Don't ask the community but your members to step into the discussion. Don't ask for their money but for their participation in the process, instead of some capital they surely can miss anyway. The community is investing a significant amount of unpaid time; the least you could make your members do is to actively participate, not in the voting process, but in the drafting process and discussions such as these.

@skinkie
Contributor

skinkie commented Mar 9, 2020

  • B1: name the whole thing “parent vehicle” and each unit composing it “child vehicle”; or
  • B2: name the whole thing “train” and each unit composing it “carriage”?

We need to go one step back for the naming. What does this proposal want to achieve in the future? Is this proposal only a way to identify a formation of carriages, with an order and subelements? For example, to model a combination of locomotive, bistro, and passenger carriages. This would also be applicable to a tram or an articulated bus, to describe the number of seats in certain parts.

Or is the wish to model a specific transit thing (called a JourneyPart) where a certain part of a vehicle is operating on a trip, but not the full trip, where passengers sometimes can move freely, but must have an exact reference at uncoupling in a future moment in time?

For me the above subjects are extremely closely related, especially since I made the mistake to actually implement another transit standard called RailML.

@flocsy
Contributor

flocsy commented Mar 10, 2020

A: for sure not A2, but maybe even A1 is confusing. We can have multiple "categories" (in your example: cat1: number of doors, cat2: number of seats, cat3: number of bike spots). In my view these are "dimensions" and each vehicle has a value in each dimension: cat1: 3 doors, cat2: 60 seats, cat3: 6 bike spots

B: B1

@Bertware

Bertware commented Mar 18, 2020

Question 1:

(For me) a vehicle type is a train, a bus, ... as used by route_type and location_type, which means vehicle_type (1A) would be confusing. But the use cases for this data wouldn't be to group all vehicles with 3 doors or 50 seats, but for example to differentiate between non-articulated and articulated buses, or maybe more specifically between each type of bus (let's say a Volvo B1, B2 or a Scania B5; I know nothing about bus types), or between different train types, for example an M4, M5, or M6 carriage. In this case you'd be specifying a vehicle (type), which leads to 1A, even though it'd be confusing. A vehicle (without the type) would imply one specific vehicle with a certain number plate or id, so that isn't an option either.

Question 2:

I agree with @skinkie that question 2 needs to be more clear as to what the exact goal is.

2A: There is a train which splits up and joins another train later on. Are there 2 "parent vehicles" which are splitting, of which one is joining a 3rd parent vehicle later? This would mean a parent can have another parent? Or is it one parent vehicle splitting off a child vehicle, but would that mean that there would be a "child vehicle" travelling alone for a part of its journey?

2B: Having a "train" as "root element" would be confusing, as it would also be used for trams and other means of transport.

Question 3:

3B, 2 levels. A formation of two trains, 8 carriages each, would be 16 carriages where you cannot pass from carriage 8 to carriage 9 (assuming you can enter the steering carriages). 3A becomes complicated when the two trains would split (do the new trains get a grandparent?), whereas in 3B one formation would simply become two formations.

@botanize
Contributor

Maybe I'm missing something, but I don't see the linkage between vehicle category and vehicle occupancy as mentioned in the needs that would be addressed. I had expected that the information in GTFS-realtime VehiclePositions would be linked to information about the features of a vehicle, but I don't see how you can connect the vehicle.id field in VehiclePositions to vehicle_category, unless you provide yet another table that links vehicle_id to vehicle_category.
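
The extra link table mentioned above could look roughly like the following sketch. It is not part of the proposal: the mapping `VEHICLE_TO_CATEGORY`, the capacity table `CATEGORY_SEATS`, the helper `seats_for`, and all sample identifiers and values are assumptions used only to show the two-step lookup from a GTFS-realtime `vehicle.id` to a category attribute.

```python
# Hypothetical link table: GTFS-realtime vehicle.id -> vehicle_category_id.
VEHICLE_TO_CATEGORY = {
    "bus_1042": "articulated_bus",
    "bus_2001": "standard_bus",
}

# Hypothetical per-category attribute (seat counts are made-up sample values).
CATEGORY_SEATS = {
    "articulated_bus": 60,
    "standard_bus": 38,
}


def seats_for(vehicle_id):
    """Resolve a realtime vehicle id to its category's seat count, or None."""
    category = VEHICLE_TO_CATEGORY.get(vehicle_id)
    return CATEGORY_SEATS.get(category)
```

An unknown vehicle id simply yields `None`, which is where a consumer would have to fall back to the subjective occupancy scale.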

Why not just provide a vehicles table instead of a vehicle category table? At Metro Transit it would be easier for us to provide a table of vehicles than vehicle categories (our information about vehicles is at the individual vehicle level; we sometimes modify seating in vehicles, but not for the entire series of that vehicle) and it would still result in less duplication than something like trip_headsign in the trips table.

Is there not even a strong consensus on a few vehicle attributes that could be included in this pull request (seated capacity, standing_capacity), to motivate its use?

Finally, we have a single line (commuter rail) that is restricted to a single vehicle type on any single route or trip. Although we are often consistent in the type of bus used on a trip, it's not a rule, and it does change. So all of the GTFS-Vehicles extensions to static files are pretty much useless for our service. But we do want a standard way of sharing vehicle information with other consumers for things like occupancy status, and realtime departure and occupancy prediction generation.

For me, the minimal initial implementation would be a vehicles.txt file with vehicle_id as the primary key, and a few informational fields like seated_capacity, standing_capacity, low_floor, wheelchair_access, and wheelchair_capacity. This would be immediately useful and you could work out how to model consists later.
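
A minimal sketch of what reading such a file might look like, assuming the field names listed above. The sample row, the 0/1 encoding of `low_floor` and `wheelchair_access`, and the helper `load_vehicles` are assumptions for illustration, not an agreed format.

```python
import csv
import io

# Hypothetical vehicles.txt content; the values and the 0/1 flag encoding
# are assumptions made for this example.
VEHICLES_TXT = """vehicle_id,seated_capacity,standing_capacity,low_floor,wheelchair_access,wheelchair_capacity
1042,38,22,1,1,2
"""

INT_FIELDS = ("seated_capacity", "standing_capacity", "wheelchair_capacity")
FLAG_FIELDS = ("low_floor", "wheelchair_access")


def load_vehicles(text):
    """Return {vehicle_id: attributes} with integers and booleans parsed."""
    vehicles = {}
    for row in csv.DictReader(io.StringIO(text)):
        attributes = {name: int(row[name]) for name in INT_FIELDS}
        attributes.update({name: row[name] == "1" for name in FLAG_FIELDS})
        vehicles[row["vehicle_id"]] = attributes
    return vehicles


vehicles = load_vehicles(VEHICLES_TXT)
total_capacity = (
    vehicles["1042"]["seated_capacity"] + vehicles["1042"]["standing_capacity"]
)
```

Keying the table on `vehicle_id` would let a consumer join it directly with the `vehicle.id` reported in GTFS-realtime VehiclePositions.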

@stale

stale bot commented Aug 21, 2021

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale label (Issues and Pull Requests that have remained inactive for 30 calendar days or more) on Aug 21, 2021
@stale

stale bot commented Aug 28, 2021

This pull request has been closed due to inactivity. Pull requests can always be reopened after they have been closed. See the Specification Amendment Process.


8 participants