From 6fdfe808bd280477412f37760af25d429d2098cb Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?L=C3=A9o=20Frachet?= Date: Wed, 7 Aug 2019 16:54:50 -0400 Subject: [PATCH 1/6] GTFS-Translations (without record_sub_id and field_value) --- gtfs/spec/en/reference.md | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/gtfs/spec/en/reference.md b/gtfs/spec/en/reference.md index 26e516ef9..090d6cefb 100644 --- a/gtfs/spec/en/reference.md +++ b/gtfs/spec/en/reference.md @@ -25,6 +25,7 @@ This document defines the format and structure of the files that comprise a GTFS - [transfers.txt](#transferstxt) - [pathways.txt](#pathwaystxt) - [levels.txt](#levelstxt) + - [translations.txt](#translationstxt) - [feed\_info.txt](#feed_infotxt) ## Term Definitions @@ -359,17 +360,32 @@ Describe the different levels of a station. Is mostly useful when used in conjun | `level_name` | Text | Optional | Optional name of the level (that matches level lettering/numbering used inside the building or the station). Is useful for elevator routing (e.g. “take the elevator to level “Mezzanine” or “Platforms” or “-1”).| -### feed_info.txt +### translations.txt File: **Optional** +In regions that have multiple official languages, transit agencies/operators typically have language-specific names and web pages. In order to best serve riders in those regions, it is useful for the dataset to include these language-dependent values. + +| Field Name | Type | Required | Description | +| ------ | ------ | ------ | ------ | +| `table_name` | Enum | **Required** | Defines the table that contains the field to be translated. Allowed values are: `agency`, `stops`, `routes`, `trips`, `stop_times`, `levels` and `feed_info` (do not include the `.txt` file extension). If a table with a new file name is added by another proposal in the future, the table name is the name of the filename without the `.txt` file extension. 
| +| `field_name` | Text | **Required** | Name of the field to be translated. Fields with type `Text` can be translated, fields with type `URL`, `Email` and `Phone number` can also be “translated” to provide resources in the correct language. Fields with other types should not be translated. | +| `language` | Language code | **Required** | Language of translation.

If the language is the same as in `feed_info.feed_lang`, the original value of the field will be assumed to be the default value to use in languages without specific translations (if `default_lang` doesn't specify otherwise).

Example: In Switzerland, a city in an officially bilingual canton is officially called “Biel/Bienne”, but would simply be called “Bienne” in French and “Biel” in German. | +| `translation` | Text or URL or Email or Phone number | **Required** | Translated value. | +| `record_id` | ID | **Conditionally Required** | Defines the record that corresponds to the field to be translated. The value in `record_id` should be a main ID of the table, as defined below:
• `agency_id` for `agency.txt`;
• `stop_id` for `stops.txt`;
• `route_id` for `routes.txt`;
• `trip_id` for `trips.txt`;
• `trip_id` for `stop_times.txt`;
• `NONE` for `feed_info.txt`.

No field should be translated in the other tables. However producers sometimes add extra fields that are outside the official specification and these unofficial fields may need to be translated. Below is the recommended way to use `record_id` for those tables:
• `service_id` for `calendar.txt`;
• `service_id` for `calendar_dates.txt`;
• `fare_id` for `fare_attributes.txt`;
• `fare_id` for `fare_rules.txt`;
• `shape_id` for `shapes.txt`;
• `trip_id` for `frequencies.txt`;
• `from_stop_id` for `transfers.txt`.

**Conditionally Required:**
- **forbidden** if `table_name` is `feed_info`;
- **forbidden** if `field_value` is defined;
- **required** if `field_value` is empty;. | + +### feed_info.txt + +File: **Optional** (**Required** if `translations.txt` is provided) + The file contains information about the dataset itself, rather than the services that the dataset describes. Note that, in some cases, the publisher of the dataset is a different entity than any of the agencies. | Field Name | Type | Required | Description | | ------ | ------ | ------ | ------ | | `feed_publisher_name` | Text | **Required** | Full name of the organization that publishes the dataset. This may be the same as one of the `agency.agency_name` values. | | `feed_publisher_url` | URL | **Required** | URL of the dataset publishing organization's website. This may be the same as one of the `agency.agency_url` values. | -| `feed_lang` | Language code | **Required** | Default language used for the text in this dataset. This setting helps GTFS consumers choose capitalization rules and other language-specific settings for the dataset. | +| `feed_lang` | Language code | **Required** | Default language used for the text in this dataset. This setting helps GTFS consumers choose capitalization rules and other language-specific settings for the dataset.

`translations.txt` can be used if languages other than the default language need to be defined.

If the dataset contains values in multiple languages (e.g. in multilingual countries like Switzerland, Belgium or Canada), the norm ISO 639-2 contains the language code “`mul`” to describe such reality. In such case, the best practice is to provide a translation for each of the languages used in the dataset.

For example, a dataset in Switzerland will have `feed_lang=mul` and will contain by default stop names “Genève” for Geneva, “Zürich” for Zurich and “Biel/Bienne” for the bilingual city of Biel/Bienne. But translations will be provided, in German: “Genf”, “Zürich” and “Biel”; in French: “Genève”, “Zurich” and “Bienne”; in Italian: “Ginevra”, “Zurigo” and “Bienna”; and in English: “Geneva”, “Zurich” and “Biel/Bienne”. | +| `default_lang` | Language code | Optional | Defines the language that should be used when the data consumer doesn’t know the language of the rider. It will often be "`en`" (English). | | `feed_start_date` | Date | Optional | The dataset provides complete and reliable schedule information for service in the period from the beginning of the `feed_start_date` day to the end of the `feed_end_date` day. Both days can be left empty if unavailable. The `feed_end_date` date must not precede the `feed_start_date` date if both are given. Dataset providers are encouraged to give schedule data outside this period to advise of likely future service, but dataset consumers should treat it mindful of its non-authoritative status. If `feed_start_date` or `feed_end_date` extend beyond the active calendar dates defined in [calendar.txt](#calendartxt) and [calendar_dates.txt](#calendar_datestxt), the dataset is making an explicit assertion that there is no service for dates within the `feed_start_date` or `feed_end_date` range but not included in the active calendar dates. | | `feed_end_date` | Date | Optional | (see above) | | `feed_version` | Text | Optional | String that indicates the current version of their GTFS dataset. GTFS-consuming applications can display this value to help dataset publishers determine whether the latest dataset has been incorporated. 
| From 310f424ca5a03c7079c21a22fea6a6214b587e00 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?L=C3=A9o=20Frachet?= Date: Wed, 7 Aug 2019 21:46:53 -0400 Subject: [PATCH 2/6] Add fields record_sub_id and field_value --- gtfs/spec/en/reference.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/gtfs/spec/en/reference.md b/gtfs/spec/en/reference.md index 090d6cefb..18063c212 100644 --- a/gtfs/spec/en/reference.md +++ b/gtfs/spec/en/reference.md @@ -372,7 +372,9 @@ In regions that have multiple official languages, transit agencies/operators typ | `field_name` | Text | **Required** | Name of the field to be translated. Fields with type `Text` can be translated, fields with type `URL`, `Email` and `Phone number` can also be “translated” to provide resources in the correct language. Fields with other types should not be translated. | | `language` | Language code | **Required** | Language of translation.

If the language is the same as in `feed_info.feed_lang`, the original value of the field will be assumed to be the default value to use in languages without specific translations (if `default_lang` doesn't specify otherwise).

Example: In Switzerland, a city in an officially bilingual canton is officially called “Biel/Bienne”, but would simply be called “Bienne” in French and “Biel” in German. | | `translation` | Text or URL or Email or Phone number | **Required** | Translated value. | -| `record_id` | ID | **Conditionally Required** | Defines the record that corresponds to the field to be translated. The value in `record_id` should be a main ID of the table, as defined below:
• `agency_id` for `agency.txt`;
• `stop_id` for `stops.txt`;
• `route_id` for `routes.txt`;
• `trip_id` for `trips.txt`;
• `trip_id` for `stop_times.txt`;
• `NONE` for `feed_info.txt`.

No field should be translated in the other tables. However producers sometimes add extra fields that are outside the official specification and these unofficial fields may need to be translated. Below is the recommended way to use `record_id` for those tables:
• `service_id` for `calendar.txt`;
• `service_id` for `calendar_dates.txt`;
• `fare_id` for `fare_attributes.txt`;
• `fare_id` for `fare_rules.txt`;
• `shape_id` for `shapes.txt`;
• `trip_id` for `frequencies.txt`;
• `from_stop_id` for `transfers.txt`.

**Conditionally Required:**
- **forbidden** if `table_name` is `feed_info`;
- **forbidden** if `field_value` is defined;
- **required** if `field_value` is empty;. | +| `record_id` | ID | **Conditionally Required** | Defines the record that corresponds to the field to be translated. The value in `record_id` should be a main ID of the table, as defined below:
• `agency_id` for `agency.txt`;
• `stop_id` for `stops.txt`;
• `route_id` for `routes.txt`;
• `trip_id` for `trips.txt`;
• `trip_id` for `stop_times.txt`.

No field should be translated in the other tables. However producers sometimes add extra fields that are outside the official specification and these unofficial fields may need to be translated. Below is the recommended way to use `record_id` for those tables:
• `service_id` for `calendar.txt`;
• `service_id` for `calendar_dates.txt`;
• `fare_id` for `fare_attributes.txt`;
• `fare_id` for `fare_rules.txt`;
• `shape_id` for `shapes.txt`;
• `trip_id` for `frequencies.txt`;
• `from_stop_id` for `transfers.txt`.

**Conditionally Required:**
- **forbidden** if `table_name` is `feed_info`;
- **forbidden** if `field_value` is defined;
- **required** if `field_value` is empty. | +| `record_sub_id` | ID | **Conditionally Required** | Helps the record that contains the field to be translated when the table doesn’t have a unique ID. Therefore, the value in `record_sub_id` is the secondary ID of the table, as defined by the table below:
• None for `agency.txt`;
• None for `stops.txt`;
• None for `routes.txt`;
• None for `trips.txt`;
• `stop_sequence` for `stop_times.txt`;
• None for `feed_info.txt`.

No field should be translated in the other tables. However producers sometimes add extra fields that are outside the official specification and these unofficial fields may need to be translated. Below is the recommended way to use `record_id` for those tables:
• None for `calendar.txt`;
• `date` for `calendar_dates.txt`;
• None for `fare_attributes.txt`;
• `route_id` for `fare_rules.txt`;
• None for `shapes.txt`;
• `start_time` for `frequencies.txt`;
• `to_stop_id` for `transfers.txt`.

**Conditionally Required:**
- **forbidden** if `table_name` is `feed_info`;
- **forbidden** if `field_value` is defined;
- **forbidden** if `field_value` is defined
- **required** if `table_name=stop_times` and `record_id` is defined. | +| `field_value` | Text or URL or Email or Phone number | **Conditionally Required** | Instead of defining which record should be translated by using `record_id` and `record_sub_id`, this field can be used to define the value which should be translated. When used, the translation will be applied when the fields identified by `table_name` and `field_name` contains the exact same value defined in field_value.

The field must have **exactly** the value defined in `field_value`. If only a subset of the value matches `field_value`, the translation won’t be applied.

If two translation rules match the same record (one with `field_value`, and the other one with `record_id`), then the rule with `record_id` is the one which should be used.

**Conditionally Required:**
- **forbidden** if `table_name` is `feed_info`;
- **forbidden** if `record_id` is defined;
- **required** if `record_id` is empty. | ### feed_info.txt From 90d6c8792d1a916f22cf9ab01ed966409e53fb02 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?L=C3=A9o=20Frachet?= Date: Wed, 7 Aug 2019 21:50:04 -0400 Subject: [PATCH 3/6] fix typos --- gtfs/spec/en/reference.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gtfs/spec/en/reference.md b/gtfs/spec/en/reference.md index 18063c212..c01f9a5ab 100644 --- a/gtfs/spec/en/reference.md +++ b/gtfs/spec/en/reference.md @@ -373,7 +373,7 @@ In regions that have multiple official languages, transit agencies/operators typ | `language` | Language code | **Required** | Language of translation.

If the language is the same as in `feed_info.feed_lang`, the original value of the field will be assumed to be the default value to use in languages without specific translations (if `default_lang` doesn't specify otherwise).

Example: In Switzerland, a city in an officially bilingual canton is officially called “Biel/Bienne”, but would simply be called “Bienne” in French and “Biel” in German. | | `translation` | Text or URL or Email or Phone number | **Required** | Translated value. | | `record_id` | ID | **Conditionally Required** | Defines the record that corresponds to the field to be translated. The value in `record_id` should be a main ID of the table, as defined below:
• `agency_id` for `agency.txt`;
• `stop_id` for `stops.txt`;
• `route_id` for `routes.txt`;
• `trip_id` for `trips.txt`;
• `trip_id` for `stop_times.txt`.

No field should be translated in the other tables. However producers sometimes add extra fields that are outside the official specification and these unofficial fields may need to be translated. Below is the recommended way to use `record_id` for those tables:
• `service_id` for `calendar.txt`;
• `service_id` for `calendar_dates.txt`;
• `fare_id` for `fare_attributes.txt`;
• `fare_id` for `fare_rules.txt`;
• `shape_id` for `shapes.txt`;
• `trip_id` for `frequencies.txt`;
• `from_stop_id` for `transfers.txt`.

**Conditionally Required:**
- **forbidden** if `table_name` is `feed_info`;
- **forbidden** if `field_value` is defined;
- **required** if `field_value` is empty. | -| `record_sub_id` | ID | **Conditionally Required** | Helps the record that contains the field to be translated when the table doesn’t have a unique ID. Therefore, the value in `record_sub_id` is the secondary ID of the table, as defined by the table below:
• None for `agency.txt`;
• None for `stops.txt`;
• None for `routes.txt`;
• None for `trips.txt`;
• `stop_sequence` for `stop_times.txt`;
• None for `feed_info.txt`.

No field should be translated in the other tables. However producers sometimes add extra fields that are outside the official specification and these unofficial fields may need to be translated. Below is the recommended way to use `record_id` for those tables:
• None for `calendar.txt`;
• `date` for `calendar_dates.txt`;
• None for `fare_attributes.txt`;
• `route_id` for `fare_rules.txt`;
• None for `shapes.txt`;
• `start_time` for `frequencies.txt`;
• `to_stop_id` for `transfers.txt`.

**Conditionally Required:**
- **forbidden** if `table_name` is `feed_info`;
- **forbidden** if `field_value` is defined;
- **forbidden** if `field_value` is defined
- **required** if `table_name=stop_times` and `record_id` is defined. | +| `record_sub_id` | ID | **Conditionally Required** | Helps the record that contains the field to be translated when the table doesn’t have a unique ID. Therefore, the value in `record_sub_id` is the secondary ID of the table, as defined by the table below:
• None for `agency.txt`;
• None for `stops.txt`;
• None for `routes.txt`;
• None for `trips.txt`;
• `stop_sequence` for `stop_times.txt`;

No field should be translated in the other tables. However producers sometimes add extra fields that are outside the official specification and these unofficial fields may need to be translated. Below is the recommended way to use `record_sub_id` for those tables:
• None for `calendar.txt`;
• `date` for `calendar_dates.txt`;
• None for `fare_attributes.txt`;
• `route_id` for `fare_rules.txt`;
• None for `shapes.txt`;
• `start_time` for `frequencies.txt`;
• `to_stop_id` for `transfers.txt`.

**Conditionally Required:**
- **forbidden** if `table_name` is `feed_info`;
- **forbidden** if `field_value` is defined;
- **required** if `table_name=stop_times` and `record_id` is defined. | | `field_value` | Text or URL or Email or Phone number | **Conditionally Required** | Instead of defining which record should be translated by using `record_id` and `record_sub_id`, this field can be used to define the value which should be translated. When used, the translation will be applied when the fields identified by `table_name` and `field_name` contains the exact same value defined in field_value.

The field must have **exactly** the value defined in `field_value`. If only a subset of the value matches `field_value`, the translation won’t be applied.

If two translation rules match the same record (one with `field_value`, and the other one with `record_id`), then the rule with `record_id` is the one which should be used.

**Conditionally Required:**
- **forbidden** if `table_name` is `feed_info`;
- **forbidden** if `record_id` is defined;
- **required** if `record_id` is empty. | ### feed_info.txt From 3ea014c101a6179f172be2d0b2fc4de40bfc6376 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?L=C3=A9o=20Frachet?= Date: Wed, 7 Aug 2019 21:58:45 -0400 Subject: [PATCH 4/6] Adding pathways and levels. --- gtfs/spec/en/reference.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gtfs/spec/en/reference.md b/gtfs/spec/en/reference.md index c01f9a5ab..134c9d7f7 100644 --- a/gtfs/spec/en/reference.md +++ b/gtfs/spec/en/reference.md @@ -372,8 +372,8 @@ In regions that have multiple official languages, transit agencies/operators typ | `field_name` | Text | **Required** | Name of the field to be translated. Fields with type `Text` can be translated, fields with type `URL`, `Email` and `Phone number` can also be “translated” to provide resources in the correct language. Fields with other types should not be translated. | | `language` | Language code | **Required** | Language of translation.

If the language is the same as in `feed_info.feed_lang`, the original value of the field will be assumed to be the default value to use in languages without specific translations (if `default_lang` doesn't specify otherwise).

Example: In Switzerland, a city in an officially bilingual canton is officially called “Biel/Bienne”, but would simply be called “Bienne” in French and “Biel” in German. | | `translation` | Text or URL or Email or Phone number | **Required** | Translated value. | -| `record_id` | ID | **Conditionally Required** | Defines the record that corresponds to the field to be translated. The value in `record_id` should be a main ID of the table, as defined below:
• `agency_id` for `agency.txt`;
• `stop_id` for `stops.txt`;
• `route_id` for `routes.txt`;
• `trip_id` for `trips.txt`;
• `trip_id` for `stop_times.txt`.

No field should be translated in the other tables. However producers sometimes add extra fields that are outside the official specification and these unofficial fields may need to be translated. Below is the recommended way to use `record_id` for those tables:
• `service_id` for `calendar.txt`;
• `service_id` for `calendar_dates.txt`;
• `fare_id` for `fare_attributes.txt`;
• `fare_id` for `fare_rules.txt`;
• `shape_id` for `shapes.txt`;
• `trip_id` for `frequencies.txt`;
• `from_stop_id` for `transfers.txt`.

**Conditionally Required:**
- **forbidden** if `table_name` is `feed_info`;
- **forbidden** if `field_value` is defined;
- **required** if `field_value` is empty. | -| `record_sub_id` | ID | **Conditionally Required** | Helps the record that contains the field to be translated when the table doesn’t have a unique ID. Therefore, the value in `record_sub_id` is the secondary ID of the table, as defined by the table below:
• None for `agency.txt`;
• None for `stops.txt`;
• None for `routes.txt`;
• None for `trips.txt`;
• `stop_sequence` for `stop_times.txt`;

No field should be translated in the other tables. However producers sometimes add extra fields that are outside the official specification and these unofficial fields may need to be translated. Below is the recommended way to use `record_sub_id` for those tables:
• None for `calendar.txt`;
• `date` for `calendar_dates.txt`;
• None for `fare_attributes.txt`;
• `route_id` for `fare_rules.txt`;
• None for `shapes.txt`;
• `start_time` for `frequencies.txt`;
• `to_stop_id` for `transfers.txt`.

**Conditionally Required:**
- **forbidden** if `table_name` is `feed_info`;
- **forbidden** if `field_value` is defined;
- **required** if `table_name=stop_times` and `record_id` is defined. | +| `record_id` | ID | **Conditionally Required** | Defines the record that corresponds to the field to be translated. The value in `record_id` should be a main ID of the table, as defined below:
• `agency_id` for `agency.txt`;
• `stop_id` for `stops.txt`;
• `route_id` for `routes.txt`;
• `trip_id` for `trips.txt`;
• `trip_id` for `stop_times.txt`.

No field should be translated in the other tables. However producers sometimes add extra fields that are outside the official specification and these unofficial fields may need to be translated. Below is the recommended way to use `record_id` for those tables:
• `service_id` for `calendar.txt`;
• `service_id` for `calendar_dates.txt`;
• `fare_id` for `fare_attributes.txt`;
• `fare_id` for `fare_rules.txt`;
• `shape_id` for `shapes.txt`;
• `trip_id` for `frequencies.txt`;
• `from_stop_id` for `transfers.txt`;
• `pathway_id` for `pathways.txt`;
• `level_id` for `levels.txt`.

**Conditionally Required:**
- **forbidden** if `table_name` is `feed_info`;
- **forbidden** if `field_value` is defined;
- **required** if `field_value` is empty. | +| `record_sub_id` | ID | **Conditionally Required** | Helps identify the record that contains the field to be translated when the table doesn’t have a unique ID. Therefore, the value in `record_sub_id` is the secondary ID of the table, as defined below:
• None for `agency.txt`;
• None for `stops.txt`;
• None for `routes.txt`;
• None for `trips.txt`;
• `stop_sequence` for `stop_times.txt`.

No field should be translated in the other tables. However producers sometimes add extra fields that are outside the official specification and these unofficial fields may need to be translated. Below is the recommended way to use `record_sub_id` for those tables:
• None for `calendar.txt`;
• `date` for `calendar_dates.txt`;
• None for `fare_attributes.txt`;
• `route_id` for `fare_rules.txt`;
• None for `shapes.txt`;
• `start_time` for `frequencies.txt`;
• `to_stop_id` for `transfers.txt`;
• None for `pathways.txt`;
• None for `levels.txt`.

**Conditionally Required:**
- **forbidden** if `table_name` is `feed_info`;
- **forbidden** if `field_value` is defined;
- **required** if `table_name=stop_times` and `record_id` is defined. | | `field_value` | Text or URL or Email or Phone number | **Conditionally Required** | Instead of defining which record should be translated by using `record_id` and `record_sub_id`, this field can be used to define the value which should be translated. When used, the translation will be applied when the fields identified by `table_name` and `field_name` contains the exact same value defined in field_value.

The field must have **exactly** the value defined in `field_value`. If only a subset of the value matches `field_value`, the translation won’t be applied.

If two translation rules match the same record (one with `field_value`, and the other one with `record_id`), then the rule with `record_id` is the one which should be used.

**Conditionally Required:**
- **forbidden** if `table_name` is `feed_info`;
- **forbidden** if `record_id` is defined;
- **required** if `record_id` is empty. | ### feed_info.txt From d10a5d6301dc98ab8669c2708c0d7c6d613c9ba5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?L=C3=A9o=20Frachet?= Date: Thu, 15 Aug 2019 11:15:45 -0400 Subject: [PATCH 5/6] Improve "mul" explaination --- gtfs/spec/en/reference.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gtfs/spec/en/reference.md b/gtfs/spec/en/reference.md index 134c9d7f7..d538d0a04 100644 --- a/gtfs/spec/en/reference.md +++ b/gtfs/spec/en/reference.md @@ -386,7 +386,7 @@ The file contains information about the dataset itself, rather than the services | ------ | ------ | ------ | ------ | | `feed_publisher_name` | Text | **Required** | Full name of the organization that publishes the dataset. This may be the same as one of the `agency.agency_name` values. | | `feed_publisher_url` | URL | **Required** | URL of the dataset publishing organization's website. This may be the same as one of the `agency.agency_url` values. | -| `feed_lang` | Language code | **Required** | Default language used for the text in this dataset. This setting helps GTFS consumers choose capitalization rules and other language-specific settings for the dataset.

`translations.txt` can be used if languages other than the default language need to be defined.

If the dataset contains values in multiple languages (e.g. in multilingual countries like Switzerland, Belgium or Canada), the norm ISO 639-2 contains the language code “`mul`” to describe such reality. In such case, the best practice is to provide a translation for each of the languages used in the dataset.

For example, a dataset in Switzerland will have `feed_lang=mul` and will contain by default stop names “Genève” for Geneva, “Zürich” for Zurich and “Biel/Bienne” for the bilingual city of Biel/Bienne. But translations will be provided, in German: “Genf”, “Zürich” and “Biel”; in French: “Genève”, “Zurich” and “Bienne”; in Italian: “Ginevra”, “Zurigo” and “Bienna”; and in English: “Geneva”, “Zurich” and “Biel/Bienne”. | +| `feed_lang` | Language code | **Required** | Default language used for the text in this dataset. This setting helps GTFS consumers choose capitalization rules and other language-specific settings for the dataset.

`translations.txt` can be used if languages other than the default language need to be defined.

If the untranslated values in the dataset are in multiple languages (e.g. in multilingual countries like Switzerland, Belgium or Canada the `stop_name` in `stops.txt` will be by default in different languages depending of the area), the `feed_lang` field should contain the language code `mul` defined by the norm ISO 639-2 to describe such situation. In such case, the best practice is to provide a translation for each of the languages used in the dataset. If all the untranslated values in the dataset are in the same language, then "mul" should not to be use.

For example, a dataset in Switzerland will have `feed_lang=mul` and will contain by default stop names “Genève” for Geneva, “Zürich” for Zurich and “Biel/Bienne” for the bilingual city of Biel/Bienne. But translations will be provided, in German: “Genf”, “Zürich” and “Biel”; in French: “Genève”, “Zurich” and “Bienne”; in Italian: “Ginevra”, “Zurigo” and “Bienna”; and in English: “Geneva”, “Zurich” and “Biel/Bienne”. | | `default_lang` | Language code | Optional | Defines the language that should be used when the data consumer doesn’t know the language of the rider. It will often be "`en`" (English). | | `feed_start_date` | Date | Optional | The dataset provides complete and reliable schedule information for service in the period from the beginning of the `feed_start_date` day to the end of the `feed_end_date` day. Both days can be left empty if unavailable. The `feed_end_date` date must not precede the `feed_start_date` date if both are given. Dataset providers are encouraged to give schedule data outside this period to advise of likely future service, but dataset consumers should treat it mindful of its non-authoritative status. If `feed_start_date` or `feed_end_date` extend beyond the active calendar dates defined in [calendar.txt](#calendartxt) and [calendar_dates.txt](#calendar_datestxt), the dataset is making an explicit assertion that there is no service for dates within the `feed_start_date` or `feed_end_date` range but not included in the active calendar dates. 
| | `feed_end_date` | Date | Optional | (see above) | From 788746941db1a98ecb46897e3065b1a6c8ad7d59 Mon Sep 17 00:00:00 2001 From: Tim Millet Date: Mon, 16 Dec 2019 10:52:24 -0500 Subject: [PATCH 6/6] Change feed_info.feed_lang definition --- gtfs/spec/en/reference.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gtfs/spec/en/reference.md b/gtfs/spec/en/reference.md index d538d0a04..785012028 100644 --- a/gtfs/spec/en/reference.md +++ b/gtfs/spec/en/reference.md @@ -386,8 +386,8 @@ The file contains information about the dataset itself, rather than the services | ------ | ------ | ------ | ------ | | `feed_publisher_name` | Text | **Required** | Full name of the organization that publishes the dataset. This may be the same as one of the `agency.agency_name` values. | | `feed_publisher_url` | URL | **Required** | URL of the dataset publishing organization's website. This may be the same as one of the `agency.agency_url` values. | -| `feed_lang` | Language code | **Required** | Default language used for the text in this dataset. This setting helps GTFS consumers choose capitalization rules and other language-specific settings for the dataset.

`translations.txt` can be used if languages other than the default language need to be defined.

If the untranslated values in the dataset are in multiple languages (e.g. in multilingual countries like Switzerland, Belgium or Canada the `stop_name` in `stops.txt` will be by default in different languages depending of the area), the `feed_lang` field should contain the language code `mul` defined by the norm ISO 639-2 to describe such situation. In such case, the best practice is to provide a translation for each of the languages used in the dataset. If all the untranslated values in the dataset are in the same language, then "mul" should not to be use.

For example, a dataset in Switzerland will have `feed_lang=mul` and will contain by default stop names “Genève” for Geneva, “Zürich” for Zurich and “Biel/Bienne” for the bilingual city of Biel/Bienne. But translations will be provided, in German: “Genf”, “Zürich” and “Biel”; in French: “Genève”, “Zurich” and “Bienne”; in Italian: “Ginevra”, “Zurigo” and “Bienna”; and in English: “Geneva”, “Zurich” and “Biel/Bienne”. | -| `default_lang` | Language code | Optional | Defines the language that should be used when the data consumer doesn’t know the language of the rider. It will often be "`en`" (English). | +| `feed_lang` | Language code | **Required** | Default language used for the text in this dataset. This setting helps GTFS consumers choose capitalization rules and other language-specific settings for the dataset. The file `translations.txt` can be used if the text needs to be translated into languages other than the default one.

The default language may be multilingual for datasets with the original text in multiple languages. In such cases, the `feed_lang` field should contain the language code `mul` defined by the ISO 639-2 standard. The best practice here would be to provide, in `translations.txt`, a translation for each language used throughout the dataset. If all the original text in the dataset is in the same language, then `mul` should not be used.
_Example: Consider a dataset from a multilingual country like Switzerland, with the original `stops.stop_name` field populated with stop names in different languages. Each stop name is written according to the dominant language in that stop’s geographic location, e.g. `Genève` for the French-speaking city of Geneva, `Zürich` for the German-speaking city of Zurich, and `Biel/Bienne` for the bilingual city of Biel/Bienne. The dataset `feed_lang` should be `mul` and translations would be provided in `translations.txt`, in German: `Genf`, `Zürich` and `Biel`; in French: `Genève`, `Zurich` and `Bienne`; in Italian: `Ginevra`, `Zurigo` and `Bienna`; and in English: `Geneva`, `Zurich` and `Biel/Bienne`._ | +| `default_lang` | Language code | Optional | Defines the language that should be used when the data consumer doesn’t know the language of the rider. It will often be `en` (English). | | `feed_start_date` | Date | Optional | The dataset provides complete and reliable schedule information for service in the period from the beginning of the `feed_start_date` day to the end of the `feed_end_date` day. Both days can be left empty if unavailable. The `feed_end_date` date must not precede the `feed_start_date` date if both are given. Dataset providers are encouraged to give schedule data outside this period to advise of likely future service, but dataset consumers should treat it mindful of its non-authoritative status. If `feed_start_date` or `feed_end_date` extend beyond the active calendar dates defined in [calendar.txt](#calendartxt) and [calendar_dates.txt](#calendar_datestxt), the dataset is making an explicit assertion that there is no service for dates within the `feed_start_date` or `feed_end_date` range but not included in the active calendar dates. | | `feed_end_date` | Date | Optional | (see above) | | `feed_version` | Text | Optional | String that indicates the current version of their GTFS dataset. 
GTFS-consuming applications can display this value to help dataset publishers determine whether the latest dataset has been incorporated. |
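
The matching rules this series adds for `translations.txt` (exact match on `field_value`, and `record_id` taking precedence when both kinds of rule match the same record) can be sketched from a consumer's point of view as follows. This is only an illustrative sketch, not part of the patch: the function name `find_translation`, the dict-per-row representation and the sample rows are assumptions, and it assumes an empty `record_sub_id` is stored as an empty string.

```python
from typing import Optional


def find_translation(rows: list[dict], table_name: str, field_name: str,
                     language: str, record_id: str, original_value: str,
                     record_sub_id: str = "") -> Optional[str]:
    """Return the translated value for one field of one record, or None.

    `rows` are translations.txt rows as dicts keyed by column name
    (hypothetical in-memory representation).
    """
    by_value = None
    for row in rows:
        key = (row["table_name"], row["field_name"], row["language"])
        if key != (table_name, field_name, language):
            continue
        if row.get("record_id"):
            # Rule keyed by record: it wins over any field_value rule.
            if (row["record_id"] == record_id
                    and row.get("record_sub_id", "") == record_sub_id):
                return row["translation"]
        elif row.get("field_value") and row["field_value"] == original_value:
            # Rule keyed by value: the whole value must match exactly.
            if by_value is None:
                by_value = row["translation"]
    return by_value


# Hypothetical sample rows, loosely based on the Biel/Bienne example.
SAMPLE_ROWS = [
    {"table_name": "stops", "field_name": "stop_name", "language": "fr",
     "translation": "Bienne", "record_id": "", "record_sub_id": "",
     "field_value": "Biel/Bienne"},
    {"table_name": "stops", "field_name": "stop_name", "language": "fr",
     "translation": "Bienne (gare)", "record_id": "S1", "record_sub_id": "",
     "field_value": ""},
]

print(find_translation(SAMPLE_ROWS, "stops", "stop_name", "fr", "S1", "Biel/Bienne"))
print(find_translation(SAMPLE_ROWS, "stops", "stop_name", "fr", "S2", "Biel/Bienne"))
```

With these sample rows, stop `S1` gets the `record_id` rule even though the `field_value` rule also matches, while stop `S2` falls back to the `field_value` rule. A value that only partially matches `field_value` gets no translation.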