Are there cases where the labels structure needs to address internationalization? #73

@Brianthered

Looking at:

<h4 id="_label_naming"><a class="anchor" href="#_label_naming"></a>6.2.2. Label Naming</h4>
<div class="paragraph">
<p>Each assertion has a label defined either by the C2PA specifications or an external entity. These labels are strings which are namespaced, as described in the preceding clause or by an entity. The most common labels will be defined in the <code>c2pa</code> namespace, but labels may use any namespace that follows the conventions. Labels are also versioned with a simple incrementing integer scheme (e.g., <code>c2pa.actions.v2</code>). If no version is provided, it is considered as <code>v1</code>. The list of publicly known labels can be found in <a href="#_c2pa_standard_assertions">Chapter 18, <em>C2PA Standard Assertions</em></a>.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
Previous versions of this document also provided for namespacing for well-established standards, but that has been superseded by simply having them via entity-specific namespaces (e.g., <code>org.iso</code>, <code>org.w3</code>).
</td>
</tr>
</table>
</div>
<div id="abnf_for_assertion_labels" class="listingblock">
<div class="title">ABNF for Assertion Labels</div>
<div class="content">
<pre class="highlightjs highlight"><code class="language-abnf hljs" data-lang="abnf">namespaced-label = qualified-namespace label
qualified-namespace = "c2pa" / entity
entity = entity-component *( "." entity-component )
entity-component = 1( DIGIT / ALPHA ) *( DIGIT / ALPHA / "-" / "_" )
label = 1*( "." label-component )
label-component = 1( DIGIT / ALPHA ) *( DIGIT / ALPHA / "-" / "_" )</code></pre>
</div>
</div>
<div class="paragraph">
<p>The period-separated components of a label follow the variable naming convention (<code>[a-zA-Z][a-zA-Z0-9_-]*</code>) specified in the POSIX or C locale, with the restriction that the use of a repeated underscore character (<code>__</code>) is reserved for labelling multiple assertions of the same type.</p>
</div>

It occurs to me that the specification should probably say more if it wants to support internationalized domain names in entity namespaces. The ABNF and the naming convention quoted above only allow ASCII letters, digits, hyphens, and underscores, so the implicit assumption seems to be that such names appear in their Punycode representation. It might be good to make that explicit. I don't know whether other labels will use characters outside ASCII; if they will, it would be good to state explicitly how that should work.
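To make the question concrete, here is a minimal Python sketch, not taken from the specification, that checks a label against the ABNF quoted above and shows what a Punycode-based mapping could look like. The helper names `is_valid_label` and `domain_to_namespace` are hypothetical, the reversed-domain ordering is an assumption based on the `org.iso` and `org.w3` examples, and Python's built-in `idna` codec implements the older IDNA 2003 rules. The sketch does not treat the reserved `__` sequence specially.

```python
import re

# One period-separated component, as allowed by the quoted ABNF (ASCII only).
_COMPONENT = r"[0-9A-Za-z][0-9A-Za-z_-]*"
# namespaced-label = qualified-namespace label, so at least two components.
_LABEL_RE = re.compile(rf"{_COMPONENT}(?:\.{_COMPONENT})+")


def is_valid_label(label: str) -> bool:
    """Return True if `label` matches the ABNF quoted in this issue."""
    return _LABEL_RE.fullmatch(label) is not None


def domain_to_namespace(domain: str) -> str:
    """Hypothetical helper: build a reversed-domain namespace whose
    components are Punycode (A-label) encoded.

    Assumption: the specification does not currently define this mapping.
    """
    ascii_domain = domain.encode("idna").decode("ascii")
    return ".".join(reversed(ascii_domain.split(".")))


if __name__ == "__main__":
    print(is_valid_label("c2pa.actions.v2"))           # plain ASCII label
    print(is_valid_label("example.bücher.assertion"))  # contains non-ASCII
    print(domain_to_namespace("bücher.example"))       # Punycode form
```

Under these assumptions a label containing `ü` is rejected by the grammar, while the Punycode form of the same domain is accepted, which is the behavior the specification would need to state explicitly.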

Internationalized labels would make the metadata harder to process and to read, so this seems like one of those things that doesn't have a single great answer.
