Description
writer.py introduces enhancements to the Avro schema used for encoding relationships (links) within PFB files.
importers/json.py: This update enhances how JSON records are validated and transformed into PFB graph relationships, introducing stricter checks and improved link expressiveness.
🔧 Specific Changes: writer.py
New Relationship Fields Added to Avro Schema:
"label": type ["null", "string"], default None (e.g., "requestor", "owner")
"properties": type ["null", {"type": "map", "values": "string"}], default None
📦 Implications:
Links between entities in PFB files now support an optional label and an optional map of string-valued properties.
This expands the expressiveness of PFB for graph-based data models, supporting richer relationship semantics.
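A minimal sketch of how the two new fields could sit in the Avro record for a link, written as a plain Python dict. The record name `Relation` and the helper `optional_field_names` are assumptions for illustration and are not taken from writer.py; the field types and defaults are the ones listed above, and `dst_id`/`dst_name` come from the link structure documented in this description.

```python
import json

# Hypothetical Avro record for a link. The record name "Relation" is an
# assumption; only the field types and defaults come from this description.
RELATION_SCHEMA = {
    "type": "record",
    "name": "Relation",
    "fields": [
        {"name": "dst_id", "type": "string"},
        {"name": "dst_name", "type": "string"},
        # Optional fields: "null" comes first in the union so that the
        # default value can be null (None in Python).
        {"name": "label", "type": ["null", "string"], "default": None},
        {
            "name": "properties",
            "type": ["null", {"type": "map", "values": "string"}],
            "default": None,
        },
    ],
}


def optional_field_names(schema):
    """Return the names of fields whose type is a union starting with "null"."""
    return [
        field["name"]
        for field in schema["fields"]
        if isinstance(field["type"], list) and field["type"][0] == "null"
    ]


print(json.dumps(RELATION_SCHEMA, indent=2))
print(optional_field_names(RELATION_SCHEMA))
```

Putting "null" first in each union is what lets the default be null, so links written without a label or properties remain valid.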
🔧 Specific Changes: importers/json.py
Stricter JSON Record Validation:
New pre-condition added to convert_json(): ensures that all JSON records contain at least one of:
"submitter_id"
"code"
Prevents generation of incomplete or invalid PFB nodes.
Enriched Relationship Metadata:
When building links in convert_json(), two new fields are populated:
"label": The original JSON property name (e.g., "knows", "colleagues").
"properties": Any nested "properties" dictionary provided within the link object.
Example:
{ "knows": { "submitter_id": "person_2", "properties": { "since": "2020-01-01" } } }
Results in a PFB link with:
label = "knows"
properties = {"since": "2020-01-01"}
📦 Implications:
writer.py.
✅ Summary for Release Notes:
Schema Enhancement: Extended the PFB Avro schema to support label and properties fields on relationships, enabling richer and semantically meaningful graph links.
JSON Import Enhancements:
Requires submitter_id or code on all JSON records.
Populates label and properties fields, enabling richer, metadata-enhanced links.
Expanded Documentation:
label and properties Fields in a PFB Link
The label and properties fields are recent schema enhancements to the PFB format that enrich graph relationships with semantic meaning and optional metadata.
🔗 What is a Link?
In PFB, a link represents a relationship or edge between two nodes (entities) in the graph.
Minimum structure of a link:
{ "dst_id": "target_node_id", "dst_name": "target_node_type", "label": "relationship_label", "properties": { "property_key": "property_value" } }
🏷️ label Field
Type: string or null
Examples: "knows", "colleagues", "owner", "requestor"
Example:
In JSON input: { "knows": { "submitter_id": "person_2" } }
Results in: label = "knows"
🗂️ properties Field
Type: map<string, string> or null
Examples: { "since": "2020-01-01" }, { "workplace": "acme_corp" }
Source: the "properties" key nested within the link in JSON input.
Example:
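In JSON input: { "knows": { "submitter_id": "person_2", "properties": { "since": "2020-01-01" } } }
Results in: properties = {"since": "2020-01-01"}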
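To make the mapping concrete, here is a rough sketch of the two convert_json() behaviours described in this PR: the identifier pre-condition and the population of label and properties. It is not the code in importers/json.py. The function names `require_identifier`, `build_link` and `links_for`, and the `link_types` mapping from link property name to target node type, are hypothetical and exist only for this illustration.

```python
def require_identifier(record):
    """Raise if the record has neither "submitter_id" nor "code"."""
    if "submitter_id" not in record and "code" not in record:
        raise ValueError('record needs "submitter_id" or "code"')


def build_link(label, link_obj, dst_name):
    """Turn one JSON link object into a PFB-style link dict.

    label    -- the JSON property name the link was found under
    link_obj -- the nested JSON object describing the target node
    dst_name -- node type of the target (supplied by the caller here)
    """
    return {
        "dst_id": link_obj["submitter_id"],
        "dst_name": dst_name,
        "label": label,
        # None when the link object carries no nested "properties" dict.
        "properties": link_obj.get("properties"),
    }


def links_for(record, link_types):
    """Build links for every known link property present on the record.

    link_types is a hypothetical mapping such as {"knows": "person"}.
    """
    require_identifier(record)
    return [
        build_link(name, record[name], dst_name)
        for name, dst_name in link_types.items()
        if name in record
    ]


record = {
    "submitter_id": "person_1",
    "knows": {"submitter_id": "person_2", "properties": {"since": "2020-01-01"}},
}
links = links_for(record, {"knows": "person"})
print(links)
```

In this sketch a record without "submitter_id" or "code" fails before any link is built, which mirrors the stated goal of not emitting incomplete nodes.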
✅ Why These Fields Matter
label and properties may be null for legacy data.
📦 Summary for Users
Use label when multiple relationship types exist between the same node types.
Use properties to attach relevant metadata to links.
New Features
See #134
Breaking Changes
None
Bug Fixes
None
Improvements
Dependency updates
Deployment changes
None
Tests ✅
tests/test_foaf.py
These FOAF tests cover:
Schema generation integrity.
Round-trip export of FOAF data.
Correct handling of links and link properties.
Providing JSON with and without "submitter_id" to confirm validation works.
Including links with labels and properties to validate correct PFB output.
🔬 Detailed Breakdown:
Fixtures:
gen3_schema_path: FOAF schema JSON file for Gen3 model.
avro_data_path: Output PFB Avro file for data.
avro_schema_path: Output PFB Avro file for schema.
Helper Functions:
_create_schema(): Uses pfb from dict to convert FOAF schema to Avro format.
_test_schema(): Checks the "person" node.
_assert_links(): Checks the links on a "person" record. If check_link_properties is True, verifies presence and correctness of link properties like "since" or "workplace".
Test Functions:
test_gen3_schema()
test_foaf_data_no_links(): Covers "person" records.
test_foaf_data_links(): Uses _assert_links().
test_foaf_data_links_and_properties(): Covers link properties such as "since" and "workplace".
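For readers who want a feel for what a link assertion helper like the one described above might check, here is a hedged sketch. It is not the helper from tests/test_foaf.py: the function name `assert_links`, the `expected` argument, and the assumption that a decoded record exposes its links under a "relations" key are all illustrative.

```python
def assert_links(record, expected, check_link_properties=False):
    """Assert that every expected link is present on the record.

    Assumes (for this sketch only) that links live under record["relations"]
    as dicts with "dst_id", "label" and optional "properties".
    """
    relations = record.get("relations", [])
    by_target = {(r["dst_id"], r.get("label")): r for r in relations}
    for link in expected:
        key = (link["dst_id"], link["label"])
        assert key in by_target, f"missing link {key}"
        if check_link_properties:
            # Only compare metadata when the caller asks for it.
            actual = by_target[key].get("properties")
            assert actual == link.get("properties"), f"bad properties on {key}"


# Hypothetical decoded "person" record with one labelled link.
record = {
    "name": "person",
    "relations": [
        {
            "dst_id": "person_2",
            "dst_name": "person",
            "label": "knows",
            "properties": {"since": "2020-01-01"},
        }
    ],
}
expected = [
    {"dst_id": "person_2", "label": "knows", "properties": {"since": "2020-01-01"}}
]

assert_links(record, expected)
assert_links(record, expected, check_link_properties=True)
```

Keeping the property comparison behind a flag lets the same helper serve both the links-only test and the links-and-properties test.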