Configuration
The parser config is an XDM map of type `map(xs:QName, item()*)`. Each available feature is identified by a specific qualified name. The namespace is always `http://www.nkutsche.com/xmlml/parser/features/` (prefix `mlpf`). Every feature name is also available as a global variable with final visibility. For example, the variable for the feature `mlpf:EXAMPLE-FEATURE` is `$mlml:EXAMPLE-FEATURE`.
Some parser features correspond to features of Xerces, which Saxon normally uses as its underlying XML parser. If no custom configuration is given for these features, the XmlML parser tries to detect the configuration of the underlying Xerces. The goal is to achieve similar parsing results.
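As a rough illustration of the map type described above, the following XPath 3.1 expression builds a config map with explicitly constructed `xs:QName` keys. It is only a sketch: the chosen features and values are examples, and how the map is handed to the parser is not shown here.

```xpath
(: namespace of all parser features, as stated above :)
let $ns := 'http://www.nkutsche.com/xmlml/parser/features/'
return map{
  (: keys are xs:QName values, values depend on the feature :)
  QName($ns, 'mlpf:STRIP-WHITESPACE')    : 'ignorable',
  QName($ns, 'mlpf:IGNORE-EXTERNAL-DTD') : true()
}
```

Where the global variables are in scope, a key such as `QName($ns, 'mlpf:STRIP-WHITESPACE')` can presumably be replaced by `$mlml:STRIP-WHITESPACE`.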
- QName: `mlpf:STRIP-WHITESPACE` (`$mlml:STRIP-WHITESPACE`)
- Type: `xs:string`
- Allowed values: `all`, `ignorable`, `none`
- Default: detected by underlying Xerces configuration
Configures whether text nodes that contain only whitespace characters (#x9, #xA, #xD, #x20) are treated as regular text nodes (`<text>`) or as ignorable whitespace (`<ws>`). Note that text nodes that are descendants of an element marked with `@xml:space='preserve'` are never treated as ignorable whitespace.
Meaning of the values:
- `all` → every whitespace node is treated as ignorable whitespace
- `ignorable` → only whitespace nodes that are children of an element whose DTD content model does not allow text content are treated as ignorable whitespace
- `none` → no whitespace node is treated as ignorable whitespace
- QName: `mlpf:RESOLVE-DTD-URIS` (`$mlml:RESOLVE-DTD-URIS`)
- Type: `xs:boolean`
- Default: detected by underlying Xerces configuration
NOT IMPLEMENTED YET.
- QName: `mlpf:EXPAND-DEFAULT-ATTRIBUTES` (`$mlml:EXPAND-DEFAULT-ATTRIBUTES`)
- Type: `xs:boolean`
- Default: detected by underlying Xerces configuration
NOT IMPLEMENTED YET.
- QName: `mlpf:URI_RESOLVER` (`$mlml:URI_RESOLVER`)
- Type: `function($href as xs:string, $baseUri as xs:string) as map(xs:string, xs:string)?`
- Default: built-in URI resolver based on the XPath function `unparsed-text()`
The URI resolver must be a function that expects two arguments of type `xs:string`. The first argument (`$href`) is the URI to be resolved. The second argument (`$baseUri`) is the base URI against which `$href` is resolved if it is relative.
The return value must describe the resource to be used for the requested URI. If the return value is not an empty sequence, it must be a map with the following fields:
| Key | Description | Required |
|---|---|---|
| `base-uri` | The new base URI of the returned resource | Yes |
| `content` | The string content of the returned resource | Yes |
| `mediatype` | The media type of the returned resource | No |
| `linefeed` | The line feed format. Possible values are: 'n', 'r', 'rn' | No |
If the result of the URI resolver is an empty sequence, the built-in URI resolver is used to resolve the URI request.
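The resolver contract described above could be sketched as follows. The example redirects requests for a remote location to a local mirror and leaves everything else to the built-in resolver. The URL `http://example.com/dtd/` and the directory `file:///opt/dtd/` are hypothetical placeholders, and the sketch assumes that the global variable `$mlml:URI_RESOLVER` is in scope.

```xpath
let $resolver :=
  function($href as xs:string, $baseUri as xs:string) as map(xs:string, xs:string)? {
    (: make $href absolute; assumes $baseUri is an absolute URI :)
    let $abs    := string(resolve-uri($href, $baseUri))
    let $remote := 'http://example.com/dtd/'  (: hypothetical remote location :)
    let $local  := 'file:///opt/dtd/'         (: hypothetical local mirror :)
    return
      if (starts-with($abs, $remote)) then
        let $mirror := $local || substring-after($abs, $remote)
        return map{
          'base-uri' : $abs,                   (: required :)
          'content'  : unparsed-text($mirror)  (: required :)
        }
      else
        ()  (: empty sequence: the built-in URI resolver takes over :)
  }
return map{ $mlml:URI_RESOLVER : $resolver }
```

Returning the original remote URI as `base-uri` means that relative references inside the mirrored resource are again resolved against the remote location, so they also pass through the redirect.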
- QName: `mlpf:ENTITY_RESOLVER` (`$mlml:ENTITY_RESOLVER`)
- Type: `function($publicId as xs:string?, $systemId as xs:string?) as map(xs:string, xs:string)?`
- Default: `#unset`
The entity resolver must be a function that expects two arguments of type `xs:string?`. The first argument (`$publicId`) is the public identifier, the second argument (`$systemId`) is the system identifier. The entity resolver works like the URI resolver, with the following differences:
- The entity resolver is called only for external entity references.
- The entity resolver is called first. If it returns an empty sequence, the URI resolver is called.
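A minimal sketch of an entity resolver that serves one entity from an inline string and passes everything else on to the URI resolver. The public identifier, the base URI and the DTD content are invented for illustration, and the sketch assumes that the returned map uses the same fields as the URI resolver result.

```xpath
let $resolver :=
  function($publicId as xs:string?, $systemId as xs:string?) as map(xs:string, xs:string)? {
    if ($publicId = '-//EXAMPLE//DTD Sample//EN') then  (: hypothetical public ID :)
      map{
        'base-uri' : 'http://example.com/dtd/sample.dtd',  (: hypothetical :)
        'content'  : '<!ELEMENT root (child*)> <!ELEMENT child (#PCDATA)>'
      }
    else
      ()  (: empty sequence: the URI resolver is asked next :)
  }
return map{ $mlml:ENTITY_RESOLVER : $resolver }
```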
- QName: `mlpf:IGNORE-INLINE-DTD-PIS` (`$mlml:IGNORE-INLINE-DTD-PIS`)
- Type: `xs:boolean`
- Default: detected by underlying Xerces configuration
Processing instructions may be inserted into the internal subset of a doctype declaration:

    <!DOCTYPE root [
    <?inline-dtd-pi?>
    ]>

If the value of the feature `$mlml:IGNORE-INLINE-DTD-PIS` is `true`, these PIs are ignored. Otherwise they are recognized as children of the root node.
- QName: `mlpf:CUSTOM-STRUCTUR-ELEMENTS` (`$mlml:CUSTOM-STRUCTUR-ELEMENTS`)
- Type: `function($result as element(mlml:document)) as element()*`
- Default: `#unset`
If set, the value must be a function that expects a provisional result of the parsing process as its first argument. The function may return additional XmlML elements whose content model should be treated as structured (not mixed content). This feature is used to override the default detection for whitespace stripping (see `$mlml:STRIP-WHITESPACE`).
Note: the returned XmlML elements must be the original instances of the elements. Copies of elements are ignored.
- QName: `mlpf:IGNORE-EXTERNAL-DTD` (`$mlml:IGNORE-EXTERNAL-DTD`)
- Type: `xs:boolean`
- Default: `false`
Ignores the reference to an external DTD in the Doctype declaration.
- QName: `mlpf:IGNORE-INLINE-DTD` (`$mlml:IGNORE-INLINE-DTD`)
- Type: `xs:boolean`
- Default: `false`
Ignores the internal subset of a Doctype declaration.
- QName: `mlpf:IGNORE-UNDECLARED-ENTITIES` (`$mlml:IGNORE-UNDECLARED-ENTITIES`)
- Type: `xs:boolean`
- Default: `false`
If `false`, a named entity reference without an available declaration causes a parsing error. If `true`, no parsing error is raised and an empty string is used as the value of the entity.
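The three DTD-related switches above might be combined in one config map as sketched below. The sketch assumes that the global variables are in scope and hold the corresponding feature QNames. The chosen values are only an example.

```xpath
map{
  (: skip the external DTD but still read the internal subset :)
  $mlml:IGNORE-EXTERNAL-DTD        : true(),
  $mlml:IGNORE-INLINE-DTD          : false(),
  (: undeclared entities resolve to an empty string instead of failing :)
  $mlml:IGNORE-UNDECLARED-ENTITIES : true()
}
```

Ignoring the external DTD can leave entities undeclared, which is why the last switch is enabled here as well.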
- QName: `mlpf:PARSER-LOG-LEVEL` (`$mlml:PARSER-LOG-LEVEL`)
- Type: `xs:string`
- Allowed values (corresponding global variables in brackets):
  - `VERBOSE` (`$mlml:LOG-LEVEL-VERBOSE`)
  - `DEBUG` (`$mlml:LOG-LEVEL-DEBUG`)
  - `WARNING` (`$mlml:LOG-LEVEL-WARN`)
  - `ERROR` (`$mlml:LOG-LEVEL-ERROR`)
Log level of the parser log messages.
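A short sketch of setting the log level through the global variables listed above, assuming they are in scope and that `$mlml:LOG-LEVEL-WARN` holds the string `WARNING`:

```xpath
(: restrict parser log output to the WARNING level :)
map{ $mlml:PARSER-LOG-LEVEL : $mlml:LOG-LEVEL-WARN }
```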