[WIP] new config system, 1.2 tagset support #700
so that other classes inheriting from it can use them
* Move methods from SafeConstructor to BaseConstructor
* Move methods from SafeRepresenter to BaseRepresenter
More and more YAML libraries are implementing YAML 1.2, either new ones
simply starting with 1.2 or older ones adding support for it.
Although YAML 1.2 also changed the syntax, this pull request is about the
schema changes.
As an example, in 1.1, Y, yes, NO, on etc. are resolved as booleans.
This sounds convenient, but also means that all these 22 different strings must
be quoted if they are not meant as booleans. A very common obstacle is the
country code for Norway, NO ("Norway Problem").
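To make the size of that list concrete, here is a small standard-library sketch that enumerates the boolean spellings from the YAML 1.1 bool type definition. It is not how PyYAML resolves scalars; the names `YAML11_BOOLS` and `needs_quoting` are made up for illustration.

```python
# Boolean spellings listed by the YAML 1.1 bool type: y/n, yes/no, true/false, on/off.
_WORDS = ["yes", "no", "true", "false", "on", "off"]

# The single-letter forms only come in lower and upper case.
YAML11_BOOLS = {"y", "Y", "n", "N"}
for word in _WORDS:
    # Each full word is accepted in lower, Capitalized and UPPER case.
    YAML11_BOOLS.update({word, word.capitalize(), word.upper()})


def needs_quoting(text):
    """Return True if a plain scalar would be read as a boolean under YAML 1.1."""
    return text in YAML11_BOOLS


print(len(YAML11_BOOLS))
print(needs_quoting("NO"), needs_quoting("Norway"))
```

The country code `NO` lands in the set, while the full word `Norway` does not, which is the whole "Norway Problem" in two lookups.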
In YAML 1.2 this was improved by reducing the list of boolean representations.
Other types have been improved as well. The 1.1 regular expression for float allows
. and ._ as floats, although there isn't a single digit in these strings.
While the 1.2 Core Schema, the recommended default for 1.2, still allows a few
variations (true, True and TRUE, etc.), the 1.2 JSON Schema is there to match
JSON behaviour regarding types, so it allows only true and false.
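As a rough illustration of the difference, the sketch below checks a few plain scalars against the boolean and float patterns from the YAML 1.2 spec's tag resolution tables. It uses only the `re` module and does not reflect how this pull request wires the patterns into PyYAML; the names `CORE_BOOL`, `JSON_BOOL`, `CORE_FLOAT` and `matches` are illustrative, and the `.inf`/`.nan` forms are left out for brevity.

```python
import re

# Plain-scalar patterns taken from the YAML 1.2 Core and JSON schema tables.
CORE_BOOL = re.compile(r"true|True|TRUE|false|False|FALSE")
JSON_BOOL = re.compile(r"true|false")
# Note: this float pattern also matches plain integers; a real resolver
# would try the int pattern first.
CORE_FLOAT = re.compile(r"[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?")


def matches(pattern, text):
    """Return True if the whole scalar matches the pattern."""
    return pattern.fullmatch(text) is not None


for scalar in ["true", "TRUE", "yes", ".", "._", ".5", "1e27"]:
    print(
        repr(scalar),
        "core bool:", matches(CORE_BOOL, scalar),
        "json bool:", matches(JSON_BOOL, scalar),
        "core float:", matches(CORE_FLOAT, scalar),
    )
```

Under these patterns `yes` is no longer a boolean, `TRUE` is a boolean only in the Core Schema, and the digit-free strings `.` and `._` are not floats.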
Note that this implementation of the YAML JSON Schema might not behave exactly
as the spec defines it: per the spec, any plain scalar that does not resolve to
a number, null or a boolean would be an error.
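The strict-versus-lenient distinction can be sketched with a tiny standalone resolver. This is an assumption-laden illustration built on the JSON schema patterns from the YAML 1.2 spec, not code from this pull request; `resolve_json_plain` and its `strict` flag are hypothetical names.

```python
import re

# Plain-scalar patterns from the YAML 1.2 JSON schema table.
_JSON_INT = re.compile(r"-?(0|[1-9][0-9]*)")
_JSON_FLOAT = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]*)?([eE][-+]?[0-9]+)?")


def resolve_json_plain(text, strict=True):
    """Resolve one plain scalar the way the JSON schema table describes.

    With strict=True an unresolvable scalar raises ValueError (spec behaviour);
    with strict=False it falls back to a string (the lenient variant).
    """
    if text == "null":
        return None
    if text in ("true", "false"):
        return text == "true"
    if _JSON_INT.fullmatch(text):
        return int(text)
    if _JSON_FLOAT.fullmatch(text):
        return float(text)
    if strict:
        raise ValueError(f"plain scalar {text!r} does not resolve in the JSON schema")
    return text


print(resolve_json_plain("23"), resolve_json_plain("true"), resolve_json_plain("null"))
print(resolve_json_plain("yes", strict=False))
```

Calling `resolve_json_plain("yes")` with the default `strict=True` raises `ValueError`, which corresponds to the spec behaviour described above.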
Short usage example:
class MyCoreLoader(yaml.BaseLoader): pass
class MyCoreDumper(yaml.CommonDumper): pass
MyCoreLoader.init_tags('core')
MyCoreDumper.init_tags('core')
data = yaml.load(input, Loader=MyCoreLoader)
output = yaml.dump(data, Dumper=MyCoreDumper)
Detailed example code to play with:
import yaml
class MyCoreLoader(yaml.BaseLoader): pass
MyCoreLoader.init_tags('core')
class MyJSONLoader(yaml.BaseLoader): pass
MyJSONLoader.init_tags('json')
class MyCoreDumper(yaml.CommonDumper): pass
MyCoreDumper.init_tags('core')
class MyJSONDumper(yaml.CommonDumper): pass
MyJSONDumper.init_tags('json')
input = """
- TRUE
- yes
- ~
- true
#- .inf
#- 23
#- #empty
#- !!str #empty
#- 010
#- 0o10
#- 0b100
#- 0x20
#- -0x20
#- 1_000
#- 3:14
#- 0011
#- +0
#- 0001.23
#- !!str +0.3e3
#- +0.3e3
#- &x foo
#- *x
#- 1e27
#- 1x+27
"""
print('--------------------------------------------- BaseLoader')
data = yaml.load(input, Loader=yaml.BaseLoader)
print(data)
print('--------------------------------------------- SafeLoader')
data = yaml.load(input, Loader=yaml.SafeLoader)
print(data)
print('--------------------------------------------- CoreLoader')
data = yaml.load(input, Loader=MyCoreLoader)
print(data)
print('--------------------------------------------- JSONLoader')
data = yaml.load(input, Loader=MyJSONLoader)
print(data)
print('--------------------------------------------- SafeDumper')
out = yaml.dump(data, Dumper=yaml.SafeDumper)
print(out)
print('--------------------------------------------- MyCoreDumper')
out = yaml.dump(data, Dumper=MyCoreDumper)
print(out)
print('--------------------------------------------- MyJSONDumper')
out = yaml.dump(data, Dumper=MyJSONDumper)
print(out)
This way people can experiment with it, we don't promise that this wrapper will
stay around forever, and the newly created classes CommonDumper/CommonRepresenter
aren't exposed.
MyCoreLoader = yaml.experimental_12_Core_loader()
data = yaml.load(input, Loader=MyCoreLoader)
MyCoreDumper = yaml.experimental_12_Core_dumper()
out = yaml.dump(data, Dumper=MyCoreDumper)
* Loader/Dumper config mixins to create dynamic types and configure them at instantiation with generated partials
* New `FastestBaseLoader`/`FastestBaseDumper` base classes to auto-select the C-backed impl if available
# preserve wrapped config defaults for values where we didn't get a default
# FIXME: share this code with the one in __init__.dump_all (and implement on others)
dumper_init_kwargs = dict(
this would probably be better done with inspect.signature(); then it's 100% dynamic off whatever init args the current class accepts
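A minimal sketch of what that suggestion could look like, using only the standard library. `DemoDumper`, `init_defaults` and `merge_init_kwargs` are hypothetical names, and treating `None` as "no value given" is an assumption made for the example, not something taken from the pull request.

```python
import inspect


def init_defaults(cls):
    """Collect the keyword defaults declared by cls.__init__."""
    params = inspect.signature(cls.__init__).parameters
    return {
        name: param.default
        for name, param in params.items()
        if param.default is not inspect.Parameter.empty
    }


def merge_init_kwargs(cls, overrides):
    """Start from the declared defaults and apply only known, non-None overrides."""
    merged = init_defaults(cls)
    for name, value in overrides.items():
        # Assumption: None means "caller did not specify", so the default wins.
        if name in merged and value is not None:
            merged[name] = value
    return merged


class DemoDumper:
    """Hypothetical stand-in for a dumper class; not part of PyYAML."""

    def __init__(self, stream, indent=2, width=80):
        self.stream, self.indent, self.width = stream, indent, width


kwargs = merge_init_kwargs(DemoDumper, {"indent": 4, "width": None, "sort_keys": True})
print(kwargs)  # {'indent': 4, 'width': 80}
```

Because the accepted names come from the signature itself, a subclass that adds or drops an `__init__` argument needs no matching change in the merging code.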
This is a great start, Matt! I really like the approach overall of generating a configured subclass around the existing architecture. The new

While we can add this method to the Loader and Dumper classes I think that:

I think that's a super clean top-level way of using the new ideas you've added. Also, the

I assume you can do this:

All this is to say, I really like where you have gone so far, at least as I understand it. I'd just like to see the common usage idioms be even cleaner. I'll try to write up a file of all the possible usages so we can discuss them among the release team.
Any updates on this?
Here's a quick and dirty crack at a more broadly-encompassing dynamic config system like we've talked about before...
* `TagSet` type with some pre-defined instances for common 1.2 schemas
* `yaml.load()`/`yaml.dump()` et al to get the customized behavior, and the customized subclasses are GC'd once dropped on the floor.
* `FastestBaseLoader` and `FastestBaseDumper` helper classes backed by libyaml if available, and the pure-Python version if not

We can also wrap up the existing individual types that make up the tagsets so that users can pick and choose, and combine partial schemas at will without having to redefine everything. Once that's all done, we should actually be able to completely redefine all the existing currently unrelated Loader and Dumper subclasses using this, and they'll actually be related in the class hierarchy instead of just sharing mixins.
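The general idea of generating a configured subclass on the fly can be sketched in plain Python. This is only an illustration of the technique under stated assumptions, not the implementation in this pull request: `DemoLoader`, its `tagset` attribute and the `configured` classmethod are hypothetical names.

```python
class DemoLoader:
    """Hypothetical stand-in for a loader class; not part of PyYAML."""

    tagset = "yaml11"

    def __init__(self, stream):
        self.stream = stream

    @classmethod
    def configured(cls, **settings):
        # Build a throwaway subclass whose class attributes carry the settings.
        # The base class stays untouched, and the subclass remains related to
        # it in the class hierarchy.
        return type("Configured" + cls.__name__, (cls,), dict(settings))


JSONLoader = DemoLoader.configured(tagset="json")
loader = JSONLoader("[]")
print(loader.tagset, issubclass(JSONLoader, DemoLoader), DemoLoader.tagset)
```

Since the result is an ordinary class, it can be handed to anything that expects a loader class, and nothing holds on to it once the caller drops its reference.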
A few examples of what's in and working:
json tagset on fastest available loader
minimal custom tagset
override dumper behavior so it doesn't have to be specified every call to `dump`
Still TODO: