I tried to import some records finalised last week (ins763822, ins791177, ins919778) whose YAML data files contain trailing tab characters. Loading these files raised exceptions like `yaml.scanner.ScannerError: while scanning for the next token found character '\t' that cannot start any token`. This seems to be a long-standing bug in PyYAML (yaml/pyyaml#306 and yaml/pyyaml#450). The problem is avoided by using LibYAML, for example `yaml.CSafeLoader` instead of `yaml.SafeLoader`. With PyYAML 5.3.1 on my laptop (macOS), the LibYAML extension is not automatically included, but upgrading to PyYAML 5.4.1 resolves this (yaml/pyyaml#407). The HEPData Docker images (Linux) appear to use LibYAML with both PyYAML 5.3.1 and PyYAML 5.4.1, so this might just be a macOS problem. It can be solved by pinning PyYAML 5.4.1 in `requirements.txt`. We should also add a test that loads YAML with a trailing tab, to check the PyYAML installation.
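A minimal sketch of what such a test could look like, assuming pytest-style test functions. The sample document and the helper names (`TRAILING_TAB_DOC`, `libyaml_loader`, `parses_trailing_tab`) are made up for illustration and are not taken from the HEPData code base; only `yaml.load`, `yaml.SafeLoader`, `yaml.CSafeLoader` and `yaml.YAMLError` are PyYAML names.

```python
import yaml

# Hypothetical sample: a mapping whose value is followed by a trailing tab.
TRAILING_TAB_DOC = "key: value\t\n"


def libyaml_loader():
    """Return yaml.CSafeLoader if the LibYAML extension is present, else None."""
    # PyYAML only exposes the C* classes when it was built with LibYAML.
    return getattr(yaml, "CSafeLoader", None)


def parses_trailing_tab(loader):
    """Report whether the given loader class accepts the sample document."""
    try:
        yaml.load(TRAILING_TAB_DOC, Loader=loader)
    except yaml.YAMLError:
        # ScannerError is a subclass of YAMLError.
        return False
    return True


def test_trailing_tab():
    """Fail if this PyYAML installation cannot load a trailing tab."""
    loader = libyaml_loader()
    assert loader is not None, "PyYAML was installed without LibYAML"
    assert yaml.load(TRAILING_TAB_DOC, Loader=loader) == {"key": "value"}
```

Calling `parses_trailing_tab(yaml.SafeLoader)` on an affected machine should show whether the pure-Python loader trips over the tab there, which may help confirm that the failure is specific to installations without LibYAML.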