Dependency version check fails for tokenizers #11107

@guyrosin

Environment info

  • transformers version: 4.5.0
  • Platform: Linux-4.15.0-134-generic-x86_64-with-glibc2.10
  • Python version: 3.8.5
  • PyTorch version (GPU?): 1.8.1 (False)
  • Tensorflow version (GPU?): 2.4.0 (False)
  • Using GPU in script?: N/A
  • Using distributed or parallel set-up in script?: N/A
  • tokenizers version: 0.10.2 (checked also 0.10.1)

Who can help

@stas00, @sgugger

Information

When importing transformers, the new dependency version check (#11061) fails for the tokenizers library:
importlib.metadata.version('tokenizers') returns None instead of the version string, and that None is then passed to packaging's version.parse, which raises a TypeError.

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

To reproduce

Steps to reproduce the behavior:

  1. Run import transformers in a Python interpreter. It fails with the following traceback:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/guyrosin/miniconda3/envs/pt/lib/python3.8/site-packages/transformers/__init__.py", line 43, in <module>
    from . import dependency_versions_check
  File "/home/guyrosin/miniconda3/envs/pt/lib/python3.8/site-packages/transformers/dependency_versions_check.py", line 41, in <module>
    require_version_core(deps[pkg])
  File "/home/guyrosin/miniconda3/envs/pt/lib/python3.8/site-packages/transformers/utils/versions.py", line 101, in require_version_core
    return require_version(requirement, hint)
  File "/home/guyrosin/miniconda3/envs/pt/lib/python3.8/site-packages/transformers/utils/versions.py", line 92, in require_version
    if want_ver is not None and not ops[op](version.parse(got_ver), version.parse(want_ver)):
  File "/home/guyrosin/miniconda3/envs/pt/lib/python3.8/site-packages/packaging/version.py", line 57, in parse
    return Version(version)
  File "/home/guyrosin/miniconda3/envs/pt/lib/python3.8/site-packages/packaging/version.py", line 296, in __init__
    match = self._regex.search(version)
TypeError: expected string or bytes-like object

The root problem seems to be this:

from importlib.metadata import version
version('tokenizers') # returns None
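
A version check can avoid the opaque TypeError above by validating the lookup result before parsing it. The following is only a minimal sketch: the helper name get_installed_version and its error messages are hypothetical, and this is not the code in transformers/utils/versions.py.

```python
from importlib.metadata import PackageNotFoundError, version


def get_installed_version(pkg: str) -> str:
    """Return the installed version string of pkg, or raise a descriptive error.

    Hypothetical helper for illustration; not part of transformers.
    """
    try:
        got_ver = version(pkg)
    except PackageNotFoundError as err:
        raise RuntimeError(f"{pkg} is not installed") from err
    except KeyError:
        # Assumption: some newer Python versions raise KeyError instead of
        # returning None when the metadata has no Version field.
        got_ver = None

    if not isinstance(got_ver, str):
        # This is the situation described in this report: the lookup
        # succeeded but yielded no usable version string.
        raise RuntimeError(
            f"Found metadata for {pkg} but no version string (got {got_ver!r}); "
            "the installed package metadata may be incomplete or corrupted"
        )
    return got_ver
```

With a guard like this, the failure would name the affected package instead of surfacing deep inside packaging/version.py.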

Expected behavior

importlib.metadata.version('tokenizers') should return its version string.
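
One possible cause, which this report does not confirm, is a leftover or incomplete metadata directory for the package (for example a .dist-info directory without a METADATA file) that is found before the valid one. The sketch below lists candidate metadata directories on sys.path so they can be inspected by hand. The function name find_metadata_dirs is hypothetical, and the name matching is deliberately simple.

```python
import sys
from pathlib import Path


def find_metadata_dirs(pkg: str):
    """Return (path, has_metadata) pairs for metadata entries matching pkg.

    Hypothetical diagnostic helper. has_metadata is True when the expected
    metadata file (METADATA or PKG-INFO) is present.
    """
    # Metadata directory names use underscores where the project name may use hyphens.
    prefix = pkg.lower().replace("-", "_") + "-"
    found = []
    for entry in sys.path:
        base = Path(entry)
        if not base.is_dir():
            continue
        try:
            children = sorted(base.iterdir())
        except OSError:
            continue  # unreadable directory; skip it
        for child in children:
            name = child.name.lower()
            if not name.startswith(prefix):
                continue
            if name.endswith(".dist-info"):
                found.append((child, (child / "METADATA").is_file()))
            elif name.endswith(".egg-info"):
                # An .egg-info entry may be a directory holding PKG-INFO or a single file.
                ok = (child / "PKG-INFO").is_file() if child.is_dir() else child.is_file()
                found.append((child, ok))
    return found


if __name__ == "__main__":
    for path, ok in find_metadata_dirs("tokenizers"):
        print(("ok     " if ok else "BROKEN ") + str(path))
```

If more than one entry shows up, or one is flagged as broken, reinstalling the package in a clean environment is a reasonable next step to try.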
