Error trying a push on HDFS remote #3418

@anderl80

DVC version: 0.86.5

This suddenly started happening from one day to the next; the same push worked the day before.
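The traceback in the Error section below ends in pyarrow running `hadoop classpath --glob`, which exits with status 1. As a way to narrow this down outside of DVC, here is a minimal sketch that runs the same command and reports its exit status and stderr. The command tuple is copied from the traceback; the helper name `probe_classpath` is hypothetical and not part of DVC or pyarrow.

```python
import subprocess

# Command copied from the traceback; everything else here is illustrative.
HADOOP_CLASSPATH_CMD = ("hadoop", "classpath", "--glob")


def probe_classpath(cmd=HADOOP_CLASSPATH_CMD):
    """Run cmd and return (returncode, stdout, stderr) without raising on failure."""
    try:
        proc = subprocess.run(
            list(cmd),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # text mode; also works on Python 3.6
        )
    except FileNotFoundError as exc:
        # The executable could not be found on PATH in this environment.
        return None, "", str(exc)
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    code, out, err = probe_classpath()
    print("exit status:", code)
    print("stderr:", err.strip() or "<empty>")
```

Running this inside the same conda environment as `dvc` should show whether the `hadoop` command itself fails there, independent of DVC.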

# Error

```bash
(project) [username@hostname:~/project/assets/models]dvc push -r hdfscache -v
DEBUG: Trying to spawn '['/home/username/.conda/envs/project/bin/python', '/home/username/.conda/envs/project/bin/dvc', 'daemon', '-q', 'updater']'
DEBUG: Spawned '['/home/username/.conda/envs/project/bin/python', '/home/username/.conda/envs/project/bin/dvc', 'daemon', '-q', 'updater']'
DEBUG: PRAGMA user_version;                                             
DEBUG: fetched: [(3,)]
DEBUG: CREATE TABLE IF NOT EXISTS state (inode INTEGER PRIMARY KEY, mtime TEXT NOT NULL, size TEXT NOT NULL, md5 TEXT NOT NULL, timestamp TEXT NOT NULL)
DEBUG: CREATE TABLE IF NOT EXISTS state_info (count INTEGER)
DEBUG: CREATE TABLE IF NOT EXISTS link_state (path TEXT PRIMARY KEY, inode INTEGER NOT NULL, mtime TEXT NOT NULL)
DEBUG: INSERT OR IGNORE INTO state_info (count) SELECT 0 WHERE NOT EXISTS (SELECT * FROM state_info)
DEBUG: PRAGMA user_version = 3;
DEBUG: Preparing to upload data to 'hdfs://default/tmp/platform_raw/cache'
DEBUG: Preparing to collect status from hdfs://default/tmp/platform_raw/cache
DEBUG: Collecting information from local cache...
DEBUG: Path ../../.dvc/cache/a4/abf1123fc98c6f70d057953c0ef1a0 inode 1994063                                                                                  
DEBUG: SELECT mtime, size, md5, timestamp from state WHERE inode=?                                                                                            
DEBUG: fetched: [('1582799936985768192', '1537776', 'a4abf1123fc98c6f70d057953c0ef1a0', '1582818347397361664')]                                               
DEBUG: UPDATE state SET timestamp = ? WHERE inode = ?                                                                                                         
DEBUG: cache '../../.dvc/cache/a4/abf1123fc98c6f70d057953c0ef1a0' expected 'a4abf1123fc98c6f70d057953c0ef1a0' actual 'a4abf1123fc98c6f70d057953c0ef1a0'       
DEBUG: Path ../../.dvc/cache/c5/b03cb15aad7772ef895e87989d8888 inode 10893160                                                                                 
DEBUG: SELECT mtime, size, md5, timestamp from state WHERE inode=?                                                                                            
DEBUG: fetched: [('1582810084441739776', '342631', 'c5b03cb15aad7772ef895e87989d8888', '1582818347406528768')]                                                
DEBUG: UPDATE state SET timestamp = ? WHERE inode = ?                                                                                                         
DEBUG: cache '../../.dvc/cache/c5/b03cb15aad7772ef895e87989d8888' expected 'c5b03cb15aad7772ef895e87989d8888' actual 'c5b03cb15aad7772ef895e87989d8888'       
DEBUG: Path ../../.dvc/cache/84/b5dddbef9a299890a89a9fc8f2c31f inode 1446805                                                                                  
DEBUG: SELECT mtime, size, md5, timestamp from state WHERE inode=?                                                                                            
DEBUG: fetched: [('1582553549656297472', '130584165', '84b5dddbef9a299890a89a9fc8f2c31f', '1582818347416467712')]                                             
DEBUG: UPDATE state SET timestamp = ? WHERE inode = ?                                                                                                         
DEBUG: cache '../../.dvc/cache/84/b5dddbef9a299890a89a9fc8f2c31f' expected '84b5dddbef9a299890a89a9fc8f2c31f' actual '84b5dddbef9a299890a89a9fc8f2c31f'       
DEBUG: Path ../../.dvc/cache/1b/497e74d864c34c51846f62312037da inode 1446804                                                                                  
DEBUG: SELECT mtime, size, md5, timestamp from state WHERE inode=?                                                                                            
DEBUG: fetched: [('1582553533029446656', '1598211', '1b497e74d864c34c51846f62312037da', '1582818347425707008')]                                               
DEBUG: UPDATE state SET timestamp = ? WHERE inode = ?                                                                                                         
DEBUG: cache '../../.dvc/cache/1b/497e74d864c34c51846f62312037da' expected '1b497e74d864c34c51846f62312037da' actual '1b497e74d864c34c51846f62312037da'       
DEBUG: Collecting information from remote cache...                                                                                                            
  0% Querying cache in hdfs://default/tmp/platform_raw/cache|          |0/4 [00:00<?, ?file/s]
20/02/27 16:46:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
DEBUG: SELECT count from state_info WHERE rowid=?                                                                                                             
DEBUG: fetched: [(27,)]
DEBUG: UPDATE state_info SET count = ? WHERE rowid = ?
ERROR: unexpected error - Command '('hadoop', 'classpath', '--glob')' returned non-zero exit status 1.
------------------------------------------------------------
Traceback (most recent call last):
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/dvc/remote/pool.py", line 51, in get_connection
    return self._conns.popleft()
IndexError: pop from an empty deque
 
During handling of the above exception, another exception occurred:
 
Traceback (most recent call last):
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/dvc/main.py", line 49, in main
    ret = cmd.run()
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/dvc/command/data_sync.py", line 49, in run
    recursive=self.args.recursive,
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/dvc/repo/__init__.py", line 31, in wrapper
    ret = f(repo, *args, **kwargs)
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/dvc/repo/push.py", line 25, in push
    return self.cloud.push(used, jobs, remote=remote)
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/dvc/data_cloud.py", line 81, in push
    show_checksums=show_checksums,
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/dvc/remote/local.py", line 385, in push
    download=False,
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/dvc/remote/local.py", line 358, in _process
    download=download,
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/dvc/remote/local.py", line 279, in status
    md5s, jobs=jobs, name=str(remote.path_info)
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/dvc/remote/base.py", line 849, in cache_exists
    ret = list(itertools.compress(checksums, in_remote))
  File "/home/username/.conda/envs/project/lib/python3.6/concurrent/futures/_base.py", line 586, in result_iterator
    yield fs.pop().result()
  File "/home/username/.conda/envs/project/lib/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/home/username/.conda/envs/project/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/home/username/.conda/envs/project/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/dvc/remote/base.py", line 842, in exists_with_progress
    ret = self.exists(path_info)
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/dvc/remote/hdfs.py", line 122, in exists
    with self.hdfs(path_info) as hdfs:
  File "/home/username/.conda/envs/project/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/dvc/remote/pool.py", line 11, in get_connection
    conn = pool.get_connection()
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/dvc/remote/pool.py", line 53, in get_connection
    return self._conn_func(*self._conn_args, **self._conn_kwargs)
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/pyarrow/hdfs.py", line 211, in connect
    extra_conf=extra_conf)
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/pyarrow/hdfs.py", line 36, in __init__
    _maybe_set_hadoop_classpath()
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/pyarrow/hdfs.py", line 136, in _maybe_set_hadoop_classpath
    classpath = _hadoop_classpath_glob('hadoop')
  File "/home/username/.conda/envs/project/lib/python3.6/site-packages/pyarrow/hdfs.py", line 161, in _hadoop_classpath_glob
    return subprocess.check_output(hadoop_classpath_args)
  File "/home/username/.conda/envs/project/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/home/username/.conda/envs/project/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '('hadoop', 'classpath', '--glob')' returned non-zero exit status 1.
------------------------------------------------------------
```
