vecino docker run seems to fail before finishing #8

@billmetangmo

Expected behavior

I expect the output of the vecino Docker similarity search for Levis0045/MetaLex to be:

docker run -it --rm srcd/vecino https://github.com/Levis0045/MetaLex
                                    github-repo1	x.XX
                                    github-repo2	x.XX
                                    github-repo3	x.XX

Actual behavior

It seems to work fine at the beginning:

INFO:bblfsh:Detected bblfsh server: 172.17.0.1:9432
INFO:enry:Fetching https://api.github.com/repos/src-d/enry/releases/latest
INFO:enry:Latest release resolved to enry_v1.6.3_linux_amd64.tar.gz
INFO:enry:Fetching https://github.com/src-d/enry/releases/download/v1.6.3/enry_v1.6.3_linux_amd64.tar.gz
INFO:enry:Extracting the binary
INFO:enry:Downloaded /enry
INFO:gcs-backend:Fetching https://storage.googleapis.com/models.cdn.sourced.tech/index.json?ignoreCache=1...
INFO:gcs-backend:Fetching https://storage.googleapis.com/models.cdn.sourced.tech/models%2Fid2vec%2F92609e70-f79c-46b5-8419-55726e873cfc.asdf...
[################################] 17044/17044 - 00:12:06
INFO:id2vec:Reading /root/.source{d}/id2vec/default.asdf...
INFO:id2vec:Building the token index...
INFO:similar_repos:Loaded id2vec model: {'created_at': datetime.datetime(2017, 6, 18, 17, 37, 6, 255615),
 'dependencies': [],
 'model': 'id2vec',
 'uuid': '92609e70-f79c-46b5-8419-55726e873cfc',
 'version': [1, 0, 0]}
Shape: (999424, 300)
First 10 words: ['get', 'name', 'type', 'string', 'class', 'set', 'data', 'value', 'self', 'test']
INFO:gcs-backend:Fetching https://storage.googleapis.com/models.cdn.sourced.tech/index.json?ignoreCache=1...
INFO:gcs-backend:Fetching https://storage.googleapis.com/models.cdn.sourced.tech/models%2Fdocfreq%2Ff64bacd4-67fb-4c64-8382-399a8e7db52a.asdf...
[################################] 372/372 - 00:00:17
INFO:docfreq:Reading /root/.source{d}/docfreq/default.asdf...
INFO:docfreq:Building the docfreq dictionary...
INFO:docfreq:Pruning to min 20 occurrences
INFO:similar_repos:Loaded document frequencies: {'created_at': datetime.datetime(2017, 6, 19, 9, 59, 14, 766638),
 'dependencies': [],
 'model': 'docfreq',
 'uuid': 'f64bacd4-67fb-4c64-8382-399a8e7db52a',
 'version': [1, 0, 0]}
Number of words: 416370
First 10 words: ['aaa', 'aaaa', 'aaaaa', 'aaaaaa', 'aaaaaaa', 'aaaaaaaa', 'aaaaaaaaa', 'aaaaaaaaaa', 'aaaaaaaaaaa', 'aaaaaaaaaaaa']
Number of documents: 112273
INFO:gcs-backend:Fetching https://storage.googleapis.com/models.cdn.sourced.tech/index.json?ignoreCache=1...
INFO:gcs-backend:Fetching https://storage.googleapis.com/models.cdn.sourced.tech/models%2Fnbow%2F1e3da42a-28b6-4b33-94a2-a5671f4102f4.asdf...
[################################] 5672/5672 - 00:05:20
INFO:nbow:Reading /root/.source{d}/nbow/default.asdf...
INFO:nbow:Building the repository names mapping...
INFO:similar_repos:Loaded nBOW model: {'created_at': datetime.datetime(2017, 6, 19, 9, 16, 8, 942880),
 'dependencies': [{'created_at': datetime.datetime(2017, 6, 18, 17, 37, 6, 255615),
                   'dependencies': [],
                   'model': 'id2vec',
                   'uuid': '92609e70-f79c-46b5-8419-55726e873cfc',
                   'version': [1, 0, 0]},
                  {'created_at': datetime.datetime(2017, 6, 19, 9, 59, 14, 766638),
                   'dependencies': [],
                   'model': 'docfreq',
                   'uuid': 'f64bacd4-67fb-4c64-8382-399a8e7db52a',
                   'version': [1, 0, 0]}],
 'model': 'nbow',
 'uuid': '1e3da42a-28b6-4b33-94a2-a5671f4102f4',
 'version': [1, 0, 0]}
Shape: (112273, 999424)
First 10 repos: ['ikizir/HohhaDynamicXOR', 'ditesh/node-poplib', 'Code52/MarkPadRT', 'wp-shortcake/shortcake', 'capaj/Moonridge', 'HugoGiraudel/hugogiraudel.github.com', 'crosswalk-project/crosswalk-website', 'apache/parquet-mr', 'dciccale/kimbo.js', 'processone/oneteam']
INFO:bblfsh:Detected bblfsh server: 172.17.0.1:9432
INFO:similar_repos:Creating the WMD engine...
INFO:repo_cloner:Cloning from https://github.com/Levis0045/MetaLex...
INFO:repo_cloner:Finished cloning https://github.com/Levis0045/MetaLex
INFO:repo_cloner:Classifying the files...
INFO:repo_cloner:Result: {'HTML': 1, 'CSS': 1, 'Shell': 1, 'Python': 20, 'Text': 5}
INFO:repo2nbow:Fetching and processing UASTs...

Then it starts to fail:

ERROR:repo2nbow:bblfsh: RpcError on /tmp/repo2-vb2b74e1/Levis0045&MetaLex_github.com/metalex/api.py: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, Connect Failed)>
WARNING:repo2nbow:/tmp/repo2-vb2b74e1/Levis0045&MetaLex_github.com/metalex/api.py was skipped
ERROR:repo2nbow:bblfsh: RpcError on /tmp/repo2-vb2b74e1/Levis0045&MetaLex_github.com/metalex/logs/__init__.py: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, Connect Failed)>
WARNING:repo2nbow:/tmp/repo2-vb2b74e1/Levis0045&MetaLex_github.com/metalex/logs/__init__.py was skipped
INFO:repo2nbow:https://github.com/Levis0045/MetaLex pending tasks: 19
.........
INFO:repo2nbow:https://github.com/Levis0045/MetaLex pending tasks: 0
Traceback (most recent call last):
  File "/usr/local/bin/vecino", line 11, in <module>
    load_entry_point('vecino==0.1.6a0', 'console_scripts', 'vecino')()
  File "/usr/local/lib/python3.5/dist-packages/vecino/__main__.py", line 76, in main
    max_time=args.max_time, skipped_stop=args.skipped_stop)
  File "/usr/local/lib/python3.5/dist-packages/vecino/similar_repositories.py", line 80, in query
    neighbours = self._query_foreign(url_or_path_or_name, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/vecino/similar_repositories.py", line 108, in _query_foreign
    return self._wmd.nearest_neighbors((words, weights), **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/wmd/__init__.py", line 507, in nearest_neighbors
    "Too little vocabulary for %s: %d" % (index, len(words)))
ValueError: Too little vocabulary for None: 0
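
The RpcError lines above report "Connect Failed", so one thing worth checking is whether the detected bblfsh address is reachable at all from where vecino runs. The following is only a minimal sketch for that check: it assumes the address from the "Detected bblfsh server" log line (172.17.0.1:9432), and the helper name can_connect is hypothetical, not part of vecino or bblfsh. It tests plain TCP reachability only, not whether the gRPC service itself is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timeout) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumption: address copied from the "Detected bblfsh server" log line.
    host, port = "172.17.0.1", 9432
    status = "reachable" if can_connect(host, port) else "NOT reachable"
    print("bblfsh at %s:%d is %s" % (host, port, status))
```

Running this inside the vecino container and again on the host would show whether the failure is specific to the container's view of the network.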

Steps to reproduce the behavior

docker build -t srcd/vecino .
docker run -d --privileged -p 9432:9432 --name bblfshd bblfsh/bblfshd
docker exec -it bblfshd bblfshctl driver install --all
docker run -it --rm srcd/vecino https://github.com/Levis0045/MetaLex

Any advice?
