CNN_Text
Environment setup:
1. Open a terminal.
2. Run `conda env list` to check the existing virtual environments; if there is no ColNet environment, go to step 3, otherwise go to step 4.
3.1. conda create -n ColNet python=2.7
3.2. conda activate ColNet
3.3. pip install requests sparql-client==2.6.0 sklearn
3.4. conda install tensorflow-gpu=1.9.0 gensim=3.4.0 Pattern
3.5. Start `python` and run `import tensorflow as tf; print(tf.test.is_gpu_available())` to check that the GPU is configured correctly.
4. conda activate ColNet
Our software settings:
python 2.7
tensorflow 1.9.0
gensim 3.4.0
sparql-client 2.6.0
sklearn 0.19.2
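A quick sanity check for the environment (run inside the ColNet env); the expected versions are the ones listed above:

```python
# Sanity-check the ColNet environment: versions should match the list above.
import tensorflow as tf
import gensim
import sklearn

print(tf.__version__)              # expect 1.9.0
print(gensim.__version__)          # expect 3.4.0
print(sklearn.__version__)         # expect 0.19.2
print(tf.test.is_gpu_available())  # True if the GPU build works
```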
Notes:
Start running from Step #2.
First download and run the original code as-is; reimplement it after understanding how it works.
Do a black-box run first.
Requirements:
Wikipedia corpus (e.g., https://github.com/attardi/wikiextractor)
pre-trained word2vec (e.g., https://radimrehurek.com/gensim/models/word2vec.html)
Pre-trained word2vec model built from a Wikipedia dump: https://drive.google.com/open?id=1d_xrUPRLQjpcZrlm_cpKJO3jWBFKYhcl
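A minimal sketch of loading the pre-trained model with gensim 3.4; the path `enwiki_model/word2vec_gensim` is an assumption about what the Google Drive archive unpacks to, so adjust it to the actual download:

```python
# Load the pre-trained Wikipedia word2vec model with gensim.
# NOTE: the path below is hypothetical; point it at the downloaded model file.
from gensim.models import Word2Vec

w2v = Word2Vec.load('enwiki_model/word2vec_gensim')
print(w2v.wv['city'][:5])                   # first 5 dimensions of a word vector
print(w2v.wv.most_similar('city', topn=3))  # nearest neighbours in the space
```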
Instructions:
Step #0 (get the "okay" ground-truth classes):
evaluate_infer_col_gt.py
--> column_gt_extend_fg.csv
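In the ColNet setting, a column's "okay" classes are, roughly, the correct but less specific super-classes of its perfect classes. A toy illustration of that idea (the hierarchy fragment below is made up, not data from this repo):

```python
# Toy illustration: derive "okay" classes as super-classes of a column's
# perfect class. SUPER is a made-up fragment of the DBpedia hierarchy.
SUPER = {'dbo:City': 'dbo:Settlement', 'dbo:Settlement': 'dbo:Place'}

def okay_classes(perfect_cls):
    chain, cls = [], perfect_cls
    while cls in SUPER:
        cls = SUPER[cls]
        chain.append(cls)
    return chain

print(okay_classes('dbo:City'))  # ['dbo:Settlement', 'dbo:Place']
```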
Step #1 (sampling):
sample_lookup.py
--> lookup_entities, lookup_classes.py, lookup_col_classes.py
sample_particular.py
--> particular_pos_samples.csv, particular_neg_samples.csv
sample_general.py
--> general_pos_samples.csv
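The sampling scripts gather candidate entities and their classes from DBpedia. As an illustration of the kind of lookup involved (a sketch only, not the repo's exact code, which relies on DBpedia Lookup), the classes of a known entity can be fetched from the DBpedia SPARQL endpoint with sparql-client 2.6:

```python
# Fetch the DBpedia ontology classes of one entity via SPARQL.
import sparql

ENDPOINT = 'http://dbpedia.org/sparql'
QUERY = '''
SELECT DISTINCT ?cls WHERE {
  <http://dbpedia.org/resource/Berlin> a ?cls .
  FILTER(STRSTARTS(STR(?cls), "http://dbpedia.org/ontology/"))
}
'''

result = sparql.query(ENDPOINT, QUERY)
for row in result.fetchall():
    print(sparql.unpack_row(row)[0])
```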
Step #2 (training):
train_cnn.py
--> in_out/cnn/cnn_1_2_1.00/ (1: synthetic_column_size, 2: train_type, 1.00: knowledge gap)
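The output directory name encodes the three run parameters explained above. A hypothetical helper (not part of the repo) makes the convention explicit:

```python
# Build the output-directory name from the three run parameters:
# cnn_<synthetic_column_size>_<train_type>_<knowledge_gap>.
import os

def cnn_out_dir(synthetic_column_size, train_type, knowledge_gap,
                base='in_out/cnn'):
    name = 'cnn_%d_%d_%.2f' % (synthetic_column_size, train_type, knowledge_gap)
    return os.path.join(base, name)

print(cnn_out_dir(1, 2, 1.00))  # in_out/cnn/cnn_1_2_1.00
```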
Step #3 (prediction):
predict_colnet.py
--> predictions/p_cnn_1_2_1.00.csv
predict_lookup.py (voting by entities matched by DBpedia lookup)
--> predictions/p_lookup.csv
predict_ensemble.py
--> predictions/p_cnn_1_2_1.00_lookup.csv
predict_ent_class.py (voting by entities matched with the method of the ISWC'17 paper)
--> predictions/p_ent_class.csv
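Both lookup-based predictors share the same voting idea: every matched cell entity votes for its classes, and the column is annotated with the top-voted class. A minimal sketch of that step (illustrative, not the repo's code):

```python
# Majority voting over the classes of a column's matched entities.
from collections import Counter

def vote_column_class(entity_classes):
    """entity_classes: one list of candidate classes per matched cell entity."""
    votes = Counter()
    for classes in entity_classes:
        votes.update(classes)
    return votes.most_common(1)[0][0] if votes else None

print(vote_column_class([['dbo:City', 'dbo:Place'],
                         ['dbo:City'],
                         ['dbo:Settlement']]))  # dbo:City
```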
Step #4 (evaluation):
evaluate_strict.py
evaluate_tolerant.py
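As a hedged sketch of how the two metrics differ: strict counts a prediction as correct only if it hits a perfect ground-truth class, while tolerant also accepts the "okay" (super-) classes from Step #0. The names below are illustrative, not the repo's API:

```python
# Score column-type predictions strictly or tolerantly.
# predictions: {column: predicted_class}; perfect/okay: {column: set of classes}.
def score(predictions, perfect, okay, tolerant=False):
    correct = 0
    for col, cls in predictions.items():
        if cls in perfect.get(col, set()):
            correct += 1
        elif tolerant and cls in okay.get(col, set()):
            correct += 1
    return float(correct) / len(predictions) if predictions else 0.0
```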