CNN_Text
Environment setup:
1. Open a terminal.
2. Run `conda env list` to check the existing virtual environments; if there is no ColNet environment, go to step 3, otherwise go to step 4.
3.1. conda create -n ColNet python=2.7
3.2. conda activate ColNet
3.3. pip install requests sparql-client==2.6.0 sklearn
3.4. conda install tensorflow-gpu=1.9.0 gensim=3.4.0 Pattern
3.5. Start `python` and run `import tensorflow as tf; print(tf.test.is_gpu_available())` to check that the GPU is configured correctly.
4. conda activate ColNet
Our software settings:
python 2.7
tensorflow 1.9.0
gensim 3.4.0
sparql-client 2.6.0
sklearn 0.19.2
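A quick sanity check for the environment (run inside the ColNet env); the expected versions are the ones listed above:

```python
# Sanity-check the ColNet environment: versions should match the list above.
import tensorflow as tf
import gensim
import sklearn

print(tf.__version__)              # expect 1.9.0
print(gensim.__version__)          # expect 3.4.0
print(sklearn.__version__)         # expect 0.19.2
print(tf.test.is_gpu_available())  # True if the GPU build works
```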
Notes:
Start running from Step #2.
First download and run the original code as-is; reimplement it after understanding how it works.
Do a black-box run first.
Requirements:
Wikipedia corpus (e.g., https://github.com/attardi/wikiextractor)
pre-trained word2vec (e.g., https://radimrehurek.com/gensim/models/word2vec.html)
Pre-trained word2vec model built from a Wikipedia dump: https://drive.google.com/open?id=1d_xrUPRLQjpcZrlm_cpKJO3jWBFKYhcl
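A minimal sketch of loading the pre-trained model with gensim 3.4; the path `enwiki_model/word2vec_gensim` is an assumption about what the Google Drive archive unpacks to, so adjust it to the actual download:

```python
# Load the pre-trained Wikipedia word2vec model with gensim.
# NOTE: the path below is hypothetical; point it at the downloaded model file.
from gensim.models import Word2Vec

w2v = Word2Vec.load('enwiki_model/word2vec_gensim')
print(w2v.wv['city'][:5])                   # first 5 dimensions of a word vector
print(w2v.wv.most_similar('city', topn=3))  # nearest neighbours in the space
```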
Instructions:
Step #0 (get the "okay" ground-truth classes):
evaluate_infer_col_gt.py
--> column_gt_extend_fg.csv
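In the ColNet setting, a column's "okay" classes are, roughly, the correct but less specific super-classes of its perfect classes. A toy illustration of that idea (the hierarchy fragment below is made up, not data from this repo):

```python
# Toy illustration: derive "okay" classes as super-classes of a column's
# perfect class. SUPER is a made-up fragment of the DBpedia hierarchy.
SUPER = {'dbo:City': 'dbo:Settlement', 'dbo:Settlement': 'dbo:Place'}

def okay_classes(perfect_cls):
    chain, cls = [], perfect_cls
    while cls in SUPER:
        cls = SUPER[cls]
        chain.append(cls)
    return chain

print(okay_classes('dbo:City'))  # ['dbo:Settlement', 'dbo:Place']
```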
Step #1 (sampling):
sample_lookup.py
--> lookup_entities, lookup_classes.py, lookup_col_classes.py
sample_particular.py
--> particular_pos_samples.csv, particular_neg_samples.csv
sample_general.py
--> general_pos_samples.csv
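The sampling scripts gather candidate entities and their classes from DBpedia. As an illustration of the kind of lookup involved (a sketch only, not the repo's exact code, which relies on DBpedia Lookup), the classes of a known entity can be fetched from the DBpedia SPARQL endpoint with sparql-client 2.6:

```python
# Fetch the DBpedia ontology classes of one entity via SPARQL.
import sparql

ENDPOINT = 'http://dbpedia.org/sparql'
QUERY = '''
SELECT DISTINCT ?cls WHERE {
  <http://dbpedia.org/resource/Berlin> a ?cls .
  FILTER(STRSTARTS(STR(?cls), "http://dbpedia.org/ontology/"))
}
'''

result = sparql.query(ENDPOINT, QUERY)
for row in result.fetchall():
    print(sparql.unpack_row(row)[0])
```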
Step #2 (training):
train_cnn.py
--> in_out/cnn/cnn_1_2_1.00/ (1: synthetic_column_size, 2: train_type, 1.00: knowledge gap)
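The output directory name encodes the three run parameters explained above. A hypothetical helper (not part of the repo) makes the convention explicit:

```python
# Build the output-directory name from the three run parameters:
# cnn_<synthetic_column_size>_<train_type>_<knowledge_gap>.
import os

def cnn_out_dir(synthetic_column_size, train_type, knowledge_gap,
                base='in_out/cnn'):
    name = 'cnn_%d_%d_%.2f' % (synthetic_column_size, train_type, knowledge_gap)
    return os.path.join(base, name)

print(cnn_out_dir(1, 2, 1.00))  # in_out/cnn/cnn_1_2_1.00
```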
Step #3 (prediction):
predict_colnet.py
--> predictions/p_cnn_1_2_1.00.csv
predict_lookup.py (voting by entities matched by DBpedia lookup)
--> predictions/p_lookup.csv
predict_ensemble.py
--> predictions/p_cnn_1_2_1.00_lookup.csv
predict_ent_class.py (voting by entities matched with the method of the ISWC'17 paper)
--> predictions/p_ent_class.csv
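Both lookup-based predictors share the same voting idea: every matched cell entity votes for its classes, and the column is annotated with the top-voted class. A minimal sketch of that step (illustrative, not the repo's code):

```python
# Majority voting over the classes of a column's matched entities.
from collections import Counter

def vote_column_class(entity_classes):
    """entity_classes: one list of candidate classes per matched cell entity."""
    votes = Counter()
    for classes in entity_classes:
        votes.update(classes)
    return votes.most_common(1)[0][0] if votes else None

print(vote_column_class([['dbo:City', 'dbo:Place'],
                         ['dbo:City'],
                         ['dbo:Settlement']]))  # dbo:City
```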
Step #4 (evaluation):
evaluate_strict.py
evaluate_tolerant.py
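As a hedged sketch of how the two metrics differ: strict counts a prediction as correct only if it hits a perfect ground-truth class, while tolerant also accepts the "okay" (super-) classes from Step #0. The names below are illustrative, not the repo's API:

```python
# Score column-type predictions strictly or tolerantly.
# predictions: {column: predicted_class}; perfect/okay: {column: set of classes}.
def score(predictions, perfect, okay, tolerant=False):
    correct = 0
    for col, cls in predictions.items():
        if cls in perfect.get(col, set()):
            correct += 1
        elif tolerant and cls in okay.get(col, set()):
            correct += 1
    return float(correct) / len(predictions) if predictions else 0.0
```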