Generating/Training on Space/Amasum - missing scripts #2

@hamza1av

Hello there,

I ran the pre-processing eval recipe without any issues:

torchseq-eval --recipe opagg.hiro_pre --model ./models/20240130_183901_d671_space --test

As far as I can tell, this reads pre-generated summaries. While debugging, I modified the conditions so that no pre-generated summaries are read and new summaries are generated instead. I then reran the above command and got this error message:

12:52 INFO Eval TorchSeq eval runner
12:52 INFO Eval Running EvalRecipe: opagg.twostage_pre
Traceback (most recent call last):
  File "/home/user/.conda/envs/torchseqenv/bin/torchseq-eval", line 33, in <module>
    sys.exit(load_entry_point('torchseq', 'console_scripts', 'torchseq-eval')())
  File "/home/user/code/torchseq/torchseq/eval/cli.py", line 49, in main
    result = recipe.run()
  File "/home/user/code/torchseq/torchseq/eval/recipes/opagg/hiro_pre.py", line 54, in run
    instance = model_from_path(self.model_path, use_cuda=(not self.cpu))
  File "/home/user/code/torchseq/torchseq/utils/model_loader.py", line 57, in model_from_path
    instance = AGENT_TYPES[config.task](
  File "/home/user/code/torchseq/torchseq/agents/retrieval_agent.py", line 66, in __init__
    with jsonlines.open(os.path.join(self.data_path, dataset_path, "reviews.train.jsonl")) as reader:
  File "/home/user/.conda/envs/torchseqenv/lib/python3.10/site-packages/jsonlines/jsonlines.py", line 643, in open
    fp = builtins.open(file, mode=mode + "t", encoding=encoding)
FileNotFoundError: [Errno 2] No such file or directory: './data/opagg/space-filtered/space-filtered-25toks-1pronouns-aspects-charfilt-all/reviews.train.jsonl'
So this file probably hasn't been created yet.

To use the setup and generate summaries, I then tried running the tgi client as described. I ran the command
python tgi-client/runner.py --input runs/hiro/space/llm_inputs_oneshot_test.jsonl --output runs/hiro/space/llm_outputs_oneshot_test_mistaral7b.jsonl --model mistralai/Mistral-7B-Instruct-v0.2
But the input file runs/hiro/space/llm_inputs_oneshot_test.jsonl does not exist.
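
For anyone trying to reproduce this, a quick check like the one below shows which of the expected inputs are absent before launching a run. It is only a sketch: the two paths are copied from the traceback and the tgi-client command above, while the helper name `missing_files` and its `root` parameter are made up for illustration and are not part of torchseq or this repository.

```python
from pathlib import Path

# Paths copied verbatim from the traceback and the tgi-client command above.
EXPECTED_FILES = [
    "./data/opagg/space-filtered/space-filtered-25toks-1pronouns-aspects-charfilt-all/reviews.train.jsonl",
    "runs/hiro/space/llm_inputs_oneshot_test.jsonl",
]


def missing_files(paths, root="."):
    """Return the entries of `paths` that do not exist as files under `root`.

    Hypothetical helper for illustration; `root` is assumed to be the
    directory the commands are run from.
    """
    root = Path(root)
    return [p for p in paths if not (root / p).is_file()]


if __name__ == "__main__":
    for path in missing_files(EXPECTED_FILES):
        print(f"missing: {path}")
```

Running it from the repository root prints one `missing:` line per absent file and nothing if both are present.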

I also tried to rebuild the datasets as described in the section "Training on Space/Amasum". The instructions say to run the dataset filtering scripts, but I could not find any such scripts, either in this repository or in the torchseq repo.

Unfortunately, I couldn't get the setup running out of the box, but I am interested in the implementation and would like to reproduce some of your results.
