Generating/Training on Space/Amasum - missing scripts

Hello there,

I ran the pre-processing eval recipe without any issues:

`torchseq-eval --recipe opagg.hiro_pre --model ./models/20240130_183901_d671_space --test`

As far as I can tell, this reads pre-generated summaries. While debugging, I modified the conditions so that no pre-generated summaries are read and new summaries are generated instead. I reran the above command and got this error message:

`12:52   INFO    Eval    TorchSeq eval runner
12:52   INFO    Eval    Running EvalRecipe: opagg.twostage_pre
Traceback (most recent call last):
  File "/home/user/.conda/envs/torchseqenv/bin/torchseq-eval", line 33, in <module>
    sys.exit(load_entry_point('torchseq', 'console_scripts', 'torchseq-eval')())
  File "/home/user/code/torchseq/torchseq/eval/cli.py", line 49, in main
    result = recipe.run()
  File "/home/user/code/torchseq/torchseq/eval/recipes/opagg/hiro_pre.py", line 54, in run
    instance = model_from_path(self.model_path, use_cuda=(not self.cpu))
  File "/home/user/code/torchseq/torchseq/utils/model_loader.py", line 57, in model_from_path
    instance = AGENT_TYPES[config.task](
  File "/home/user/code/torchseq/torchseq/agents/retrieval_agent.py", line 66, in __init__
    with jsonlines.open(os.path.join(self.data_path, dataset_path, "reviews.train.jsonl")) as reader:
  File "/home/user/.conda/envs/torchseqenv/lib/python3.10/site-packages/jsonlines/jsonlines.py", line 643, in open
    fp = builtins.open(file, mode=mode + "t", encoding=encoding)
FileNotFoundError: [Errno 2] No such file or directory: './data/opagg/space-filtered/space-filtered-25toks-1pronouns-aspects-charfilt-all/reviews.train.jsonl'
`
So this file probably hasn't been created yet.

To use the setup and generate summaries, I then tried running the tgi client as described. I ran the command:
`python tgi-client/runner.py --input runs/hiro/space/llm_inputs_oneshot_test.jsonl --output runs/hiro/space/llm_outputs_oneshot_test_mistaral7b.jsonl --model mistralai/Mistral-7B-Instruct-v0.2`
But the input file `runs/hiro/space/llm_inputs_oneshot_test.jsonl` does not exist.
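For reference, this is roughly how I confirmed which files are missing. It is only a minimal sketch: the two paths are copied from the traceback and from the tgi-client command, and the helper name `missing_files` is my own, not part of torchseq or this repository.

```python
from pathlib import Path

# Paths copied verbatim from the traceback and the tgi-client command.
EXPECTED_FILES = [
    "./data/opagg/space-filtered/space-filtered-25toks-1pronouns-aspects-charfilt-all/reviews.train.jsonl",
    "runs/hiro/space/llm_inputs_oneshot_test.jsonl",
]


def missing_files(paths, root="."):
    """Return the entries of `paths` that are not existing files under `root`.

    Hypothetical helper for illustration only; it is not a torchseq function.
    """
    root = Path(root)
    return [p for p in paths if not (root / p).is_file()]


if __name__ == "__main__":
    # Run from the repository root, since both paths are relative to it.
    for path in missing_files(EXPECTED_FILES):
        print(f"missing: {path}")
```

Run from the repository root, this lists both files as missing on my machine.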


I also tried to rebuild the datasets as described in the section "Training on Space/Amasum". The instructions say to run the dataset filtering scripts, but I can't find any, neither in this repository nor in the torchseq repo.

Unfortunately, I couldn't get the setup running out of the box, but I am interested in the implementation and would like to reproduce some of your results.
