Describe the bug
When running Tutorial 3 with multiple models and asking the question "Who killed Ned Stark?", the first result in almost all models is:
1st Answer "n. When Ned\'s father and brother went south to reclaim her, the "Mad King" Aerys Targaryen burned both of them alive. Ned and Robert Baratheon led the"
Some of the 2nd and 3rd answers are actually correct (especially with deepset/minilm-uncased-squad2 and deepset/electra-base-squad2), but the very top answer is factually incorrect. It's a shame because the rest of the answers are good.
I think the reason could be bad tokenization in some cases. Here the text reads Ned\'s father and brother, and maybe the algorithm thinks that Ned is one of the characters in this story who is later burned alive, when in reality it is Ned's father and brother who are burned.
How can we remove this \ from the text so that the model understands the text better?
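One possible way to do this is to clean the documents before they are written to the document store. The snippet below is only a minimal sketch: strip_escaped_quotes and clean_documents are hypothetical helper names, and it assumes each document is a dict with a "text" key. It is also worth checking first whether the backslash is really in the stored text, since Python's repr shows \' when a printed string contains both single and double quotes.

```python
import re


def strip_escaped_quotes(text: str) -> str:
    # Replace a backslash followed by a single or double quote with the bare quote.
    return re.sub(r"\\(['\"])", r"\1", text)


def clean_documents(dicts):
    # Assumption: each document is a dict with a "text" key.
    # Returns new dicts and leaves the originals untouched.
    return [{**d, "text": strip_escaped_quotes(d["text"])} for d in dicts]


# Example document containing a literal backslash before the apostrophe.
docs = [{"text": "When Ned\\'s father and brother went south", "meta": {"name": "example.txt"}}]

# Check whether the backslash is really in the data, not just in the printed repr.
has_backslash = "\\" in docs[0]["text"]

cleaned = clean_documents(docs)
print(has_backslash)
print(cleaned[0]["text"])
```

If has_backslash is False for the real documents, the \ is only a display artifact and the wrong top answer has a different cause.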
Error message
Error that was thrown (if available)
Expected behavior
A clear and concise description of what you expected to happen.
Additional context
Add any other context about the problem here, like document types / preprocessing steps / settings of reader etc.
To Reproduce
Steps to reproduce the behavior
System:
- OS: Colab Notebook
- GPU/CPU:
- Haystack version (commit or version number): 0.4.0
- DocumentStore:
- Reader:
- Retriever: