I'm currently basing my code on this tutorial, and I have a problem with the tag method after the training section. I catch the UnicodeDecodeError exception like this:
try:
    for xseq in X_test:
        Y_pred.append(tagger.tag(xseq))
except UnicodeDecodeError as e:
    print(e)
    print(e.object)
The output looks like this
'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)
b'B-qu\xc3\xa9'
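For what it's worth, the message itself can be reproduced in plain Python without the tagger. This is only a sketch of what the error output implies, not a diagnosis of where the library performs the decode: the byte string is copied from the output above, and the variable names are illustrative.

```python
# Byte string copied from the error output above.
raw = b'B-qu\xc3\xa9'

# Decoding as ASCII fails at index 4: byte 0xc3 is the first byte
# of the two-byte UTF-8 encoding of 'é' and is outside the ASCII range.
try:
    raw.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as e:
    ascii_error = e

# Decoding the same bytes as UTF-8 succeeds.
label = raw.decode('utf-8')

print(ascii_error)
print(label)
```

So the bytes look like a valid UTF-8 label, and the failure appears to come from something decoding them with the ASCII codec.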
I tried decoding my X_test with decode('utf-8') before calling tag, but that does not seem to work.
Just in case it is relevant: I also had some UnicodeEncodeError problems with the trainer object, at the step shown below, but calling encode('utf-8') on every substring seems to fix them. That is, I force manual encoding before appending the sequences to the trainer. This issue is mentioned in #96, and that solution works for me.
for xseq, yseq in zip(X_train, Y_train):
    trainer.append(xseq, yseq)
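A minimal sketch of the kind of manual encoding described above, applied to every string before it is appended. `encode_all` is a hypothetical helper written for illustration (it is not part of the library), and the sample data is made up; it assumes the sequences are plain lists of strings, possibly nested.

```python
def encode_all(item, encoding='utf-8'):
    """Hypothetical helper: recursively encode every str in nested lists/tuples to bytes."""
    if isinstance(item, str):
        return item.encode(encoding)
    if isinstance(item, (list, tuple)):
        return [encode_all(x, encoding) for x in item]
    # Leave anything else (numbers, bytes, ...) untouched.
    return item

# Illustrative data only, not taken from the original code.
xseq = [['word=qué', 'bias'], ['word=tal', 'bias']]
yseq = ['B-qué', 'O']

encoded_x = encode_all(xseq)
encoded_y = encode_all(yseq)
print(encoded_y)
```

The encoded sequences would then be passed to the append call in place of the original ones.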
NOTE: Sorry for my deficient English. I hope I've been clear enough. If not, please tell me!!! :)