xces and xces-disamb implicitly assume different lemma selection strategy #30

@GoogleCodeExporter

What steps will reproduce the problem?

echo "Mają kręgi." > seg.txt
~/pantera/bin/pantera -t nkjp --engine ~/pantera/engines/ultimarum-tertia-np0-6.btengine seg.txt -o xces-disamb
mv seg.txt.disamb seg.xces
~/pantera/bin/pantera -t nkjp --engine ~/pantera/engines/ultimarum-tertia-np0-6.btengine seg.txt -o xces
mv seg.txt.disamb seg.xces-sh

The standard XCES output (-o xces-disamb) implicitly assumes that only one (lemma, tag) pair is selected, and the choice between candidate pairs seems arbitrary.

The “sh” XCES dialect preserves both lemmata when they share the same tag.
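
A rough way to see the difference on the two files produced above is to count how many interpretations each one marks as selected. This is only a sketch: it assumes that selected interpretations appear as `<lex>` elements carrying `disamb="1"` and that each `<lex>` sits on its own line, and `count_selected` is a hypothetical helper name, not part of pantera.

```shell
# Hypothetical helper: count lines in an XCES file that mark a selected
# interpretation. Assumes one <lex> element per line and that selected
# ones carry the attribute disamb="1".
count_selected() {
    grep -c 'disamb="1"' "$1"
}

# Example usage on the files from the steps above:
#   count_selected seg.xces
#   count_selected seg.xces-sh
```

If the second count is higher than the first for the same input, the extra entries would be the lemmata that the standard output dropped.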

Is this really intended?

Original issue reported on code.google.com by kociki...@gmail.com on 20 Feb 2012 at 2:52
