I am using the Stanford POS Tagger for the first time. It tags English correctly, but it does not seem to recognize (simplified) Chinese, even when I change the model parameter. Did I miss something?
I downloaded and unpacked the latest full version from here:
http://nlp.stanford.edu/software/tagger.shtml
Then I put some sample text in "sample-input.txt":
这 是 一个 测试 的 句子. 这 是 另一个 句子.
Then I just ran:
./stanford-postagger.sh models/chinese-distsim.tagger sample-input.txt
The expected output is each word tagged with its part of speech, but instead the tagger treats the entire line of text as a single word:
Loading default properties from tagger models/chinese-distsim.tagger
Reading POS tagger model from models/chinese-distsim.tagger ... done [3.5 sec].
這 是 一個 測試 的 句子. 這 是 另一個 句子.#NR
Tagged 1 words at 30.30 words per second.
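As a sanity check on the input, here is a minimal sketch, assuming the file contains exactly the sample line shown above. Splitting that line on whitespace yields ten tokens, so I would have expected roughly ten tagged words rather than one. The variable names are only for illustration.

```python
# Sample line as shown above (assumed to be the only content of sample-input.txt).
sample = "这 是 一个 测试 的 句子. 这 是 另一个 句子."

# str.split() with no arguments splits on any run of Unicode whitespace.
tokens = sample.split()

print(len(tokens))
print(tokens)
```

This only checks the text as pasted here. It says nothing about how the tagger itself reads or tokenizes the file.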
I appreciate any help.