Construction of a Japanese Word Similarity Dataset
Published 2017 in International Conference on Language Resources and Evaluation
ABSTRACT
Distributed word representations are generally evaluated using a word similarity task, a word analogy task, or both. Many datasets are readily available for these tasks in English. However, evaluating distributed representations is difficult in languages that lack such resources (e.g., Japanese). Therefore, as a first step toward evaluating distributed representations in Japanese, we constructed a Japanese word similarity dataset. To the best of our knowledge, our dataset is the first resource that can be used to evaluate distributed representations in Japanese. Moreover, our dataset covers various parts of speech and includes rare words in addition to common words.
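As a rough illustration of how a word similarity dataset is typically used to evaluate distributed representations, the sketch below scores each word pair by the cosine similarity of its vectors and reports the Spearman rank correlation with the human ratings. The toy vectors, the word pairs, the ratings, and all function names are invented for illustration; they are not taken from the dataset, and the paper's own evaluation protocol may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def evaluate(embeddings, rated_pairs):
    """Correlate model similarities with human ratings.

    Pairs containing an out-of-vocabulary word are skipped (an assumption;
    other handling is possible).
    """
    model_scores, human_scores = [], []
    for w1, w2, rating in rated_pairs:
        if w1 in embeddings and w2 in embeddings:
            model_scores.append(cosine(embeddings[w1], embeddings[w2]))
            human_scores.append(rating)
    return spearman(model_scores, human_scores)


# Hypothetical 2-dimensional vectors and ratings, for illustration only.
toy_embeddings = {
    "犬": [1.0, 0.1],
    "猫": [0.9, 0.2],
    "車": [0.1, 1.0],
    "自動車": [0.15, 0.95],
}
toy_pairs = [
    ("犬", "猫", 8.0),
    ("車", "自動車", 9.5),
    ("犬", "車", 1.5),
    ("猫", "自動車", 2.0),
]

rho = evaluate(toy_embeddings, toy_pairs)
print(f"Spearman correlation: {rho:.3f}")
```

A rank correlation is used here because only the ordering of pairs is compared, so the scale of the human ratings does not need to match the range of cosine similarity.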
PUBLICATION RECORD
- Publication year: 2017
- Venue: International Conference on Language Resources and Evaluation
- Publication date: 2017-03-17
- Fields of study: Linguistics, Computer Science
- Source metadata: Semantic Scholar