Dependency Parsing Evaluation Metrics

The performance of a dependency parser is usually evaluated along three dimensions: dependency accuracy (DA), root accuracy (RA), and the accuracy of the complete dependency structure of a sentence (Sentence Accuracy, SA).
Depending on whether the dependency relation label is taken into account, DA can further be split into the labeled attachment score (LAS) and the unlabeled attachment score (UAS).

CoNLL defines LAS and UAS as follows:
labeled attachment score: the proportion of “scoring” tokens that are assigned both the correct head and the correct dependency relation label.
Punctuation tokens are non-scoring. In very exceptional cases, and depending on the original treebank annotation, some additional types of tokens might also be non-scoring.
The overall score of a system is its labeled attachment score on all test sets taken together.

Unlabeled attachment score is the proportion of “scoring” tokens that are assigned the correct head (regardless of the dependency relation label).

DA = number of correctly predicted heads / number of predicted heads
RA = number of correctly identified root words / total number of root words
SA = number of correctly parsed sentences / total number of sentences
LAS = number of tokens with both head and relation label correctly predicted / total number of tokens
UAS = number of tokens with the head correctly predicted / total number of tokens
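
As a concrete illustration of these formulas, here is a minimal Python sketch (mine, not from the original post) that computes UAS, LAS, RA, and SA from gold and predicted parses. Each sentence is assumed to be a list of (head, label) pairs with head index 0 denoting the root, and the CoNLL rule that punctuation is non-scoring is omitted for brevity.

```python
# Minimal sketch of the metrics above; each sentence is a list of
# (head_index, relation_label) pairs, one per token, with head 0 = root.
# Punctuation tokens (non-scoring in CoNLL) are not filtered out here.

def evaluate(gold_sents, pred_sents):
    tokens = head_ok = label_ok = 0        # token-level counts for UAS/LAS
    sents = sent_ok = 0                    # sentence-level counts for SA
    roots = root_ok = 0                    # root counts for RA

    for gold, pred in zip(gold_sents, pred_sents):
        sents += 1
        all_correct = True
        for (g_head, g_rel), (p_head, p_rel) in zip(gold, pred):
            tokens += 1
            if g_head == p_head:
                head_ok += 1
                if g_rel == p_rel:
                    label_ok += 1
            if g_head != p_head or g_rel != p_rel:
                all_correct = False
            if g_head == 0:                # gold root token
                roots += 1
                if p_head == 0:
                    root_ok += 1
        sent_ok += all_correct

    return {
        "UAS": head_ok / tokens,
        "LAS": label_ok / tokens,
        "RA":  root_ok / roots,
        "SA":  sent_ok / sents,
    }

# Toy example: one 3-token sentence, second token's head mispredicted.
gold = [[(0, "root"), (1, "obj"), (2, "amod")]]
pred = [[(0, "root"), (3, "obj"), (2, "amod")]]
print(evaluate(gold, pred))   # UAS = LAS = 2/3, RA = 1.0, SA = 0.0
```

Note that when every token receives exactly one predicted head, the denominators of DA and UAS coincide, so the two scores are the same.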

Link Grammar

From Wikipedia: Link grammar and Categorial grammar

Link grammar (LG) is a theory of syntax by Davy Temperley and Daniel Sleator which builds relations between pairs of words, rather than constructing constituents in a phrase structure hierarchy. Link grammar is similar to dependency grammar, but dependency grammar includes a head-dependent relationship, whereas link grammar makes the head-dependent relationship optional (links need not indicate direction).
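
To make the contrast concrete, here is a small illustrative sketch (mine, not from the cited articles) of the same toy parse written once as directed head-dependent arcs and once as undirected, typed links in the spirit of link grammar; the link type names only loosely follow link grammar's connector labels.

```python
# Illustrative only: the same toy parse of "the cat sat" written two ways.

# Dependency grammar: directed arcs with a distinguished head for each relation.
dependency_arcs = [
    ("sat", "cat", "nsubj"),   # head "sat", dependent "cat"
    ("cat", "the", "det"),     # head "cat", dependent "the"
]

# Link grammar style: typed links connect word pairs, but the head/dependent
# direction is not part of the representation.
links = [
    (frozenset({"sat", "cat"}), "S"),   # subject-verb link
    (frozenset({"cat", "the"}), "D"),   # determiner-noun link
]
```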


Some Linguistics Terminology

All from Wikipedia: Argument (linguistics), Adjunct (grammar), Valency (linguistics).

Arguments

In linguistics, an argument is an expression that helps complete the meaning of a predicate, the latter referring in this context to a main verb and its auxiliaries. In this regard, the complement is a closely related concept. Most predicates take one, two, or three arguments. A predicate and its arguments form a predicate-argument structure. The discussion of predicates and arguments is associated most closely with (content) verbs and noun phrases (NPs), although other syntactic categories can also be construed as predicates and as arguments.
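
For readers coming at this from an NLP angle, a predicate-argument structure can be written down as plain data. The sketch below is illustrative only, with hypothetical role names, and encodes the classic three-argument predicate "give".

```python
# Illustrative sketch (not from the original post): a predicate-argument
# structure for "John gave Mary a book", where "give" takes three arguments.
predicate_argument_structure = {
    "predicate": "give",
    "arguments": {
        "agent": "John",        # who gives
        "recipient": "Mary",    # who receives
        "theme": "a book",      # what is given
    },
}
```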


LSTM Study Notes

Long Short-Term Memory (LSTM) is a kind of recurrent neural network (RNN). Like all RNNs, given enough network units, an LSTM can compute anything a conventional computer can compute.

Like most RNNs, an LSTM network is universal in the sense that given enough network units it can compute anything a conventional computer can compute. — Wikipedia
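
As a concrete companion to this description, here is a minimal NumPy sketch of a single LSTM step using the standard gate equations (input, forget, and output gates plus a candidate cell state); the parameter layout and sizes are my own placeholders, not from the post.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell.

    x: input vector; h_prev/c_prev: previous hidden and cell states.
    W, U, b: parameter dicts for the four gates i, f, o, g.
    """
    i = sigmoid(W["i"] @ x + U["i"] @ h_prev + b["i"])   # input gate
    f = sigmoid(W["f"] @ x + U["f"] @ h_prev + b["f"])   # forget gate
    o = sigmoid(W["o"] @ x + U["o"] @ h_prev + b["o"])   # output gate
    g = np.tanh(W["g"] @ x + U["g"] @ h_prev + b["g"])   # candidate cell state
    c = f * c_prev + i * g                               # new cell state
    h = o * np.tanh(c)                                   # new hidden state
    return h, c

# Toy usage with random parameters: input size 4, hidden size 3.
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(3, 4)) for k in "ifog"}
U = {k: rng.normal(size=(3, 3)) for k in "ifog"}
b = {k: np.zeros(3) for k in "ifog"}
h, c = lstm_step(rng.normal(size=4), np.zeros(3), np.zeros(3), W, U, b)
```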

RNN

A traditional feedforward neural network looks like the figure below:
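
To complement the figure, here is a minimal NumPy sketch of what such a feedforward network computes: each layer is a matrix multiplication followed by a nonlinearity, and, unlike an RNN, no state is carried from one input to the next (layer sizes here are arbitrary).

```python
import numpy as np

def feedforward(x, weights, biases):
    """Forward pass of a plain feedforward network: no loop over time steps,
    no hidden state carried between inputs (unlike an RNN/LSTM)."""
    h = x
    for W, b in zip(weights, biases):
        h = np.tanh(W @ h + b)    # one fully connected layer + nonlinearity
    return h

# Toy 2-layer network mapping a size-4 input to a size-2 output: 4 -> 5 -> 2.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(5, 4)), rng.normal(size=(2, 5))]
biases = [np.zeros(5), np.zeros(2)]
y = feedforward(rng.normal(size=4), weights, biases)
```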


Cai Jiaxun (蔡家勋)
Department of Computer Science and Engineering
Shanghai Jiao Tong University