Longformer BigBird
allenai/longformer-large-4096
epoch 3
with pretrained: Lead 0.7826552462526767, Position 0.6857142857142857, Claim 0.6016325707951224, Evidence 0.6062992125984252, Concluding Statement 0.7744827586206896, Counterclaim 0.5159301130524152, Rebuttal 0.43537414965986393
Overall: 0.6288697623847826
========================================
epoch 4
without pretrained: Lead 0.7926960257787325, Position 0.6743119266055045, Claim 0.5527019174898314, Evidence 0.6058080479229067, Concluding Statement 0.7251962883654532, Counterclaim 0.4868686868686869, Rebuttal 0.39381153305203936
Overall: 0.6044849180118792
with pretrained: Lead 0.7948164146868251, Position 0.6745484400656815, Claim 0.5881818181818181, Evidence 0.5861433087460485, Concluding Statement 0.7867698803659395, Counterclaim 0.5420207743153919, Rebuttal 0.43478260869565216
Overall: 0.6296090350081938
========================================
epoch 5
without pretrained: Lead 0.7926565874730022, Position 0.6712629269821373, Claim 0.5932255111382362, Evidence 0.6297068563718876, Concluding Statement 0.7207586933614331, Counterclaim 0.48604860486048607, Rebuttal 0.42297650130548303
Overall: 0.6166622402132379 (online LB: 0.612)
Mapping chars and words to tokens
1. Mapping char to token.
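A tokenizer's offset mapping gives each token's (start, end) character span; inverting it yields the char-to-token map. A minimal pure-Python sketch (the example spans below are made up for illustration, not taken from a real tokenizer):

```python
def char_to_token(offset_mapping, text_len):
    """Build char_ix -> token_ix from a tokenizer's offset_mapping
    (a list of (start, end) char spans, one per token); -1 means
    the character belongs to no token (e.g. whitespace)."""
    mapping = [-1] * text_len
    for token_ix, (start, end) in enumerate(offset_mapping):
        for char_ix in range(start, end):
            mapping[char_ix] = token_ix
    return mapping

# "Hi there" -> tokens "Hi", "there" with spans (0, 2) and (3, 8)
print(char_to_token([(0, 2), (3, 8)], 8))  # [0, 0, -1, 1, 1, 1, 1, 1]
```

The same table can then be used to map labeled character spans onto token indices for training.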
Feedback Prize - Evaluating Student Writing (Kaggle competition)
Scipy stats
Binomial distribution
For n independent Bernoulli trials, each succeeding with probability p, the number of successes follows the binomial distribution with parameters n and p:
$$P\{ X= m\} =C_{n}^{m}p^{m}\left( 1-p\right) ^{n-m} \quad (0<p<1,\ m=0,1,\dots,n)$$
Cumulative distribution function:
$$F\left( m\right) =P\{ X \leq m\} =\sum ^{m}_{i=0}C_{n}^{i}p^{i}\left( 1-p\right) ^{n-i}$$
Two approximations to the binomial: the Poisson distribution, and the standard normal distribution (de Moivre-Laplace central limit theorem).
When n is large and p is small (rare events, typically p < 0.1), so that np = $\lambda$ is moderate, the binomial is well approximated by the Poisson distribution.
When n is large and p is not small, so np is also large, it is well approximated by the standard normal via $Z=\dfrac{X-np}{\sqrt{np\left( 1-p\right) }}$, where $X=\sum ^{n}_{i=1}x_{i}$ is the sum over all trials, i.e. the number of successes.
# exact tail probability under Binomial(n, p), here with n = total / 100
abnormality = scipy.stats.binom(total / 100, p).cdf((total - loss) / 100)
# normal-approximation z-score, Z = (X - np) / sqrt(np(1-p)) taken with n = total
# (the sqrt denominator is an assumed completion; this line was cut off mid-expression)
abnormality = ((total - loss) - total * p) / math.sqrt(total * p * (1 - p))
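The two approximation regimes above can be checked numerically with a stdlib-only sketch (function names and the sample n, p, m values are my own, chosen to illustrate each regime):

```python
import math

def binom_cdf(m, n, p):
    """Exact F(m) = sum_{i=0..m} C(n,i) p^i (1-p)^(n-i)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(m + 1))

def poisson_cdf(m, lam):
    """Poisson approximation, valid when n is large and p small (lam = np)."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(m + 1))

def normal_cdf(m, n, p):
    """Normal approximation via Z = (X - np)/sqrt(np(1-p)),
    with a +0.5 continuity correction on m."""
    z = (m + 0.5 - n * p) / math.sqrt(n * p * (1 - p))
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# rare-event regime: n large, p small -> Poisson is close to exact
print(binom_cdf(12, 1000, 0.01), poisson_cdf(12, 10.0))
# large-np regime: normal approximation is close to exact
print(binom_cdf(410, 1000, 0.4), normal_cdf(410, 1000, 0.4))
```

In practice `scipy.stats.binom(n, p).cdf(m)` replaces `binom_cdf` and avoids the O(m) sum.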
more ...
NumPy [None, ...]
None: inserts a size-1 axis at that position (similar to unsqueeze(axis=) in torch)
...: Ellipsis, equivalent to writing out the remaining full slices, e.g. [:, :, :]
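A quick NumPy check of both indexing tricks (the array shape is arbitrary):

```python
import numpy as np

x = np.ones((2, 3, 4))

# None (np.newaxis) inserts a size-1 axis at that position
print(x[None, ...].shape)  # (1, 2, 3, 4)
print(x[:, None].shape)    # (2, 1, 3, 4)

# ... (Ellipsis) stands in for as many full slices as needed
assert (x[...] == x[:, :, :]).all()
print(x[..., 0].shape)     # (2, 3)
```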
Hierarchical Position Embedding
Pretrained models: the three pretrained embedding tables
word_embedding: [vocab_size, hidden_size]
position_embedding: [max_len, hidden_size]
token_type_embedding: [token_type_size, hidden_size]
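To go past the pretrained max_len, one hierarchical trick decomposes position i as i = q * max_len + r and mixes the two pretrained rows. A minimal NumPy sketch; the function name, alpha = 0.4, and the linear mix are assumptions following the commonly cited formulation, not something fixed by these notes:

```python
import numpy as np

def hierarchical_position_embedding(pos_emb, new_len, alpha=0.4):
    """Extend a pretrained position_embedding [max_len, hidden] to new_len rows.

    Position i is decomposed as i = q * max_len + r, and its embedding is
    alpha * pos_emb[q] + (1 - alpha) * pos_emb[r].
    """
    max_len, _hidden = pos_emb.shape
    idx = np.arange(new_len)
    q, r = idx // max_len, idx % max_len
    return alpha * pos_emb[q] + (1 - alpha) * pos_emb[r]

pe = np.random.randn(512, 768).astype("float32")
long_pe = hierarchical_position_embedding(pe, 4096)
print(long_pe.shape)  # (4096, 768)
```

The first max_len rows come out as alpha * pos_emb[0] + (1 - alpha) * pos_emb[r], so they stay close to (but are not identical to) the pretrained rows in this simplified version.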
Paddle dtype
Tokenizer offset mapping
Paddle implements torch.repeat_interleave / K.repeat_elements using paddle.reshape & paddle.tile
https://pytorch.org/docs/stable/generated/torch.Tensor.repeat.html#torch.Tensor.repeat
https://pytorch.org/docs/stable/generated/torch.repeat_interleave.html
If the repeats is tensor([n1, n2, n3, …]), then the output will be tensor([0, 0, …, 1, 1, …, 2, 2, …, …]) where 0 appears n1 times, 1 appears n2 times, 2 appears n3 times, etc.