Longformer Torch2Paddle
PaddlePaddle-Longformer-model-base-4096
PyTorch parameter | Shape | Paddle parameter | Shape (T = weight must be transposed) |
---|---|---|---|
embeddings.word_embeddings.weight | [50265, 768] | embeddings.word_embeddings.weight | |
embeddings.position_embeddings.weight | [4098, 768] | embeddings.position_embeddings.weight | |
embeddings.token_type_embeddings.weight | [1, 768] | embeddings.token_type_embeddings.weight | |
embeddings.LayerNorm.weight | [768] | embeddings.layer_norm.weight | |
embeddings.LayerNorm.bias | [768] | embeddings.layer_norm.bias | |
encoder.layer.0.attention.self.query.weight | [768, 768] | encoder.layers.0.self_attn.query.weight | T |
encoder.layer.0.attention.self.query.bias | [768] | encoder.layers.0.self_attn.query.bias | |
encoder.layer.0.attention.self.key.weight | [768, 768] | encoder.layers.0.self_attn.key.weight | T |
encoder.layer.0.attention.self.key.bias | [768] | encoder.layers.0.self_attn.key.bias | |
encoder.layer.0.attention.self.value.weight | [768, 768] | encoder.layers.0.self_attn.value.weight | T |
encoder.layer.0.attention.self.value.bias | [768] | encoder.layers.0.self_attn.value.bias | |
encoder.layer.0.attention.self.query_global.weight | [768, 768] | encoder.layers.0.self_attn.query_global.weight | T |
encoder.layer.0.attention.self.query_global.bias | [768] | encoder.layers.0.self_attn.query_global.bias | |
encoder.layer.0.attention.self.key_global.weight | [768, 768] | encoder.layers.0.self_attn.key_global.weight | T |
encoder.layer.0.attention.self.key_global.bias | [768] | encoder.layers.0.self_attn.key_global.bias | |
encoder.layer.0.attention.self.value_global.weight | [768, 768] | encoder.layers.0.self_attn.value_global.weight | T |
encoder.layer.0.attention.self.value_global.bias | [768] | encoder.layers.0.self_attn.value_global.bias | |
encoder.layer.0.attention.output.dense.weight | [768, 768] | encoder.layers.0.self_attn.out.weight | T |
encoder.layer.0.attention.output.dense.bias | [768] | encoder.layers.0.self_attn.out.bias | |
encoder.layer.0.attention.output.LayerNorm.weight | [768] | encoder.layers.0.norm1.weight | |
encoder.layer.0.attention.output.LayerNorm.bias | [768] | encoder.layers.0.norm1.bias | |
encoder.layer.0.intermediate.dense.weight | [3072, 768] | encoder.layers.0.linear1.weight | T [768, 3072] |
encoder.layer.0.intermediate.dense.bias | [3072] | encoder.layers.0.linear1.bias | |
encoder.layer.0.output.dense.weight | [768, 3072] | encoder.layers.0.linear2.weight | T [3072, 768] |
encoder.layer.0.output.dense.bias | [768] | encoder.layers.0.linear2.bias | |
encoder.layer.0.output.LayerNorm.weight | [768] | encoder.layers.0.norm2.weight | |
encoder.layer.0.output.LayerNorm.bias | [768] | encoder.layers.0.norm2.bias | |
pooler.dense.weight | [768, 768] | pooler.dense.weight | T |
pooler.dense.bias | [768] | pooler.dense.bias | |
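A minimal conversion sketch distilled from the table above; the rename list, the "transpose every 2-D non-embedding weight" rule, and the checkpoint file names are assumptions for illustration, not an official converter:

```python
# Sketch: convert a PyTorch Longformer checkpoint to Paddle following the
# table above. The renames cover the listed keys only.
import paddle
import torch

NAME_MAP = [
    ("embeddings.LayerNorm.", "embeddings.layer_norm."),
    ("encoder.layer.", "encoder.layers."),
    ("attention.output.dense.", "self_attn.out."),
    ("attention.output.LayerNorm.", "norm1."),
    ("attention.self.", "self_attn."),
    ("intermediate.dense.", "linear1."),
    ("output.dense.", "linear2."),
    ("output.LayerNorm.", "norm2."),
]

def convert(torch_ckpt="pytorch_model.bin",        # placeholder file names
            paddle_ckpt="model_state.pdparams"):
    torch_state = torch.load(torch_ckpt, map_location="cpu")
    paddle_state = {}
    for name, tensor in torch_state.items():
        array = tensor.numpy()
        for old, new in NAME_MAP:
            name = name.replace(old, new)
        # Rows marked "T": nn.Linear stores weights transposed in Paddle,
        # so every 2-D weight except the embedding tables is transposed.
        if name.endswith(".weight") and array.ndim == 2 and "embedding" not in name:
            array = array.T
        paddle_state[name] = paddle.to_tensor(array)
    paddle.save(paddle_state, paddle_ckpt)
```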
Python-encode-decode Base64-encode-decode
python encode decode¶
Conversion between byte sequences (bytes) and strings (str).
- encode() is a method of the string type (str) that converts a str into bytes; this process is called "encoding".
str.encode([encoding="utf-8"][,errors="strict"])
- decode() is a method of the bytes type that converts binary bytes data into a str; this process is called "decoding".
bytes.decode([encoding="utf-8"][,errors="strict"])
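For example, a quick round trip between str and bytes:

```python
text = "你好, world"
data = text.encode("utf-8")        # str -> bytes ("encoding")
print(data)                        # b'\xe4\xbd\xa0\xe5\xa5\xbd, world'
restored = data.decode("utf-8")    # bytes -> str ("decoding")
assert restored == text
```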
base64 encode decode¶
Conversion between byte sequences (bytes in, bytes out).
base64.b64encode(): arbitrary bytes -> Base64-encoded bytes
base64.b64decode(): Base64-encoded bytes -> original bytes
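A small round trip (both input and output are bytes):

```python
import base64

raw = "hello".encode("utf-8")      # arbitrary bytes
b64 = base64.b64encode(raw)        # bytes -> Base64 bytes, b'aGVsbG8='
back = base64.b64decode(b64)       # Base64 bytes -> original bytes
assert back == raw
```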
Example: sending an image over the network¶
The format transmitted over the network is a Base64 str:
Send: binary bytes -> Base64 bytes -> Base64 str
Receive: Base64 str -> Base64 bytes -> binary bytes
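A sketch of the send/receive chain above ("demo.png" is a placeholder path):

```python
import base64

# Sender: raw image bytes -> Base64 bytes -> Base64 str
with open("demo.png", "rb") as f:
    binary = f.read()
b64_str = base64.b64encode(binary).decode("ascii")   # what goes on the wire

# Receiver: Base64 str -> Base64 bytes -> raw image bytes
restored = base64.b64decode(b64_str.encode("ascii"))
assert restored == binary
```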
Paddle gather index_select
Using gather to reproduce torch-style fancy (array) indexing.
https://github.com/PaddlePaddle/Paddle/issues/42554 [inspired by this] If several index lists are involved, it is recommended to handle them one at a time, separately (see the sketch after these links).
https://github.com/PaddlePaddle/Paddle/issues/35072
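A small sketch of "handling the index lists one at a time", assuming the pattern in question is outer-product style fancy indexing such as x[rows][:, cols] in PyTorch:

```python
import numpy as np
import paddle
import torch

x_np = np.arange(24, dtype="float32").reshape(4, 6)
rows, cols = [0, 2], [1, 3, 5]

# PyTorch: chained fancy indexing, one index list per dimension.
t = torch.from_numpy(x_np)[rows][:, cols]

# Paddle: apply the index lists one at a time along each axis
# (paddle.index_select works the same way here).
p = paddle.to_tensor(x_np)
p = paddle.gather(p, paddle.to_tensor(rows), axis=0)
p = paddle.gather(p, paddle.to_tensor(cols), axis=1)

assert np.allclose(t.numpy(), p.numpy())
```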
paddle.nn.functional.unfold
PyTorch tensor.stride & tensor.as_strided
tensor.stride()¶
Stride is the jump necessary to go from one element to the next one in the specified dimension dim.
- The jump is measured from one element to the next, i.e. in elements, not bytes.
- For a contiguous tensor, the stride of a dimension is the product of the sizes of the dimensions after it.
- Example: shape (12, 512, 768) -> stride (512*768, 768, 1) = (393216, 768, 1)
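For example, checking the strides of the contiguous tensor from the example above:

```python
import torch

x = torch.empty(12, 512, 768)
print(x.shape)     # torch.Size([12, 512, 768])
print(x.stride())  # (393216, 768, 1), i.e. (512*768, 768, 1)
```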
tensor.as_strided()¶
- input (Tensor) – the input tensor.
- size (tuple or ints) – the shape of the output tensor
- stride (tuple or ints) – the stride of the output tensor
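A small as_strided sketch: building overlapping sliding windows over a 1-D tensor (the result is a view of the same storage, so an out-of-range size/stride combination is unsafe):

```python
import torch

x = torch.arange(6)                  # tensor([0, 1, 2, 3, 4, 5])
# Windows of length 3, each advancing by one element.
windows = x.as_strided((4, 3), (1, 1))
print(windows)
# tensor([[0, 1, 2],
#         [1, 2, 3],
#         [2, 3, 4],
#         [3, 4, 5]])
```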
PyTorch View vs Reshape
torch.view has existed for a long time. It will return a tensor with the new shape. The returned tensor will share the underlying data with the original tensor. See the documentation.
On the other hand, torch.reshape was introduced more recently (in version 0.4). According to the documentation, it returns a tensor with the same data and number of elements as the input; the result is a view of the input when possible, and a copy otherwise.
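A short illustration of the difference: view shares storage and requires a compatible (contiguous) layout, while reshape falls back to a copy when a view is impossible:

```python
import torch

x = torch.arange(6)
v = x.view(2, 3)          # shares the underlying data with x
v[0, 0] = 100
print(x[0])               # tensor(100) -- the original changed too

t = x.view(2, 3).t()      # transpose -> non-contiguous
# t.view(6) would raise a RuntimeError because t is not contiguous
r = t.reshape(6)          # works: reshape copies when a view is impossible
```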
torch.nn.functional.pad
torch¶
Padding starts from the last dimension of input and works forward (toward earlier dimensions).
The last $\left\lfloor\frac{\text{len(pad)}}{2}\right\rfloor$ dimensions of input are padded.
- To pad only the last dimension of input, pad has the form (padding_left, padding_right)
- To pad the last 2 dimensions of input, pad has the form (padding_left, padding_right, padding_top, padding_bottom)
- To pad the last 3 dimensions of input, pad has the form (padding_left, padding_right, padding_top, padding_bottom, padding_front, padding_back)
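The three cases above, checked by output shape:

```python
import torch
import torch.nn.functional as F

x = torch.ones(2, 3, 4)                     # shape (2, 3, 4)

# Pad only the last dimension: (left, right)
print(F.pad(x, (1, 2)).shape)               # torch.Size([2, 3, 7])

# Pad the last two dimensions: (left, right, top, bottom)
print(F.pad(x, (1, 2, 3, 4)).shape)         # torch.Size([2, 10, 7])

# Pad the last three dimensions: (left, right, top, bottom, front, back)
print(F.pad(x, (1, 2, 3, 4, 5, 6)).shape)   # torch.Size([13, 10, 7])
```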