JERRYLSU

GPT Training
by Jerry Su, posted on Feb 20, 2024

Teacher Forcing

Exposure bias

  • Optimal path: one wrong step, and every step after it goes wrong.

Scheduled sampling

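  • Enables error correction. Early in training, the ground truth is chosen as the target with higher probability; as training proceeds, the model's own predicted result is chosen as the target with increasing probability, eventually gradually making …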
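
The schedule described above can be sketched roughly as follows. This is a minimal illustration, not the post's own code: the linear decay, the function names `teacher_forcing_prob` and `choose_token`, and the `floor` parameter are all assumptions made for the example.

```python
import random


def teacher_forcing_prob(step, total_steps, floor=0.0):
    """Probability of using the ground truth at a given training step.

    Assumed schedule: linear decay from 1.0 down to `floor`.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return max(floor, 1.0 - frac)


def choose_token(ground_truth, predicted, step, total_steps, rng=random):
    """Pick the ground truth with probability p, otherwise the model's prediction."""
    p = teacher_forcing_prob(step, total_steps)
    return ground_truth if rng.random() < p else predicted
```

At step 0 the ground truth is always used, and by the final step the model's own prediction is always used. Other decay shapes (exponential, inverse sigmoid) would slot into `teacher_forcing_prob` the same way.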

Read in 1 min
Nucleus Sampling (Top-p Sampling)
by Jerry Su, posted on Feb 20, 2024

1. Temperature Scaling

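  • To adjust the "sharpness" of the probability distribution, a temperature parameter can be introduced. At higher temperatures the distribution becomes flatter, increasing the low-probability …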
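
As a small sketch of the effect described above, the logits can be divided by the temperature before the softmax. The function name `softmax_with_temperature` and the example logits are illustrative assumptions, not taken from the post.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits / temperature; higher temperature flattens the result."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
```

With these inputs, `flat` assigns more probability to the lowest-scoring token than `sharp` does, while the ranking of the tokens is unchanged.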

Read in 3 mins
CLIP
by Jerry Su, posted on Jan 04, 2024

CLIP(Contrastive Language-Image Pretraining)

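Vision Vocabulary: CLIP's visual vocabulary is the collection of image representations and semantic information that the model learns from large-scale image and text data through contrastive learning.
 During training, CLIP learns to embed images into a high-dimensional space and to classify or retrieve images in that space through text descriptions. This visual vocabulary …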
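
To illustrate classification through text descriptions in a shared embedding space, here is a toy sketch. The three-dimensional vectors are hand-made stand-ins, not real CLIP outputs, and the helper names `cosine_similarity` and `classify` are assumptions for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(image_embedding, text_embeddings):
    """Return the text description whose embedding is closest to the image."""
    return max(
        text_embeddings,
        key=lambda label: cosine_similarity(image_embedding, text_embeddings[label]),
    )


# Hypothetical embeddings; a real model would produce these from its encoders.
text_embeddings = {
    "a photo of a cat": [0.9, 0.1, 0.0],
    "a photo of a dog": [0.1, 0.9, 0.0],
}
image_embedding = [0.8, 0.2, 0.1]
best = classify(image_embedding, text_embeddings)
```

Retrieval works the same way in the other direction: fix one text embedding and rank a set of image embeddings by similarity to it.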

Read in 3 mins
SELF-INSTRUCT: Aligning Language Model with Self Generated Instructions
by Jerry Su, posted on Apr 29, 2023

Self-Instruct is a framework that helps language models improve their ability to follow natural language instructions. It does this by using the model’s own generations to create a large collection of instructional data. With Self-Instruct, it is possible to improve the instruction-following capabilities of language models without relying on …

Read in 2 mins
Paddle adversarial train
by Jerry Su, posted on Nov 03, 2022

Reason is the light and the light of life.

Read in 2 mins
Longformer Torch2Paddle
by Jerry Su, posted on May 10, 2022

Read in 2 mins
Paddle.nn.functional.unfold
by Jerry Su, posted on May 07, 2022

Read in 1 min
PyTorch tensor.stride & tensor.as_strided
by Jerry Su, posted on May 04, 2022

Read in 2 mins
PyTorch View vs Reshape
by Jerry Su, posted on May 03, 2022

Read in 1 min
pytorch.nn.functional.pad
by Jerry Su, posted on May 02, 2022

Read in 2 mins