【RL】Policy Gradient

1. Reinforcement Learning

  • Actor (Policy)

    A neural network as the Actor (deep) vs. a lookup table (Q-learning).

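What advantage does a neural network Actor have over a lookup table?

A lookup table cannot enumerate every possible input, e.g. image frames or language input …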
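To make the contrast concrete, here is a minimal sketch in plain Python. The class names `TableActor` and `LinearActor` are hypothetical, and the "network" is reduced to a single linear layer with softmax, which is an assumption made to keep the example short. It only illustrates the point above: a table has nothing to say about a state it has never stored, while a parametric actor produces an output for any input vector.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TableActor:
    """Lookup table: one row of action preferences per stored state."""

    def __init__(self, n_actions):
        self.n_actions = n_actions
        self.table = {}

    def action_probs(self, state):
        prefs = self.table.get(tuple(state))
        if prefs is None:
            # Unseen state: no entry exists, so fall back to a uniform guess.
            return [1.0 / self.n_actions] * self.n_actions
        return softmax(prefs)


class LinearActor:
    """Stand-in for a neural network: one linear layer followed by softmax."""

    def __init__(self, n_features, n_actions, seed=0):
        rng = random.Random(seed)
        self.weights = [
            [rng.gauss(0.0, 0.1) for _ in range(n_features)]
            for _ in range(n_actions)
        ]

    def action_probs(self, state):
        scores = [sum(w * x for w, x in zip(row, state)) for row in self.weights]
        return softmax(scores)


state_a = [0.2, 0.9, 0.1]
state_b = [0.2, 0.9, 0.1001]  # almost the same input, but a different table key

table_actor = TableActor(n_actions=2)
table_actor.table[tuple(state_a)] = [2.0, 0.0]  # preference stored for state_a only
net_actor = LinearActor(n_features=3, n_actions=2)

table_probs_b = table_actor.action_probs(state_b)
net_probs_a = net_actor.action_probs(state_a)
net_probs_b = net_actor.action_probs(state_b)
```

Here `table_probs_b` is the uniform fallback because `state_b` was never stored, whereas `net_probs_a` and `net_probs_b` come out nearly equal because the two inputs are nearly equal. With image or text input the number of distinct states grows far beyond what a table could hold.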
more ...



【NLP】RASA Featurizer

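  • How are the stories in the training set turned into states that serve as training input?

  • How is a constructed state encoded as the input X?

  • What is the output y, and how is it encoded?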

https://rasa.com/docs/rasa/api …

more ...

【NLP】Embedding

content
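    from tensorflow import keras
    from tensorflow.keras import layers

    # vocab_size and maxlen are expected to be defined by the preprocessing step.
    embedding_dim = 16

    model = keras.Sequential([
      # Map each integer word index to a trainable embedding_dim-dimensional vector.
      layers.Embedding(vocab_size, embedding_dim, input_length=maxlen),
      # Average over the sequence (steps) dimension: one vector per example.
      layers.GlobalAveragePooling1D(),
      layers.Dense(16, activation='relu'),
      layers.Dense(1, activation='sigmoid')
    ])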
    

    Embedding: This layer takes the integer-encoded vocabulary and looks up the embedding vector for each word-index. These vectors are learned as the …
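The lookup described above can be sketched without Keras as plain row indexing into a table of shape (vocab_size, embedding_dim). The helper names `make_embedding_table` and `embed` are hypothetical, and the random values are only a stand-in for weights that would normally be learned during training.

```python
import random


def make_embedding_table(vocab_size, embedding_dim, seed=0):
    """One row per word index; random values stand in for learned weights."""
    rng = random.Random(seed)
    return [
        [rng.uniform(-0.05, 0.05) for _ in range(embedding_dim)]
        for _ in range(vocab_size)
    ]


def embed(table, token_ids):
    """Replace each integer word index with its row of the table."""
    return [table[i] for i in token_ids]


embedding_table = make_embedding_table(vocab_size=10, embedding_dim=16)
sequence = [3, 7, 3, 0]  # an integer-encoded sentence (made-up indices)
vectors = embed(embedding_table, sequence)
```

`vectors` holds one 16-dimensional vector per position in `sequence`, and both occurrences of index 3 map to the same row, which is why the same word always gets the same embedding.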

    more ...

【NLP】GlobalAveragePooling1D

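content
    from tensorflow import keras
    from tensorflow.keras import layers

    # vocab_size and maxlen are expected to be defined by the preprocessing step.
    embedding_dim = 16

    model = keras.Sequential([
      # Map each integer word index to a trainable embedding_dim-dimensional vector.
      layers.Embedding(vocab_size, embedding_dim, input_length=maxlen),
      # Average over the sequence (steps) dimension: one vector per example.
      layers.GlobalAveragePooling1D(),
      layers.Dense(16, activation='relu'),
      layers.Dense(1, activation='sigmoid')
    ])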
      

    GlobalAveragePooling1D: returns a fixed-length output vector for each example by averaging over the steps dimension. This allows the model to handle input …
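A minimal plain-Python sketch of the averaging step for a single example, assuming the input is a list of steps and each step is a list of features. The function name `global_average_pooling_1d` is hypothetical, and the sketch ignores batching and masking, which the Keras layer handles itself.

```python
def global_average_pooling_1d(steps):
    """Average a (steps, features) example over the steps dimension."""
    n_steps = len(steps)
    n_features = len(steps[0])
    return [
        sum(step[j] for step in steps) / n_steps
        for j in range(n_features)
    ]


# Two examples with different numbers of steps but the same feature size.
short_example = [[1.0, 2.0], [3.0, 4.0]]
long_example = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [6.0, 4.0]]

pooled_short = global_average_pooling_1d(short_example)  # [2.0, 3.0]
pooled_long = global_average_pooling_1d(long_example)    # [3.0, 1.0]
```

Both results have length 2 even though the inputs have 2 and 4 steps, so the output length depends only on the feature size and not on the sequence length.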

    more ...