Log-bilinear Language Model

The log-bilinear language model computes the probability of the next word $w_i$ given the previous words (the context) as follows:

$$ P(w_i = w | w_{i-1}, ..., w_1) = \frac{ \exp\{ \phi(w)^T c \} }{ \sum_{w^\prime \in V} \exp\{ \phi(w^\prime)^T c \} } $$

Here $\phi(w)$ is the word vector of $w$, and $c$ is the context vector for $w_i$, computed as

$$ c = \sum_{n=1}^{i-1} \alpha_n \phi(w_n) $$

Thus, the log-bilinear language model computes the context vector as a weighted linear combination of the previous word vectors. The distribution of the next word $w_i$ is then obtained from the similarity (inner product) between each word embedding $\phi(w)$ and the context vector, by taking a softmax over the vocabulary $V$.
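As a concrete illustration, here is a minimal PyTorch sketch of this model. It is not from the original post: the class name `LogBilinearLM`, the fixed-size context window `n_ctx` (in place of the full history $w_1, ..., w_{i-1}$), and treating the weights $\alpha_n$ as learned scalars are all assumptions made for the example.

```python
import torch
import torch.nn as nn

class LogBilinearLM(nn.Module):
    """Hypothetical minimal log-bilinear language model (illustrative sketch)."""

    def __init__(self, vocab_size, embed_dim, n_ctx):
        super().__init__()
        self.phi = nn.Embedding(vocab_size, embed_dim)  # word vectors phi(w)
        self.alpha = nn.Parameter(torch.ones(n_ctx))    # per-position weights alpha_n (assumed learnable)

    def forward(self, context_ids):
        # context_ids: (batch, n_ctx) indices of the previous words
        ctx_vecs = self.phi(context_ids)                       # (batch, n_ctx, embed_dim)
        c = (self.alpha[None, :, None] * ctx_vecs).sum(dim=1)  # c = sum_n alpha_n * phi(w_n)
        logits = c @ self.phi.weight.T                         # phi(w)^T c for every w in V
        return torch.log_softmax(logits, dim=-1)               # log P(w_i = w | context)

# Usage: a distribution over a 10,000-word vocabulary from 3 context words.
model = LogBilinearLM(vocab_size=10_000, embed_dim=64, n_ctx=3)
log_probs = model(torch.randint(0, 10_000, (2, 3)))  # shape (2, 10000)
```

Note that the sketch reuses the same embedding matrix $\phi$ for scoring the output, which matches the formula above; many implementations instead learn a separate output embedding.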

Application

The log-bilinear model is used in the CPC (Contrastive Predictive Coding) paper as the scoring function for its contrastive predictions.
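For reference, the scoring function in the CPC paper (van den Oord et al., 2018) has exactly this log-bilinear form. With $z_{t+k}$ the encoding of a future observation, $c_t$ the context vector from the autoregressive model, and $W_k$ a learned linear map for prediction step $k$:

$$ f_k(x_{t+k}, c_t) = \exp\{ z_{t+k}^T W_k c_t \} $$

Here $z_{t+k}$ plays the role of the word embedding $\phi(w)$ above, and $W_k c_t$ plays the role of the context $c$.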
