
Prerequisite

Transformer


Positional Encoding vs Positional Embedding

Encoding

How to compute similarity between words

Embedding (representation learning): mapping a discrete type to a point in a vector space (a dense, real-valued vector). When the discrete types are words, the dense vector representation is called a "word embedding".
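
A minimal sketch of this idea, using a hypothetical toy vocabulary and randomly initialized dense vectors (in practice the vectors are learned), with cosine similarity as one way to measure the word similarity mentioned above:

```python
import numpy as np

# Hypothetical toy vocabulary: each discrete word type gets an index.
vocab = {"king": 0, "queen": 1, "apple": 2}
embedding_dim = 8

# Embedding table: one dense, real-valued vector per word type.
# (Random here for illustration; a model would learn these values.)
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(len(vocab), embedding_dim))

def embed(word: str) -> np.ndarray:
    """Map a discrete word type to its dense vector (table lookup by index)."""
    return embedding_table[vocab[word]]

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """One common way to score similarity between two word vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine_similarity(embed("king"), embed("queen")))
print(cosine_similarity(embed("king"), embed("apple")))
```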

Distributional contextual representation: captures both the order of words (syntactic/contextual information) and their distributional positioning (distributional semantics), i.e. the syntactic (grammar) and semantic (meaning) aspects of language (as in the Cloze task). (cf. Word2vec and GloVe, which learn semantic representations solely from the distributional properties of large amounts of text.)
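
A rough sketch of the distributional idea behind Word2vec/GloVe-style methods: a word is characterized by the contexts it co-occurs with. The toy corpus and window size below are illustrative assumptions, not taken from the post.

```python
import numpy as np

# Tiny illustrative corpus (an assumption for this sketch).
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
]
tokens = [sentence.split() for sentence in corpus]
vocab = sorted({w for sent in tokens for w in sent})
index = {w: i for i, w in enumerate(vocab)}

# Count co-occurrences within a symmetric window of size 2.
window = 2
counts = np.zeros((len(vocab), len(vocab)))
for sent in tokens:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                counts[index[w], index[sent[j]]] += 1

# Each row is a distributional representation of a word: words that
# appear in similar contexts get similar rows ("cat" and "dog" here),
# which methods like Word2vec/GloVe compress into dense vectors.
print(vocab)
print(counts[index["cat"]])
print(counts[index["dog"]])
```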

If the model can see which word it is supposed to predict, it will not learn the contextual information; only when the model cannot tell during training which word is to be predicted is it forced to infer that word from its context, and only such a model acquires the ability to represent the features of a sentence.
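
A minimal sketch of this Cloze-style (masked language modeling) setup, in which the target word is hidden so the model must rely on context to recover it; the whitespace tokenization and 15% masking rate below are illustrative assumptions:

```python
import random

MASK_TOKEN = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Hide a fraction of the tokens; the hidden words become prediction targets.

    Because the model never sees the original word at a masked position,
    it can only recover it by using the surrounding context.
    """
    rng = random.Random(seed)
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK_TOKEN)
            targets.append(tok)   # the model is trained to predict this word
        else:
            inputs.append(tok)
            targets.append(None)  # no loss at unmasked positions
    return inputs, targets

sentence = "the quick brown fox jumps over the lazy dog".split()
print(mask_tokens(sentence))
```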

