NLP — BERT & Transformer

[Figures: query understanding after using BERT, and sample text generated from the source with a GPT-2 model]

Encoder-Decoder & Sequence to Sequence

  • Learning long-range context with an RNN becomes more difficult as the distance between the dependent words increases.
  • An RNN is also directional. When the disambiguating context appears later in the sentence, a backward (right-to-left) RNN may have a better chance of guessing a word such as “win” correctly; a small sketch follows below.
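As a rough illustration of these two limitations, the sketch below (a toy PyTorch example; the layer sizes and the sentence are my own assumptions, not from the article) contrasts a unidirectional GRU, whose state at each position only summarizes the tokens to its left, with a bidirectional GRU, whose backward pass also summarizes the tokens to the right:

import torch
import torch.nn as nn

torch.manual_seed(0)

vocab = {"<pad>": 0, "the": 1, "team": 2, "played": 3, "well": 4,
         "and": 5, "deserved": 6, "to": 7, "win": 8}
sentence = ["the", "team", "played", "well", "and", "deserved", "to", "win"]
ids = torch.tensor([[vocab[w] for w in sentence]])      # shape (1, seq_len)

embed = nn.Embedding(num_embeddings=len(vocab), embedding_dim=16)

# Unidirectional GRU: the hidden state at position t sees only tokens 0..t.
uni = nn.GRU(input_size=16, hidden_size=32, batch_first=True)

# Bidirectional GRU: the backward half of the output at position t
# also summarizes tokens t..end, i.e. the context to the right.
bi = nn.GRU(input_size=16, hidden_size=32, batch_first=True, bidirectional=True)

x = embed(ids)
uni_out, _ = uni(x)    # (1, 8, 32)
bi_out, _ = bi(x)      # (1, 8, 64): forward and backward states concatenated

print(uni_out.shape, bi_out.shape)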

Attention
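The core operation behind attention, and the one the Transformer builds on, is scaled dot-product attention: Attention(Q, K, V) = softmax(QKᵀ / √d_k)·V. Below is a minimal PyTorch sketch of that formula; the tensor shapes are arbitrary choices for illustration:

import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # softmax(Q Kᵀ / sqrt(d_k)) V: every query attends to every key.
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)    # (batch, seq_q, seq_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)                  # attention weights
    return weights @ v, weights

# Toy example: 1 sequence of 5 tokens with 64-dimensional queries, keys, values.
q = torch.randn(1, 5, 64)
k = torch.randn(1, 5, 64)
v = torch.randn(1, 5, 64)
out, weights = scaled_dot_product_attention(q, k, v)
print(out.shape, weights.shape)   # (1, 5, 64) and (1, 5, 5)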

Transformer

Transformer Encoder

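To connect the encoder diagrams to running code: each encoder block applies multi-head self-attention followed by a position-wise feed-forward network, with a residual connection and layer normalization around each. The sketch below stacks such blocks using PyTorch's built-in modules; every hyperparameter here is an illustrative assumption:

import torch
import torch.nn as nn

d_model, n_heads, n_layers = 256, 8, 6

# One encoder block: multi-head self-attention + feed-forward network,
# each wrapped in a residual connection and layer normalization.
encoder_layer = nn.TransformerEncoderLayer(
    d_model=d_model, nhead=n_heads, dim_feedforward=1024, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)

tokens = torch.randn(2, 10, d_model)   # (batch, seq_len, d_model) token embeddings
contextual = encoder(tokens)           # same shape, now context-dependent
print(contextual.shape)                # torch.Size([2, 10, 256])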

Transformer Decoder (Optional)

NLP Tasks

[Figure: a question-answering example from the SQuAD dataset]

BERT (Bidirectional Encoder Representations from Transformers)

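Because BERT keeps only the Transformer encoder, every output vector is conditioned on the words to both its left and its right. A minimal way to inspect those contextual representations is shown below, assuming the Hugging Face transformers library (an assumption on my part, not something the article uses):

import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("The team deserved to win.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One 768-dimensional contextual vector per WordPiece token,
# including the special [CLS] and [SEP] tokens.
print(outputs.last_hidden_state.shape)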

Pretraining

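BERT is pretrained on unlabeled text with two objectives: masked language modeling (randomly hide a fraction of the input tokens and predict them from the surrounding context in both directions) and next-sentence prediction. The sketch below shows only the masking step; the 15% masking rate comes from the BERT paper, while the toy token ids and the simplification of always substituting [MASK] (the paper sometimes keeps the token or uses a random one) are my own:

import torch

MASK_ID = 103        # id of the [MASK] token in the bert-base-uncased vocabulary
MASK_PROB = 0.15     # fraction of tokens hidden during pretraining

def mask_tokens(input_ids):
    # Select ~15% of positions; keep the originals as labels, hide them in the input.
    labels = input_ids.clone()
    selected = torch.bernoulli(torch.full(input_ids.shape, MASK_PROB)).bool()
    labels[~selected] = -100           # ignored by the cross-entropy loss
    masked_input = input_ids.clone()
    masked_input[selected] = MASK_ID   # simplified: always replace with [MASK]
    return masked_input, labels

ids = torch.randint(1000, 2000, (1, 12))   # toy WordPiece ids
masked_ids, labels = mask_tokens(ids)
print(masked_ids)
print(labels)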

Fine-tuning BERT

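Fine-tuning adds a small task-specific head on top of the pretrained encoder and trains the whole model end-to-end on labeled data, typically for only a few epochs with a small learning rate. As a sketch, the example below sets up sentence-pair classification with the Hugging Face transformers library; the sentences, the label, and the two-class setup are illustrative assumptions:

import torch
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# Adds a randomly initialized linear classifier on top of the [CLS] representation.
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

inputs = tokenizer("A man is playing a guitar.",
                   "Someone is making music.",
                   return_tensors="pt")
labels = torch.tensor([1])     # e.g. 1 = the second sentence follows from the first

outputs = model(**inputs, labels=labels)
outputs.loss.backward()        # one fine-tuning step (optimizer update omitted)
print(outputs.logits)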

SQuAD Fine-tuning

Model
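For SQuAD, the model is trained to predict two positions in the passage: the start and the end of the answer span, each scored over the passage tokens. A minimal inference sketch with the Hugging Face question-answering head is shown below (an illustrative assumption, not the article's code; the span head is randomly initialized and only gives meaningful answers after fine-tuning on SQuAD):

import torch
from transformers import BertForQuestionAnswering, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForQuestionAnswering.from_pretrained("bert-base-uncased")

question = "What does BERT stand for?"
context = ("BERT stands for Bidirectional Encoder Representations from "
           "Transformers and is pretrained on unlabeled text.")

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# The predicted answer span runs from the argmax of the start logits
# to the argmax of the end logits.
start = outputs.start_logits.argmax()
end = outputs.end_logits.argmax()
print(tokenizer.decode(inputs["input_ids"][0][start:end + 1]))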

Next

Credit and References
