Contributor
Mrityunjay Tripathi

Transformer and BERT in mlpack


Mentors
Mikhail Lozhnikov
Organization
mlpack

Connecting the encoder and decoder through an attention mechanism has produced some of the best translation models. The encoder maps an input sentence to a fixed-length vector, and the decoder generates the translation from that encoded vector. The attention mechanism allows the decoder to attend to different parts of the source sentence at each step of output generation.
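
To make the mechanism concrete, here is a minimal sketch of scaled dot-product attention written with Armadillo, the matrix library that mlpack builds on. The function name and toy shapes are illustrative only and are not part of the mlpack API.

#include <cmath>
#include <armadillo>

// Illustrative helper, not part of the mlpack API.
arma::mat ScaledDotProductAttention(const arma::mat& Q,
                                    const arma::mat& K,
                                    const arma::mat& V)
{
  // scores(i, j): how strongly query (output position) i attends to key j.
  arma::mat scores = Q * K.t() / std::sqrt((double) K.n_cols);

  // Row-wise softmax turns the scores into attention weights.
  scores.each_col() -= arma::max(scores, 1);  // subtract row maxima for stability
  arma::mat weights = arma::exp(scores);
  weights.each_col() /= arma::sum(weights, 1);

  // Each output row is a weighted sum of the value vectors.
  return weights * V;
}

int main()
{
  // Toy example: 4 target positions attending over 5 source tokens,
  // with a model dimension of 8.
  arma::mat Q(4, 8, arma::fill::randn);
  arma::mat K(5, 8, arma::fill::randn);
  arma::mat V(5, 8, arma::fill::randn);

  ScaledDotProductAttention(Q, K, V).print("attention output:");
}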

BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. This project will make transfer learning available for natural language processing tasks in mlpack using the BERT model. After pre-training, the BERT model can be fine-tuned by adding an output layer, producing models useful for translation, next-word or next-sentence prediction, question answering, and other tasks. A sketch of such an output layer follows below.
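
The sketch below, again in plain Armadillo rather than the mlpack API, shows the kind of task-specific output layer that fine-tuning adds on top of a pretrained encoder: a single affine map and softmax over a pooled sentence embedding. The embedding, weights, and class count are hypothetical placeholders.

#include <cmath>
#include <armadillo>

int main()
{
  const size_t hiddenSize = 768;  // size of a BERT-base sentence embedding
  const size_t numClasses = 2;    // e.g. "is next sentence" vs. "is not"

  // Pooled output of the pretrained encoder for one sentence (assumed given).
  arma::vec pooled(hiddenSize, arma::fill::randn);

  // The added output layer: an affine map followed by softmax.
  arma::mat W(numClasses, hiddenSize, arma::fill::randn);
  arma::vec b(numClasses, arma::fill::zeros);

  arma::vec logits = W * pooled + b;
  arma::vec probs = arma::exp(logits - logits.max());
  probs /= arma::accu(probs);

  probs.print("class probabilities:");
}

During fine-tuning, the weights of this output layer and of the pretrained encoder are updated together on the labeled downstream data.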