Question Answering With Large Language Models
Abstract
In the field of open-domain question answering, Transformer-based models, especially BERT (Bidirectional Encoder Representations from Transformers), a state-of-the-art (SOTA) architecture in Natural Language Processing (NLP), have shown remarkable performance. Within a few weeks of its release, BERT reached the top of the leaderboards for nearly all NLP tasks, and its variants continue to reaffirm that strength today. This can be considered a great leap for Google in the field of natural language processing, and it has also given a strong push to work on NLP problems for the Vietnamese language.
In this thesis, we focus on finding a suitable model for Vietnamese question-answering tasks. The model we concentrate on is BLOOMZ-1b1, a large multilingual language model fine-tuned on a mixture of 13 training tasks in 46 languages with English prompts. We fine-tune BLOOMZ-1b1 with LoRA (Low-Rank Adaptation of Large Language Models) and build a user interface around the resulting model to provide a question-answering application.
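
As a brief illustration of the fine-tuning setup described above, the sketch below shows how a LoRA adapter can be attached to bloomz-1b1 using the Hugging Face transformers and peft libraries. The rank, scaling factor, and dropout values are illustrative assumptions, not the thesis's final hyperparameters.

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, TaskType, get_peft_model

    # Load the base multilingual model from the Hugging Face Hub.
    model_name = "bigscience/bloomz-1b1"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # LoRA configuration: only small low-rank update matrices are trained,
    # while the base model's weights stay frozen. The values below are
    # assumed for illustration, not the thesis's final hyperparameters.
    lora_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        r=8,                                 # rank of the low-rank updates
        lora_alpha=16,                       # scaling factor for the updates
        lora_dropout=0.05,                   # dropout inside the LoRA layers
        target_modules=["query_key_value"],  # BLOOM's fused attention projection
    )

    # Wrap the base model; only the LoRA parameters remain trainable.
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()

Because LoRA trains only the added low-rank matrices, the number of trainable parameters is a small fraction of the 1.1 billion parameters in the base model, which makes fine-tuning feasible on modest hardware.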