NLP and Transformer Models

Description

In this project I focused on natural language processing and transformer models. I first evaluated GPT‑2 models of different sizes on the CoLA dataset for grammatical acceptability, showing that larger models performed better and that averaging log probabilities over tokens gave more reliable acceptability scores than raw summed log probabilities, since averaging normalizes for sentence length. I then compared GPT‑2 with BERT and found that BERT consistently outperformed GPT‑2, even without fine‑tuning, thanks to its bidirectional architecture.
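The scoring idea can be sketched roughly as follows with the Hugging Face transformers library; the helper function name and the example sentences are illustrative and the exact evaluation code in the project files may differ.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def avg_log_prob(sentence: str) -> float:
    """Length-normalized log probability of a sentence under GPT-2 (illustrative helper)."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        out = model(enc["input_ids"], labels=enc["input_ids"])
    # out.loss is the mean token-level cross-entropy, so its negation is the
    # average log probability per token, i.e. normalized for sentence length.
    return -out.loss.item()

# A grammatical sentence should score higher (less negative) than a scrambled one.
print(avg_log_prob("The cat sat on the mat."))
print(avg_log_prob("Cat the mat on sat the."))
```

Swapping "gpt2" for "gpt2-medium" or "gpt2-large" reproduces the model-size comparison, and a decision threshold on this score can be tuned on the training split to predict acceptability.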
Finally, I fine‑tuned BERT-large on CoLA for sequence classification, which significantly boosted performance, reaching about 83.6% accuracy. My conclusion was that while model size helps, architecture matters even more, and fine‑tuning makes BERT the most effective approach for this task.
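A minimal fine-tuning sketch in the same spirit, using the GLUE CoLA split from the datasets library and the Trainer API; the hyperparameters below (learning rate, batch size, epochs, sequence length) are illustrative assumptions rather than the exact settings used in the project.

```python
import numpy as np
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

# CoLA: single sentences labeled 1 (acceptable) or 0 (unacceptable).
dataset = load_dataset("glue", "cola")
tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased")

def tokenize(batch):
    # Hyperparameters here are illustrative, not the project's exact settings.
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=64)

encoded = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-large-uncased", num_labels=2)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": (preds == labels).mean()}

args = TrainingArguments(
    output_dir="bert-large-cola",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
    compute_metrics=compute_metrics,
)

trainer.train()
print(trainer.evaluate())  # reports accuracy on the validation split
```

GLUE's official metric for CoLA is Matthews correlation; the sketch reports plain accuracy to match the figure quoted above.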

Project Information
Tags: Natural Language Processing, Transformers, GPT-2, BERT, CoLA Dataset, Fine-Tuning, Language Models, Text Classification
Status: Completed