Paper Type

Complete

Abstract

The demand for language models has increased drastically with their popularization in recent years. In natural language processing, this popularization has created a demand for models that serve low-resource languages, so that populations in emerging countries also have access to this type of technology. Nevertheless, most existing models are developed predominantly with English resources and struggle to adapt their knowledge to the complexities of underrepresented languages. This work evaluates the current landscape of multilingual and language-specific models, Aya and Sabiá-7B, focusing on their application and performance in Brazilian Portuguese on Aspect-Based Sentiment Analysis (ABSA), Hate Speech Detection (HS), Irony Detection (ID), and Question Answering (QA) tasks. In our experiments, the Portuguese-focused Sabiá-7B model showed promising results on datasets built from native Portuguese examples, while the multilingual Aya model achieved the best results on the QA task when using texts translated from English.

Paper Number

1795

Author Connect URL

https://authorconnect.aisnet.org/conferences/AMCIS2024/papers/1795

Comments

LACAIS

Aug 16th, 12:00 AM

A Systematic Analysis of Multilingual and Low-Resource Languages Models: A Review on Brazilian Portuguese

