Emotion Classification in Indonesian Text Using IndoBERT
Abstract
Mental health problems affect many people worldwide. A 2018 WHO report noted an increase in deaths by suicide, at a rate of one every 40 seconds. The Ipsos Global 2023 survey found that 44% of respondents across 31 countries are concerned about mental health, and 30% identified stress as a major issue. In Indonesia the situation is also serious: the 2022 I-NAMHS survey found that 34.9% of adolescents face mental health problems, but only 2.6% of them use counseling services. Detecting emotion in text is challenging because text lacks cues such as facial expression and voice modulation. This study classifies emotions in Indonesian text using the IndoBERT model. The dataset consists of 5079 tweets with five emotion labels: Angry, Fear, Joy, Love, and Sad. The experiments vary the training, validation, and test split (80:10:10, 70:15:15, and 60:20:20) and the combination of learning rate (1e-2 to 1e-7) and batch size (8, 16, and 32). The model was trained for up to 25 epochs with early stopping and a patience of 5 epochs. The 80:10:10 split with a learning rate of 1e-6 and a batch size of 8 gave the best classification results. Although some experiments showed signs of overfitting, this research has important implications for the early detection of emotions and can support mental health treatment efforts.
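The experimental setup summarized above can be made concrete with a short sketch. This is not the authors' code. It assumes the learning rates are sampled in decade steps between 1e-2 and 1e-7, that early stopping monitors validation loss, and that split sizes are rounded down with the remainder going to the test set. The names `split_sizes` and `stopping_epoch` are hypothetical helpers introduced for illustration only.

```python
from itertools import product

N_TWEETS = 5079      # dataset size stated in the abstract
MAX_EPOCHS = 25      # epoch budget stated in the abstract
PATIENCE = 5         # early-stopping patience stated in the abstract

# Assumption: decade steps 1e-2, 1e-3, ..., 1e-7 (the abstract gives only the range).
LEARNING_RATES = [10.0 ** -k for k in range(2, 8)]
BATCH_SIZES = [8, 16, 32]

# Every learning-rate / batch-size pair that would be tried for one data split.
GRID = list(product(LEARNING_RATES, BATCH_SIZES))


def split_sizes(n, ratios):
    """Turn percentage ratios (train, val, test) into example counts.

    Assumption: train and validation counts are rounded down and the
    remainder is assigned to the test set so the counts sum to n.
    """
    train_pct, val_pct, _ = ratios
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    return n_train, n_val, n - n_train - n_val


def stopping_epoch(val_losses, patience=PATIENCE, max_epochs=MAX_EPOCHS):
    """Return the 1-based epoch at which training would stop.

    Assumption: the monitored metric is validation loss, and training
    stops once it has failed to improve for `patience` epochs in a row.
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return min(len(val_losses), max_epochs)


if __name__ == "__main__":
    print(len(GRID), "configurations per split")
    print("80:10:10 sizes:", split_sizes(N_TWEETS, (80, 10, 10)))
```

Under these assumptions each split is paired with 18 learning-rate and batch-size combinations. In an actual fine-tuning run the stopping rule would normally be provided by the training framework rather than written by hand.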