ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, vol. 24, no. 10, 2025 (SCI-Expanded, Scopus)
Keywords at the beginning of research articles are crucial for conveying the content and main ideas of academic works. They serve as essential tools for researchers to efficiently search for relevant topics. The integration of traditional natural language processing (NLP) techniques with modern deep learning methods has significantly advanced keyword extraction across various languages. However, extracting keywords from Arabic research poses unique challenges due to the language's complex morphology, rich semantics, and the scarcity of available resources. This study aims to identify and analyze keywords in a sample of studies from 480 peer-reviewed Arabic journals across diverse scientific fields, leading to the development of a specialized dataset for Arabic keyword extraction. The dataset comprises 38,728 records, each containing an abstract and author-specified keywords. We utilized this novel dataset to train and evaluate several BERT-based models tailored for the Arabic language, including bertBaseArabic, bertBaseQarib, camelBERT, multidialect-BERT, and baseAraBERT. Prior to training, the dataset underwent comprehensive pre-processing, including data cleaning, lemmatization, and binary tagging. The keyword extraction task was transformed into a binary classification problem, in which each token was labeled as either a keyword or a non-keyword, simplifying the learning process. Experimental results indicate that the bertBaseQarib model achieved the highest performance, with an F1-score of 0.951, precision of 0.968, and recall of 0.940. This study highlights the effectiveness of BERT-based models in Arabic keyword extraction and emphasizes the importance of large, diverse datasets in achieving robust NLP outcomes. Future work will focus on expanding the dataset and optimizing model architectures to further improve performance.
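The binary-tagging formulation described above can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: it assumes whitespace tokenization and simple case-folding (the study applies lemmatization and Arabic-specific cleaning first), and the function name `binary_tag` is hypothetical.

```python
def binary_tag(abstract, keywords):
    """Label each abstract token 1 if it occurs in any author-specified
    keyword phrase, else 0 (keyword vs. non-keyword classification).

    Simplified sketch: whitespace tokenization and lowercasing stand in
    for the lemmatization and cleaning used in the study itself.
    """
    # Collect the individual tokens of every keyword phrase.
    kw_tokens = {t.lower() for kw in keywords for t in kw.split()}
    # Tag each abstract token, stripping trailing punctuation for matching.
    return [(tok, 1 if tok.lower().strip(".,;:") in kw_tokens else 0)
            for tok in abstract.split()]

tags = binary_tag(
    "Deep learning improves keyword extraction.",
    ["keyword extraction", "deep learning"],
)
# → [('Deep', 1), ('learning', 1), ('improves', 0),
#    ('keyword', 1), ('extraction.', 1)]
```

A sequence labeler such as a fine-tuned BERT model is then trained to predict these 0/1 tags per token, which is simpler than generating keyword phrases directly.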