This study tackles body shaming on Reddit using a novel dataset of 8,067 comments collected from June to November 2024, covering both externally directed and self-directed harmful discourse. We evaluate traditional Machine Learning (ML), Deep Learning (DL), and transformer-based Large Language Model (LLM) approaches for detection, measured by accuracy, F1-score, and Area Under the Curve (AUC). A fine-tuned Psycho-Robustly Optimized BERT Pretraining Approach (Psycho-RoBERTa), pre-trained on psychological texts, performs best (accuracy: 0.98, F1-score: 0.994, AUC: 0.990), surpassing Extreme Gradient Boosting (XGBoost) (accuracy: 0.972) and a Convolutional Neural Network (CNN) (accuracy: 0.979) owing to its contextual sensitivity. Local Interpretable Model-agnostic Explanations (LIME) add transparency by identifying influential terms such as “fat” and “ugly,” and a term co-occurrence network graph uncovers semantic links, such as between “shame” and “depression,” revealing broader discourse patterns. By targeting Reddit’s anonymity-driven subreddits, the dataset fills a platform-specific gap. Integrating LLMs, LIME, and graph analysis, we develop scalable tools for real-time moderation that foster inclusive online spaces. Limitations include the Reddit-specific data and the risk of missing implicit shaming. Future research should explore multi-platform datasets and few-shot learning. These findings advance Natural Language Processing (NLP) for cyberbullying detection, promoting safer social media environments.
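
The detection-plus-explanation pipeline can be illustrated with a minimal sketch: a RoBERTa-family classifier wrapped in a probability function that LIME perturbs to score term influence. The checkpoint name `mental/mental-roberta-base` is an assumed stand-in for the paper's Psycho-RoBERTa model, and the label set and example comment are illustrative, not drawn from the study's data.

```python
# Minimal sketch: transformer classifier + LIME term attribution.
# "mental/mental-roberta-base" is an ASSUMED placeholder for the paper's
# fine-tuned Psycho-RoBERTa checkpoint; swap in the actual model.
import numpy as np
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from lime.lime_text import LimeTextExplainer

MODEL_NAME = "mental/mental-roberta-base"  # assumption, not the paper's exact model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

def predict_proba(texts):
    """Return class probabilities for a batch of strings (the API LIME expects)."""
    enc = tokenizer(list(texts), padding=True, truncation=True,
                    max_length=256, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    return torch.softmax(logits, dim=-1).numpy()

explainer = LimeTextExplainer(class_names=["neutral", "body_shaming"])
comment = "You're so fat and ugly, nobody wants to see you post here."
explanation = explainer.explain_instance(comment, predict_proba, num_features=6)
for term, weight in explanation.as_list():
    print(f"{term:>12s}  {weight:+.3f}")  # surfaces influential terms, e.g. "fat", "ugly"
```

LIME fits a local linear surrogate over perturbed copies of the comment, so the printed weights indicate how much each token pushed the prediction toward or away from the shaming class, which is what supports the transparency claim above.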
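
The term co-occurrence network can be sketched similarly: count term pairs that appear in the same comment and keep frequent pairs as weighted edges. The tokenization, stopword list, threshold, and three-comment corpus below are illustrative assumptions standing in for the study's 8,067-comment dataset and its exact procedure.

```python
# Minimal sketch: term co-occurrence network over a toy corpus.
# Corpus, stopwords, and the co-occurrence threshold are ASSUMPTIONS
# for illustration; the study's actual preprocessing may differ.
import itertools
import re
from collections import Counter
import networkx as nx

comments = [
    "the shame and depression from these posts is real",
    "constant shame about weight feeds depression",
    "ugly comments cause real depression",
]  # placeholder for the paper's Reddit comments

STOPWORDS = {"the", "and", "from", "these", "is", "about", "cause", "real"}

def tokens(text):
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

# Count how often each unordered pair of terms co-occurs within one comment.
pair_counts = Counter()
for c in comments:
    for a, b in itertools.combinations(sorted(set(tokens(c))), 2):
        pair_counts[(a, b)] += 1

# Keep pairs co-occurring at least twice; edge weight = co-occurrence count.
G = nx.Graph()
for (a, b), n in pair_counts.items():
    if n >= 2:
        G.add_edge(a, b, weight=n)

print(list(G.edges(data=True)))  # e.g. a ("shame", "depression") link emerges
```

Edges that survive the threshold expose the semantic links the abstract mentions, such as “shame” and “depression” recurring together, which is how the graph reveals discourse-level patterns beyond individual comments.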