Information Extraction

The volume of Farsi information on the Internet has grown in recent years, but most of it takes the form of unstructured or semi-structured free text. Information extraction methods are essential for building knowledge bases that give quick and accurate access to the knowledge contained in these texts. Relation extraction, a sub-task of information extraction, has received considerable attention in recent years. While many such systems have been developed for English and other widely studied languages, information extraction in Farsi has received less attention from researchers. This study presents a semi-automatic relation extraction system that uses Persian Wikipedia articles as reliable, semi-structured sources. Relations are extracted with the help of patterns obtained automatically through a distant supervision approach. To apply distant supervision, the large Wikidata knowledge base, which is closely synchronized with Wikipedia, is used as the source of known facts. The results show an average precision of 76.81% across all relations, which indicates an improvement in precision over other methods for Farsi.
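The pattern-harvesting idea behind distant supervision can be illustrated with a minimal sketch. It is not the authors' system: the facts, sentences, relation name, and the helpers `harvest_patterns` and `extract` are all hypothetical, English toy data stands in for Persian Wikipedia text, and a simple "text between subject and object" rule stands in for whatever pattern representation the paper actually uses.

```python
import re
from collections import Counter

# Hypothetical knowledge-base facts standing in for Wikidata triples.
KB_FACTS = [
    ("Tehran", "capital_of", "Iran"),
    ("Paris", "capital_of", "France"),
]

# Hypothetical sentences standing in for Wikipedia article text.
SENTENCES = [
    "Tehran is the capital of Iran.",
    "Paris is the capital of France.",
    "Paris hosted the games.",
]


def harvest_patterns(facts, sentences):
    """Count the text found between subject and object for each relation."""
    patterns = Counter()
    for subj, rel, obj in facts:
        for sent in sentences:
            match = re.search(re.escape(subj) + r"(.+?)" + re.escape(obj), sent)
            if match:
                # A sentence mentioning both entities is assumed to express the relation.
                patterns[(rel, match.group(1).strip())] += 1
    return patterns


def extract(patterns, sentence, min_count=2):
    """Apply patterns seen at least min_count times to a new sentence."""
    found = []
    for (rel, middle), count in patterns.items():
        if count < min_count:
            continue
        m = re.match(r"(\w+) " + re.escape(middle) + r" (\w+)", sentence)
        if m:
            found.append((m.group(1), rel, m.group(2)))
    return found


PATTERNS = harvest_patterns(KB_FACTS, SENTENCES)
NEW_TRIPLES = extract(PATTERNS, "Kabul is the capital of Afghanistan.")
```

The `min_count` threshold is one simple way to discard noisy patterns, since the distant supervision assumption (any sentence containing both entities expresses the relation) does not always hold.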

2.

Enhancing Fake News Detection by Attention-Based BiLSTM and Hybrid Whale-Multi-Verse Optimization (Scientific article, Ministry of Science)

Authors: Varalakshmi K., Ashok Kumar P. M.

Source: Journal of Information Technology Management, Volume 17, Special Issue on SI: Intelligent Security and Management, 2025, pp. 168-197

Keywords: Bidirectional LSTM deep learning Fake news detection Hierarchical Hybrid Optimization Information Extraction

Subject areas: Management, Knowledge Management and IT


The proliferation of fake news, the dissemination of inaccurate information intended to deceive audiences, has become a pressing concern. Traditional approaches to fake news detection, which often focus on analyzing Twitter content, are sensitive to noise and to variations in input sequences, and their performance suffers as a result. To address these challenges, this study proposes a method called Multi-Head Attention-Hierarchical Bidirectional Long Short-Term Memory (MHA-HBiLSTM) networks. The approach has two phases, training and testing, and applies tweet pre-processing techniques such as stemming, punctuation removal, stop-word elimination, URL handling, and Twitter control removal. Features are represented with GloVe word embeddings for experimental evaluation and comparison. The MHA-HBiLSTM model combines multi-head attention with a hierarchical structure, which allows meaningful information to be extracted from Twitter data. In particular, the model uses dual-level attention mechanisms and a hierarchical structure that reflect the inherent hierarchy of documents and prioritize key material when building the document representation. The effectiveness of the proposed MHA-HBiLSTM algorithm is evaluated using the Whale & Multi-Verse (W-MVO) Optimizer approach, with tests conducted on the Kaggle and FakeNewsNet datasets. Comparison with traditional machine learning approaches and deep learning models shows that MHA-HBiLSTM performs better in fake news detection.
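The pre-processing steps named in the abstract can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the stop-word list is a tiny hypothetical sample, `crude_stem` is a toy suffix stripper standing in for a real stemmer, and "Twitter control removal" is interpreted here, as an assumption, to mean dropping retweet markers and user mentions.

```python
import re
import string

# Tiny illustrative stop-word list (hypothetical, not the one used in the paper).
STOP_WORDS = {"the", "is", "a", "an", "of", "to", "and", "in", "that"}


def crude_stem(token):
    """Strip a few common English suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess_tweet(text):
    """Return a list of cleaned, stemmed tokens for one tweet."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # URL handling: drop links
    text = re.sub(r"\brt\b", " ", text)        # assumed Twitter control: retweet marker
    text = re.sub(r"@\w+", " ", text)          # assumed Twitter control: user mentions
    # Punctuation removal (this also strips the '#' from hashtags).
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Stop-word elimination, then stemming.
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return [crude_stem(t) for t in tokens]


TOKENS = preprocess_tweet(
    "RT @user: Scientists claimed the moon is shrinking! https://t.co/abc #hoax"
)
```

The resulting tokens would then be mapped to embedding vectors (GloVe in the paper) before being fed to the model; that step is omitted here because it depends on an external embedding file.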

A Distant Supervised Approach for Relation Extraction in Farsi Texts (Scientific article, Ministry of Science)