Abstract

Q&A forums are designed to help users find useful information and access high-quality content posted by other users. Automatically identifying high-quality replies to initial posts not only provides users with appropriate content but also saves them time. Existing methods for classifying user replies by quality attempt to extract quality features from both the textual content and the metadata of the replies. This feature engineering step is time-consuming and labor-intensive. The current study addresses this problem by proposing a new deep learning model for detecting high-quality user replies using only raw textual content. Specifically, we propose a long short-term memory (LSTM) model that uses Embeddings from Language Models (ELMo) to represent words as contextual numerical vectors. We compared the effectiveness of the proposed model with that of four traditional machine learning models on two online forum datasets: the TripAdvisor forum for New York City (NYC) and the Ubuntu Linux distribution forum. Experimental results indicated that the proposed model significantly outperformed the four traditional algorithms on both datasets. Moreover, the proposed model achieved about 16% higher accuracy than the traditional algorithms trained on both textual and quality dimension features.
