Devising a Cross-Domain Model to Detect Fake Review Comments
Chen-Shan Wei1, Ping-Yu Hsu1, Chen-Wan Huang1,
Ming-Shien Cheng2(&), and Grandys Frieska Prassida1
1 Department of Business Administration, National Central University, No. 300,
Jhongda Road, Jhongli City, Taoyuan County 32001, Taiwan (R.O.C.)
984401019@cc.ncu.edu.tw
2 Department of Industrial Engineering and Management,
Ming Chi University of Technology, No. 84, Gongzhuan Road, Taishan District,
New Taipei City 24301, Taiwan (R.O.C.)
mscheng@mail.mcut.edu.tw
Abstract
Online reviews strongly influence both consumer shopping behavior and online stores' marketing strategies. Positive reviews favorably influence consumers' buying decisions, so some sellers, seeking to boost sales volume, hire spammers to write undeserved positive reviews that promote their products. Some existing studies detect fake reviews from text features and achieve high accuracy, but when the same model is tested on another dataset, its accuracy drops sharply. Related research has gradually begun to explore fake-review identification across domains; yet whether the models are built on text features or on neural networks, they suffer this loss of accuracy. Moreover, these methods do not explain why a model can be applied to cross-domain prediction. In this research, we use the fake and truthful reviews from Ott et al. (2011) and Li, Ott, Cardie, and Hovy (2014), covering three domains (hotel, restaurant, and doctor). Our cross-domain detection model is based on the Stimuli-Organism-Response (S-O-R) framework combined with LIWC (Linguistic Inquiry and Word Count), with word2vec quantization features added, to overcome the drop in accuracy. In method one, which computes S-O-R feature weights of reviews, the DNN classifier reaches an accuracy of 63.6%. In method two, which computes frequent features of word vectors, the random forest classifier reaches an accuracy of 73.75%.
Keywords: Fake reviews; Stimuli-Organism-Response (S-O-R) framework; word2vec
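The cross-domain setting described in the abstract (train on some domains, test on a domain the model has never seen) can be illustrated with a minimal leave-one-domain-out sketch. This is not the authors' model: the bag-of-words featurizer and nearest-centroid classifier below are simple stand-ins for the S-O-R/LIWC and word2vec features and for the DNN and random forest classifiers, and the toy reviews and all function names are hypothetical.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words counts; a stand-in for the LIWC/word2vec features."""
    return Counter(text.lower().split())


def train_centroids(samples):
    """samples: list of (text, label). Returns {label: averaged word-count vector}."""
    sums, counts = {}, Counter()
    for text, label in samples:
        sums.setdefault(label, Counter()).update(featurize(text))
        counts[label] += 1
    return {label: {w: c / counts[label] for w, c in vec.items()}
            for label, vec in sums.items()}


def predict(centroids, text):
    """Pick the label whose centroid has the largest dot product with the text."""
    feats = featurize(text)

    def score(label):
        return sum(centroids[label].get(w, 0.0) * c for w, c in feats.items())

    # sorted() makes tie-breaking deterministic (first label alphabetically).
    return max(sorted(centroids), key=score)


def leave_one_domain_out(data):
    """data: {domain: [(text, label), ...]}.
    For each domain, train on all other domains and report accuracy on it."""
    results = {}
    for held_out in data:
        train = [s for d, samples in data.items() if d != held_out for s in samples]
        model = train_centroids(train)
        test = data[held_out]
        correct = sum(predict(model, text) == label for text, label in test)
        results[held_out] = correct / len(test)
    return results


# Hypothetical toy reviews, invented for illustration only.
toy = {
    "hotel": [("my husband and i loved this amazing hotel", "fake"),
              ("room was small and the elevator was slow", "truthful")],
    "restaurant": [("my family and i loved this amazing place", "fake"),
                   ("service was slow and the soup was cold", "truthful")],
    "doctor": [("i loved this amazing doctor", "fake"),
               ("the wait was long and parking was limited", "truthful")],
}

if __name__ == "__main__":
    print(leave_one_domain_out(toy))
```

Reporting one accuracy per held-out domain, rather than a single pooled score, is what exposes the cross-domain accuracy drop the abstract refers to; any featurizer or classifier could be swapped in behind the same evaluation loop.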