Abstract:
Online user-generated reviews give consumers abundant information that influences their purchasing decisions across a wide range of products, from daily necessities to entertainment. However, the sheer volume of reviews prevents users from forming a clear picture of a product, since it is impractical to read every review for every item. Existing solutions to information overload on e-commerce sites include estimating the quality of reviews and summarizing the opinions they express. Ranking reviews by quality alone may lead to information redundancy, while review summarization discards the context of the original reviews and results in poor readability. To address these limitations, this paper implements an effective review selection method. We design two opinion extraction algorithms to represent each review: one dictionary- and rule-based, the other LDA-based. We then propose a greedy approach that selects a small set of high-quality reviews for each product while maximizing both attribute coverage and opinion diversity. Experimental results on real datasets show that the proposed method is effective and that, of the two opinion extraction algorithms, the dictionary- and rule-based algorithm outperforms the LDA-based algorithm on the review selection problem.
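To illustrate the idea of greedy review selection, the following is a minimal sketch rather than the paper's actual algorithm. It assumes that each review has already been reduced to a set of (attribute, polarity) pairs by an opinion extraction step and that a quality score per review is available. The function name `greedy_select`, the gain function (quality multiplied by the number of newly covered pairs), and the sample data are all hypothetical choices made for this example.

```python
from typing import Dict, List, Set, Tuple

# An extracted opinion: (attribute, polarity), e.g. ("battery", "positive").
Opinion = Tuple[str, str]


def greedy_select(
    reviews: Dict[str, Set[Opinion]],
    quality: Dict[str, float],
    k: int,
) -> List[str]:
    """Pick up to k reviews, each time taking the largest marginal gain."""
    covered: Set[Opinion] = set()
    selected: List[str] = []
    remaining = set(reviews)

    def gain(rid: str) -> float:
        # Assumed objective: newly covered (attribute, polarity) pairs,
        # weighted by the review's quality score.
        return quality[rid] * len(reviews[rid] - covered)

    while remaining and len(selected) < k:
        # Sorting the ids makes tie-breaking deterministic.
        best = max(sorted(remaining), key=gain)
        if gain(best) <= 0:
            break  # no remaining review adds a new attribute or opinion
        selected.append(best)
        covered |= reviews[best]
        remaining.remove(best)
    return selected


# Hypothetical sample data for one product.
reviews = {
    "r1": {("battery", "positive"), ("screen", "positive")},
    "r2": {("battery", "positive")},
    "r3": {("battery", "negative"), ("price", "negative")},
    "r4": {("screen", "positive"), ("price", "negative")},
}
quality = {"r1": 0.9, "r2": 0.95, "r3": 0.8, "r4": 0.6}

print(greedy_select(reviews, quality, k=2))  # ['r1', 'r3']
```

In this sketch, review r2 has the highest quality score but is never selected, because its only opinion is already covered by r1. This is the redundancy that ranking by quality alone would not avoid. Counting a positive and a negative opinion on the same attribute as distinct pairs is one simple way to reward opinion diversity alongside attribute coverage.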