
Keyword extraction and topic modeling are valuable tools for analyzing Indonesian-language user reviews of Gojek. Keyword extraction reveals user preferences and needs, and topic modeling groups reviews into themes, giving stakeholders information they can use to improve services. This research applies TF-IDF and Latent Dirichlet Allocation (LDA) to text data from Gojek user reviews and feedback. The data span November 5, 2021, to January 2, 2024, and total 225,002 rows, each containing a username, review content, timestamp, and app version; the analysis focuses on the review content. Reviews average 8 words in length, with a maximum of 104 words and a minimum of only a few words. This variability indicates that review length is not normally distributed. The text is preprocessed to maintain the accuracy of the topic analysis. TF-IDF is then used to extract relevant keywords, and LDA is used to model the topics in the reviews. The topic analysis reveals six patterns in Gojek user reviews. The first topic discusses experience, services, and affordable pricing. The second emphasizes app usability and benefits. The third relates to promos, discounts, and vouchers. The fourth reflects positive evaluations of service quality. The fifth, in contrast, highlights high costs and app issues. The sixth underscores overall user satisfaction and service convenience. Evaluation of the topic model yielded a coherence score of 0.509, indicating that the model's topics show a good level of consistency in capturing relevant themes from the Gojek review data. Combining TF-IDF and LDA for Indonesian text analysis, particularly for Gojek user reviews, is an important step toward better understanding and using text data to improve the overall user experience.
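The two techniques named in the abstract can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the toy tokenized reviews are English stand-ins for the Indonesian data, the function names `tfidf` and `lda_gibbs` are hypothetical, TF-IDF uses the plain `tf * log(N / df)` weighting, and LDA is fitted with a small collapsed Gibbs sampler. The abstract does not state which library, weighting variant, or inference method the authors used, so this is not a reproduction of their pipeline.

```python
import math
import random
from collections import Counter

# Hypothetical toy corpus: already tokenized and preprocessed reviews.
docs = [
    ["promo", "voucher", "discount", "app"],
    ["discount", "voucher", "cheap", "app"],
    ["app", "error", "expensive", "slow"],
    ["app", "slow", "error", "crash"],
]


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling; return the count tables."""
    rng = random.Random(seed)
    vocab_size = len({term for doc in docs for term in doc})
    doc_topic = [[0] * n_topics for _ in docs]          # tokens per (doc, topic)
    topic_term = [Counter() for _ in range(n_topics)]   # tokens per (topic, term)
    topic_total = [0] * n_topics                        # tokens per topic
    assign = []                                         # topic of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for term in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_term[z][term] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, term in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_term[z][term] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                probs = [
                    (doc_topic[d][k] + alpha)
                    * (topic_term[k][term] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                # Resample the topic and restore the counts.
                z = rng.choices(range(n_topics), weights=probs)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_term[z][term] += 1
                topic_total[z] += 1
    return doc_topic, topic_term


weights = tfidf(docs)
doc_topic, topic_term = lda_gibbs(docs, n_topics=2)

# Highest-weighted keywords of the first review, then the top terms per topic.
print(sorted(weights[0], key=weights[0].get, reverse=True)[:3])
for k, counts in enumerate(topic_term):
    print(k, [term for term, _ in counts.most_common(3)])
```

In this sketch a term that occurs in every review (here `app`) receives a TF-IDF weight of zero, which is why TF-IDF favors distinctive keywords over ubiquitous ones. Coherence scoring, which the abstract reports as 0.509, is not covered here; on real data a maintained library implementation would be a more practical choice than this hand-written sampler.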
QA76.75-76.765, tf-idf, lda, topic modeling, Information technology, Computer software, word extraction, T58.5-58.64, preferences
