Cross-Cultural Differences in Public Discourse on COVID-19 Vaccination in the United States and South Korea: Cross-Sectional Analysis Using Natural Language Processing
Background: The COVID-19 vaccine was introduced as a crucial tool to combat the pandemic. However, concerns about its effectiveness, its side effects, and the spread of misinformation remain. Prior research has largely relied on survey-based approaches with limited populations. Social media offers a broader, more naturalistic lens into public discourse on COVID-19 vaccination. Accordingly, our study leverages social media data to identify factors shaping vaccine-related information needs, perceptions, and communication dynamics.
Objective: This study investigated public discourse about COVID-19 vaccines on community-driven question-and-answer sites in the United States (Quora; Quora, Inc) and South Korea (Naver Knowledge-iN; Naver Corp) to identify cross-national similarities and differences in vaccine-related information needs, sentiment patterns, and public perceptions over time.
Methods: We analyzed publicly available COVID-19 vaccine–related questions and answers posted between June 27, 2020, and June 27, 2021, on 2 community-driven question-and-answer platforms: Quora (United States) and Naver Knowledge-iN (South Korea). After preprocessing and sample-size matching, the dataset included 3952 question-answer pairs per platform, with one community-selected (most upvoted) answer analyzed per question. Natural language processing (NLP) techniques were applied for topic classification and sentiment analysis. Questions were categorized using a hybrid topic modeling approach combining Latent Dirichlet Allocation (LDA) and Top2Vec, which identified 5 topics on Quora and 7 topics on Naver Knowledge-iN. Answer sentiments were classified using an ensemble of Bidirectional Encoder Representations from Transformers (BERT; Google LLC)– and Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA; Google LLC)–based transformer models, and temporal sentiment trends were examined using monthly aggregation.
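As an illustration of the monthly aggregation step described in the Methods, the following is a minimal sketch, not the study's actual pipeline. It assumes each answer already carries a posting date and a binary sentiment label; the function names `monthly_positive_share` and `crossovers`, the label strings, and the toy records are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def monthly_positive_share(records):
    """Return {(year, month): share of answers labeled 'positive'}.

    `records` is an iterable of (posting_date, label) pairs, where label is
    'positive' or 'negative' (hypothetical label names).
    """
    counts = defaultdict(lambda: [0, 0])  # month -> [positive count, total count]
    for posted, label in records:
        key = (posted.year, posted.month)
        counts[key][0] += label == "positive"
        counts[key][1] += 1
    return {key: pos / total for key, (pos, total) in sorted(counts.items())}


def crossovers(shares):
    """List months whose majority sentiment differs from the previous month's."""
    months = sorted(shares)
    flips = []
    for prev, cur in zip(months, months[1:]):
        if (shares[prev] > 0.5) != (shares[cur] > 0.5):
            flips.append(cur)
    return flips


# Toy, made-up records for illustration only (not study data).
toy = [
    (date(2021, 1, 5), "negative"),
    (date(2021, 1, 20), "negative"),
    (date(2021, 1, 25), "positive"),
    (date(2021, 2, 3), "positive"),
    (date(2021, 2, 14), "positive"),
    (date(2021, 3, 1), "positive"),
    (date(2021, 3, 9), "negative"),
    (date(2021, 3, 11), "negative"),
]
shares = monthly_positive_share(toy)
print(shares)
print(crossovers(shares))
```

Under these assumptions, a month counts as a crossover when the positive share moves from one side of 50% to the other relative to the preceding month.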
Results: Five shared information needs emerged: effects of vaccines, variants, government policy, visiting overseas, and different vaccines. South Korea uniquely exhibited 2 additional topics: vaccination appointments (711/3952, 18%) and school and education (513/3952, 13%). Negative sentiment predominated in US (Quora) answers across 4 of 5 topics, whereas positive sentiment exceeded 50% (498/790, 337/474, 367/592, 218/316, 348/553, 562/711, and 364/513) across all 7 topics on Naver Knowledge-iN. Temporally, US sentiment exhibited multiple positive-negative crossovers, whereas Korean sentiment stabilized toward positivity after February 2021, coinciding with the national vaccine rollout. Question-answer sentiment pairs showed contrasting interaction patterns: negative-negative pairs dominated in the United States (eg, 504/978, 51.5% for different vaccines), whereas in South Korea, positive-positive and negative-positive pairs accounted for more than 63% (498/790, 337/474, 367/592, 218/316, 348/553, 562/711, and 364/513) of interactions in the 7 topics, with positive-positive pairs most prevalent in 6 of the 7 topics (all except variants).
Conclusions: Public perceptions of COVID-19 vaccines and related information needs differ between the 2 countries, shaped by cultural context, trust in government, and information-seeking environments. Analysis of social question-and-answer data from the 2 countries reveals shared information needs but divergent sentiment patterns. These findings highlight the value of social media data for public health research and the need for culture- and platform-specific communication strategies.
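The proportions quoted in the Results can be recomputed from the reported counts. The sketch below is only an arithmetic check, assuming the per-platform sample of 3952 question-answer pairs stated in the Methods is the denominator for the Korea-specific topic shares; the helper `pct` and the variable names are hypothetical.

```python
def pct(numerator, denominator, digits=1):
    """Percentage of numerator over denominator, rounded to `digits` places."""
    return round(100 * numerator / denominator, digits)


SAMPLE_SIZE = 3952  # question-answer pairs per platform, as stated in the Methods

# Topics reported only for Naver Knowledge-iN, with their question counts.
korea_only_topics = {"vaccination appointments": 711, "school and education": 513}
for topic, n in korea_only_topics.items():
    print(f"{topic}: {n}/{SAMPLE_SIZE} = {pct(n, SAMPLE_SIZE)}%")

# The 7 per-topic fractions listed for Naver Knowledge-iN in the Results.
naver_counts = [
    (498, 790), (337, 474), (367, 592), (218, 316),
    (348, 553), (562, 711), (364, 513),
]
naver_pcts = [pct(k, n) for k, n in naver_counts]
print(naver_pcts)

# Negative-negative pairs for the "different vaccines" topic on Quora.
print(pct(504, 978))
```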