Analyzing the Frame Characteristics of Twitter News Using Supervised Machine Learning: Focusing on an Automated Method for Identifying COVID-19 News Frames
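Korean Abstract
This study used supervised machine learning to analyze news coverage of COVID-19 in terms of the severity, susceptibility, benefits, and barriers frames based on the health belief model. In particular, it used the linguistic features of each news frame identified in article headlines and leads to classify the frames appearing in news outlets' Twitter posts, and thereby inferred the influence of social media on news coverage. The data were articles and tweets about COVID-19 published from January 20, 2020 to January 19, 2021 by four major national dailies in South Korea: Chosun Ilbo, JoongAng Ilbo, Kyunghyang Shinmun, and Hankyoreh. For automated frame classification with supervised machine learning, model accuracy was validated on a random sample of 2,000 articles, and the model was then applied to a sample of 2,000 tweets to evaluate prediction accuracy and examine differences in linguistic features. The results showed that the perceived threat frames of severity and susceptibility were more prominent in the coverage than the behavioral evaluation frames of benefits and barriers, so that risk perception of COVID-19 was emphasized over the costs and benefits of infection-prevention behavior. However, the prediction accuracy of the machine learning models was notably lower for the severity and barriers frames than for the susceptibility and benefits frames, indicating that the linguistic features of those frames are relatively irregular and diverse. Furthermore, on Twitter, in line with the logic of news diffusion on social media, which rests on user participation, frames were constructed in a more flexible and differentiated manner, and emotional expressions at the personal level appeared frequently in the frames. The significance of this study is that it used machine learning to analyze frames in a large volume of news articles and derived the features of framing language through a transparent and reproducible automated method rather than through the subjective interpretation of human coders.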
Abstract
Using supervised machine learning, this study examined news coverage of COVID-19 in terms of four news frames drawn from the health belief model: severity, susceptibility, benefits, and barriers. In particular, the linguistic features of each frame were derived automatically from the headlines and leads of news articles and then applied to identify the frames in Twitter messages posted by major newspapers in South Korea. The data comprised news articles and tweets about COVID-19 published between January 20, 2020 and January 19, 2021 by Chosun Ilbo, JoongAng Ilbo, Kyunghyang Shinmun, and Hankyoreh. To identify news frames automatically, we trained support vector machine (SVM) and naïve Bayes (NB) classifiers and evaluated their accuracy in classifying each frame in 2,000 randomly sampled articles. The best-performing classifier was then applied to 2,000 randomly sampled tweets to evaluate the prediction accuracy for each frame and to reveal the distinctive linguistic features of each frame on Twitter. The findings showed that the perceived threat frames of severity and susceptibility were emphasized in the coverage to a greater extent than the behavioral evaluation frames of benefits and barriers, indicating that risk perception of COVID-19 was prioritized over the costs and benefits of preventive behavior. However, model accuracy was lower for the severity and barriers frames than for the susceptibility and benefits frames, suggesting that the former were constructed from less consistent and less distinct linguistic features. Furthermore, news frames on Twitter were constructed in a more flexible and differentiated manner, in that the logic of social media, where news spreads through user engagement, encourages more personalized and emotional language. This study sets out a methodology in which machine learning is used to code news frames in large-scale news coverage of COVID-19 and to identify the features of framing language in an automated, transparent, and reproducible way.
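The train-on-articles, apply-to-tweets workflow described above can be illustrated with a minimal sketch. It is not the authors' implementation: it hand-rolls a multinomial naïve Bayes classifier with add-one smoothing (only one of the two algorithms the study compares), the headlines and tweets are invented English toy examples rather than the study's Korean data, and the names `TRAIN`, `TWEETS`, `train_nb`, `predict_nb`, and `accuracy` are hypothetical. Whitespace tokenization is also a simplifying assumption; Korean text would normally need morphological analysis first.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy headlines labeled with the four health-belief-model frames.
# Invented for illustration only; these are not the study's data.
TRAIN = [
    ("death toll rises as hospitals overwhelmed", "severity"),
    ("patients in critical condition death count climbs", "severity"),
    ("elderly and chronic patients at higher risk of infection", "susceptibility"),
    ("cluster infection spreads risk grows for commuters", "susceptibility"),
    ("masks and distancing reduce transmission experts say", "benefits"),
    ("vaccine protects and helps reduce transmission", "benefits"),
    ("mask shortage and cost burden hinder prevention", "barriers"),
    ("small businesses struggle with distancing cost burden", "barriers"),
]

# Hypothetical held-out "tweets" used to check how well the model transfers.
TWEETS = [
    ("death toll climbs", "severity"),
    ("elderly commuters at risk", "susceptibility"),
    ("distancing helps reduce transmission", "benefits"),
    ("mask shortage cost burden", "barriers"),
]


def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace.
    return text.lower().split()


def train_nb(examples):
    """Count documents per frame and word occurrences per frame."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_docs[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_docs, word_counts, vocab


def predict_nb(model, text):
    """Return the frame with the highest smoothed log-posterior score."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    scores = {}
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)  # log prior
        total = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # skip words never seen in training
            # Add-one (Laplace) smoothed word likelihood.
            score += math.log((word_counts[label][tok] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(model, examples):
    """Share of examples whose predicted frame matches the label."""
    hits = sum(predict_nb(model, text) == label for text, label in examples)
    return hits / len(examples)


if __name__ == "__main__":
    model = train_nb(TRAIN)
    for text, _ in TWEETS:
        print(text, "->", predict_nb(model, text))
    print("tweet accuracy:", accuracy(model, TWEETS))
```

In this sketch the per-frame word counts play the role of the "linguistic features" of each frame, and a drop in `accuracy` when moving from articles to tweets would correspond to the looser, more varied frame language the abstract reports for Twitter.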
Keywords:
Automated Frame Analysis, Supervised Machine Learning, Twitter News, COVID-19, Health Belief Model
Acknowledgments
This research was supported by the Chung-Ang University Research Scholarship Grants in 2020.