Research Projects

I study collective and communicative practices on social media and how they uphold or challenge dominant narratives, power hierarchies, and hegemonic ideologies, with a specific focus on race and ethnicity. I take an interdisciplinary approach to my research; currently, most of my work investigates online communities and their racial discourses computationally, but I also enjoy working on my research questions qualitatively and conceptually.


I have accumulated rich experience through my previous academic and professional work. During my undergraduate studies, my research mainly focused on library science. During my master's program, I conducted numerous independent research projects with a focus on tracking, modeling, and understanding online communities using data mining approaches. After graduation, my job as a data scientist at non-profit research organizations and universities gave me unique exposure to social science research, such as public health and criminal justice.

Ph.D. Period (2023-Present)

Digitized Genome, Detailed Maps, and Dazzling Selfies: Platformed Racial Spectacle and DNA Test Results Image Sharing on Social Media

Ph.D. Student

  • Examined the collective practices of sharing images of “direct-to-consumer” (DTC) genetic ancestry test (GAT) results on social media, specifically the strategic, selective, and structural coupling of DNA charts, ancestry maps, and selfies
  • Proposed the concept of “platformed racial spectacle”, which fabricates mythical connections between race, DNA, bodily traits, and geographies
  • Explicated the emergence and existence of platformed racial spectacle, unfurled its texture and format, and expounded its functions of unification and alteration
  • DNA Test Results, r/23andMe, Spectacle, Selfies, Biological Racism

    Subversive Humor and Platformed Asianness: Resisting West Homogenization and Asian Ultranationalism through Satirical Neologism

    Ph.D. Student

  • Analyzed the neologisms created on the subreddit r/2Asia4u, exploring the resistance and identity work performed through satire and mediated by the coining of new lexicons
  • Unveiled the major patterns of neologism creation, elucidating specific morphological processes, their pragmatic use, and the context of online Asian discourses
  • Demonstrated a mixed-methods approach to analyzing the identity work of marginalized groups, combining quantitative network analysis with qualitative coding
  • Subversive Humor, Platformed Asianness, Neologism, Linguistic Analysis

    Locating the Asymmetry in Information Flow between Local and National Media on Transgender Discourses

    Ph.D. Student

  • Analyzed the interplay between national and local news coverage of transgender issues by combining causal inference methods and critical discourse analysis
  • Detected whether, and how, transgender discourses spread across local and national media
  • Intermedia Agenda Setting, Transgender Issues, News Ecosystem, Local News

    How Social Responses to Online Hate Messages Affect Hatefulness

    Ph.D. Student

  • Explored the premise that the toxicity of hate messaging is affected by social interactions among social media hate posters
  • Analyzed hateful posts on Gab, along with the Likes, Dislikes, and written replies from other users that affirmed or negated the original posts
  • Hate Speech, Social Approval Theory, Need-Threat Theory, Gab

    “Just check OP’s post history”: Does Post History Matter for Online Anti-Social Behaviors?

    Ph.D. Student

  • Theorized the link between post history and online hate, drawing on the Hyperpersonal Communication model and Hypernegative Dynamics
  • Analyzed the effects of calling out one's post history on the evolution of online communications
  • Tested the potential connections between a callout of one's post history and online anti-social behaviors
  • Post History, Hyperpersonal Communication, Computer-mediated Communication, Online Anti-social Behaviors

    Understanding the Roles and Scenarios in Self-Disclosure of Everyday Experiences of Racism on Social Media

    Data Scientist/Ph.D. Student

  • Scraped all historical submissions and comments from r/racism
  • Proposed and developed a framework for systematically characterizing experiences of racism using Construal Level Theory and the Agent-Structural framework
  • Conducted an in-depth analysis of the content and intention of experiences of racism based on the proposed framework
  • Social Media, Computer Mediated Communication, Thematic Analysis, Qualitative Research

Post-Master Working Professional Period (2020-2023)

    Neighborhood Racial Composition, Built Environment, and Health Outcomes

    Data Scientist

  • Performed clustering analysis with HDBSCAN and CLARA to discover the groupings of tracts based on the built environment indicators
  • Conducted exploratory factor analysis to identify latent factor structure across all built environment variables
  • Performed multi-level regression and mediation analysis to identify racial disparities in the built environment and the extent to which these disparities influence health outcomes
  • Built Environment, Racial Disparities in Health, Multi-level Regression, Mediation Analysis

    Structural Racism in COVID-19 Vaccine Distribution

    Data Scientist

  • Developed and standardized the data wrangling pipeline that scraped data from public-facing websites, converted PDFs to machine-readable tables, formatted and geocoded addresses, and merged the results with census data
  • Performed secondary regression diagnostics and developed visualizations for the publication
  • Participated in paper writing and assisted with paper revisions and resubmission
  • Structural Racism, COVID-19 Vaccine, Vaccine Distribution, Regressions, Geocoding

    The Impact of Active Bystander Training on Officer Confidence and Ability to Address Ethical Challenges

    Research Data Scientist

  • Conducted exploratory data analysis of the survey responses with Exploratory Factor Analysis and Canonical Correlation in R
  • Generated static and interactive data visualizations in Dash and participated in paper writing
  • Active Bystander Training, Ethical Challenges, Intervention, Visual Analytics

    Police Union Contracts and Police Accountability

    Research Data Scientist

  • Supervised the qualitative data collection, including MTurk worker management and double-coding review for 100 cities' police union contracts
  • Performed clustering on the coded documents to identify general patterns of contract items across cities
  • Analyzed LexisNexis news data with NLP techniques in Python to identify the progress of police union contract negotiations
  • Police Union Contract, Police Accountability, NLP, News Analysis

    Review of the LAPD Response to First Amendment Assemblies and Protests

    Research Data Scientist

  • Identified policing hotspots and community opinion leaders with spatial-temporal clustering and tweet analysis, and developed interactive dashboards
  • Scraped Twitter data and helped the team identify key opinion leaders for interviews and key hashtags during the protests
  • Developed an interactive E-report using ArcGIS Story Map, which was later presented to the LA City Council
  • After-action Review, Police Brutality, Spatial-temporal Clustering, Visual Analytics

Master's Period (2018-2020)

    A Picture with A Thousand Words: Understanding the Text-Embedded Images on Pinterest

    Independent Study

  • Scraped 0.5 million random images and their metadata from Pinterest and extracted the text in the images using OCR techniques in Python
  • Constructed Generalized Linear Models (Negative Binomial, Logit) to find the relationship between dominant color, the probability of an image containing text, and image popularity
  • Built an Author-Topic Model to find the topic distribution of the embedded text across all categories of images
  • Conducted one-mode projection of the bipartite network and community detection to discover similar image categories
  • Social Media, Computer Mediated Communication, Text-Image Integration, Bipartite Mining, Point Pattern Analysis

    Friends or Foes: Understanding Communication and Interaction Patterns of Homogenous and Cross-Cutting Spaces in Online Activisms

    Master Thesis

  • Classified the standpoints of tweets and users with N-grams, Doc2Vec, and Word2Vec embeddings using Logistic Regression and Neural Network classifiers
  • Conducted non-parametric significance tests to understand political homophily, linguistic and argumentative style, and hyperlink usage when users interact with people holding different standpoints
  • Mined subgraphs of the network by counting triplets in the graphs and compared them with simulated graphs such as Erdős-Rényi (ER) graphs
  • Social Media, Communication Structure, Classification, Subgraph Mining, Linguistic Style Analysis

    “Musicalization of the Culture”: Is Music Becoming Louder, More Repetitive and Monotonous?

    Independent Research

  • Validated a sociomusicological theory about mainstream music with data mining techniques
  • Collected, wrangled, and merged lyrical and acoustic data from existing music datasets and Spotify’s API
  • Proposed and evaluated novel metrics for measuring repetitiveness using closed itemset mining, improving on the previous compression-based approach to computing lyric repetitiveness
  • Performed Time-Series analysis with trend detection and Time-lagged Cross Correlation (TLCC) analysis
  • Music Analytics, Text Mining, Time-Series Analysis, Itemset Mining, Repetitiveness

    Sleeping Beauties in Music: Predicting Sleeper Hits with Lyrical and Acoustic Features

    Class Teamwork Project

  • Constructed a comprehensive music dataset by scraping, cleaning, and merging data from the Spotify API and the MetroLyrics and Billboard datasets
  • Proposed a novel metric, 'Beauty Coefficient', to measure the sleeper-hit level of a song using chart data
  • Built a Huber regression model to predict sleeper hits using word embedding features and acoustic data
  • Music Analytics, Text Mining, Regression, Sleeper Hits, Visual Analytics

    When Power Goes Wild Online: How Did A Voluntary Moderator's Abuse of Power Affect an Online Community?

    Independent Research

  • Conducted an event study of an infamous excessive moderation event on Reddit using time-series analysis with intervention, breakpoint detection, and social network analysis in R
  • Scraped and cleaned historical posts and comments from the subreddits r/xkcd and r/xkcdcomic
  • Connected the data analysis results with sociological theories, such as social movement theories and community choice theories
  • Power Abuse, Voluntary Moderation, Reddit, Online Community, Time-series Analysis, Social Network Analysis

    Dumping the Closet Skeletons Online: Exploring the Guilty Information Disclosure Behavior on Social Media

    Independent Research

  • Retrieved and cleaned 225k posts and comments from a Reddit forum and visualized the topics generated with LDA using Python
  • Constructed a thematic model capturing the intents, contents, and interactions of guilty information sharing behavior on social media with sampled data
  • Guilty Information Disclosure, Online Confession, Reddit, Privacy Paradox, Mixed Methodology, Semantic Model, LDA

Undergraduate Period (2014-2018)

    Fortunes on Fingertips: Research on the Online Praying Behaviors on Chinese Social Media, Focusing on the Retweets of “Koi Fish”

    Independent Research

  • Collected, cleaned, and visualized 110k+ social media posts from a Chinese microblogging website using Python, OpenRefine, and Gephi to investigate post-spreading patterns, user characteristics, social networks, and text content
  • Conducted a confirmatory factor analysis using questionnaire responses to explore the incentives for online praying behavior
  • Online Praying, Social Media, Koi Fish, Microblogs, Structural Equation Model, Social Network Analysis

    Research on the Construction of Library Technologies’ Performance Appraisal Index System and Its Empirical Study - From the Perspective of Life Cycle

    Research Assistant

  • Constructed qualitative and quantitative evaluation indexes for the technologies applied in libraries
  • Explored the theoretical logic, practical difficulties, and future paths of using artificial intelligence in libraries in China
  • Wrote research proposals for funding applications and designed the research roadmap
  • Library Technologies, Performance Appraisal, Technology Life Cycle, AI in Library, Theoretical Research

    Research on the Strategy of Cooperation and Information Resource Sharing among the Public Libraries in Xi'an Serving for Mass Innovation

    Research Assistant

  • Analyzed the data with SPSS Statistics, primarily with Chi-square Test and Rank-sum Test
  • Described the status quo of Xi’an citizens’ information needs and real-life innovation
  • Proposed suggestions for public libraries to promote mass innovation based on the data analysis
  • Information Need, Public Libraries, Information Resource Sharing

    Research on the Library Service Innovation Based on Social Media Users’ Information Behaviors

    Research Team Leader

  • Investigated the online service effectiveness of libraries’ WeChat official accounts by conducting a comparative test of information retrieval, browsing, and interaction with 10 representatives
  • Analyzed strategies for applying WeChat Mini-Programs in libraries using the SWOT-AHP method
  • Explored the user, information, and environment factors influencing information serendipity on social media with the Partial Least Squares Regression (PLS) method
  • Social Media, Library Service, Information Behavior, Serendipity, WeChat, SWOT-AHP