Representation of Professions in Entertainment Media

Insights into Frequency and Sentiment Trends

Project Description

Media acts as a mirror to society. Societal trends and culture dictate media narratives, which in turn inform and mould our perception of the real world. We study how professions are represented in media and analyze trends in their mention frequency and sentiment. We find that frequently mentioned professions employ more people.

Short Demo

Profession Taxonomy

The profession taxonomy has three tiers: Standard Occupational Classification (SOC) groups, WordNet synsets, and professions. The outermost circles are the SOC groups. Click a SOC circle to zoom in on its synsets, and click a synset circle to zoom in on its professions. Click outside a circle to zoom out.

Subtitle Corpus

3.3M Mentions (Professional Mentions)

133K IMDb titles (Movies and TV Shows)

68 Years (Time Range)

Method

We expand the SOC taxonomy by finding the noun WordNet synsets of the SOC professions and including their synonyms and hyponyms. We then map the professions and synsets in the expanded taxonomy to the major SOC groups.
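The expansion step can be sketched as a graph traversal. The example below is a minimal illustration, not the project's code: `TOY_WORDNET` is a small hand-written stand-in for WordNet, `SOC_SEEDS` is a hypothetical seed mapping, and the names `expand` and `build_taxonomy` are assumptions made for this sketch.

```python
from collections import deque

# Toy stand-in for WordNet: synset id -> (lemmas/synonyms, direct hyponyms).
# Entries are illustrative only, not taken from the project's taxonomy.
TOY_WORDNET = {
    "doctor.n.01": (["doctor", "physician"], ["surgeon.n.01", "specialist.n.02"]),
    "surgeon.n.01": (["surgeon", "operating surgeon"], []),
    "specialist.n.02": (["specialist", "medical specialist"], ["cardiologist.n.01"]),
    "cardiologist.n.01": (["cardiologist", "heart specialist"], []),
    "lawyer.n.01": (["lawyer", "attorney"], ["prosecutor.n.01"]),
    "prosecutor.n.01": (["prosecutor", "public prosecutor"], []),
}

# Hypothetical seeds: SOC major group -> synsets of professions listed in SOC.
SOC_SEEDS = {
    "Healthcare Practitioners and Technical Occupations": ["doctor.n.01"],
    "Legal Occupations": ["lawyer.n.01"],
}


def expand(seed_synset):
    """Return the seed synset and all of its transitive hyponyms (breadth-first)."""
    seen, queue = [], deque([seed_synset])
    while queue:
        synset = queue.popleft()
        if synset in seen:
            continue
        seen.append(synset)
        queue.extend(TOY_WORDNET[synset][1])
    return seen


def build_taxonomy(soc_seeds):
    """Build the three tiers: SOC group -> synset -> profession names."""
    taxonomy = {}
    for group, seeds in soc_seeds.items():
        taxonomy[group] = {}
        for seed in seeds:
            for synset in expand(seed):
                taxonomy[group][synset] = list(TOY_WORDNET[synset][0])
    return taxonomy


taxonomy = build_taxonomy(SOC_SEEDS)
```

Each hyponym inherits the SOC group of the seed synset it was reached from, which is one simple way to realize the mapping to major groups; the actual mapping procedure may differ.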

We search for mentions of job titles in the OpenSubtitles corpus. If the predicted sense of a job-title phrase does not belong to the synsets of our expanded taxonomy, we ignore the mention. We also remove mentions that are names of persons cast in the corresponding movie or TV show. The resulting corpus of professional mentions is automatically annotated with targeted sentiment labels (positive, negative, or neutral) using a BERT-based sentiment analysis model that we trained on human-annotated subtitle sentences.
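The two filtering rules can be sketched as follows. This is an illustration under stated assumptions: the `Mention` record, the `filter_mentions` function, the synset identifiers, and the token-level matching against cast names are all hypothetical, and the word-sense prediction itself is assumed to have been done upstream.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    title_id: str          # identifier of the movie or TV show (hypothetical field)
    phrase: str            # job-title phrase found in the subtitle
    predicted_synset: str  # sense predicted by an upstream disambiguation step


def cast_tokens(cast_names):
    """Lowercased name tokens of the cast, e.g. 'Tom Baker' -> {'tom', 'baker'}."""
    return {token.lower() for name in cast_names for token in name.split()}


def filter_mentions(mentions, taxonomy_synsets, cast_by_title):
    """Keep mentions whose sense is in the taxonomy and that are not cast names."""
    kept = []
    for mention in mentions:
        # Rule 1: the predicted sense must belong to the expanded taxonomy.
        if mention.predicted_synset not in taxonomy_synsets:
            continue
        # Rule 2 (assumed matching scheme): drop phrases that coincide with
        # a name token of someone cast in the same title.
        names = cast_tokens(cast_by_title.get(mention.title_id, []))
        if mention.phrase.lower() in names:
            continue
        kept.append(mention)
    return kept


mentions = [
    Mention("t1", "judge", "judge.n.01"),
    Mention("t1", "Baker", "baker.n.01"),    # matches cast member "Tom Baker"
    Mention("t2", "pilot", "pilot.n.99"),    # illustrative non-profession sense
    Mention("t2", "baker", "baker.n.01"),
]
kept = filter_mentions(
    mentions,
    taxonomy_synsets={"judge.n.01", "baker.n.01"},
    cast_by_title={"t1": ["Tom Baker"], "t2": ["Jane Doe"]},
)
```

Here the "Baker" mention in title `t1` is dropped because it matches a cast name, while the same word in `t2` is kept.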

We compute the frequency and sentiment trends of the 500 most frequently occurring professions over time, and study the effect of media attributes, such as genre, title type, and country of production, on these trends. We also study the correlation between mention frequency and employment for the major SOC groups, and observe that professions that employ more people are also mentioned more often in media content.
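A frequency trend and a frequency-employment correlation can be sketched as below. The source does not state how frequencies are normalized or which correlation measure is used, so the per-year share and the Pearson coefficient here are assumptions, as are the function names and the toy data.

```python
import math
from collections import Counter, defaultdict


def yearly_frequency(mentions):
    """mentions: list of (year, profession) pairs.

    Returns {profession: {year: share of that year's mentions}}.
    Normalizing by the yearly total is an assumption of this sketch.
    """
    totals = Counter(year for year, _ in mentions)
    counts = Counter(mentions)
    trend = defaultdict(dict)
    for (year, profession), n in counts.items():
        trend[profession][year] = n / totals[year]
    return dict(trend)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Toy data, for illustration only.
mentions = [(2000, "doctor"), (2000, "doctor"), (2000, "lawyer"), (2001, "lawyer")]
trend = yearly_frequency(mentions)

# Hypothetical per-group mention counts and employment figures.
group_mentions = [120, 300, 80]
group_employment = [1.0, 2.5, 0.7]
r = pearson(group_mentions, group_employment)
```

A value of `r` close to 1 would indicate that groups employing more people are also mentioned more often.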

Visualization

Trends

For more information, see our arXiv paper.