We develop the models, data, and statistical analyses necessary to uncover, at scale, the pervasiveness of stereotypical portrayals in characters’ actions.
Our goal is to understand how exposure to media content relates to accepted social norms and expectations, using large-scale quantitative measurements of characters’ attributes (e.g., age, gender, or race).
Action descriptions
Distinct actions
Identified characters.
Our proposed semantic role labeling (SRL) system. Starting at the bottom, the input to the system is an action description in natural language. The output, shown at the top of the figure, is a sequence of labels (one per word). Each label indicates whether the word plays the role of action, agent, patient, or none. From its input, our model obtains a highly contextualized representation for each word using the BERT transformer. Each representation is a high-dimensional dense vector that encodes the semantics of the word and its context within the sentence. The sequence of vector representations is then fed into a recurrent neural network and a softmax layer for sequence labeling. As a post-processing step, a set of heuristics aggregates multi-word expressions to handle groups of agents or patients.
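The post-processing step described above can be illustrated with a minimal sketch. The heuristics themselves are not specified in the text, so the rule used here (merge adjacent words that share the same role label into one span) is an assumption, and the function name `aggregate_spans` is hypothetical.

```python
from typing import List, Tuple

# Role labels named in the text; "none" marks words with no role.
ROLES = {"action", "agent", "patient"}


def aggregate_spans(words: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Merge runs of adjacent words sharing a role into (role, phrase) pairs.

    Assumed heuristic, not the authors' exact rule set.
    """
    if len(words) != len(labels):
        raise ValueError("expected exactly one label per word")
    spans: List[Tuple[str, str]] = []
    current_role, current_words = None, []
    for word, label in zip(words, labels):
        # Extend the open span while the role label repeats.
        if label == current_role and label in ROLES:
            current_words.append(word)
            continue
        # Close the previous span if it carried a role.
        if current_role in ROLES:
            spans.append((current_role, " ".join(current_words)))
        current_role, current_words = label, [word]
    # Close the final span, if any.
    if current_role in ROLES:
        spans.append((current_role, " ".join(current_words)))
    return spans


# A group of agents ("John and Mary") labeled word by word.
words = ["John", "and", "Mary", "push", "the", "door"]
labels = ["agent", "agent", "agent", "action", "patient", "patient"]
spans = aggregate_spans(words, labels)
```

In this sketch the three agent-labeled words collapse into a single agent span, which is the kind of grouping the caption describes for multi-word agents or patients.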
We propose a statistical model to identify significant differences in the frequency of action portrayals attributable to the role and gender of their participants. We use a Poisson-regression generalized linear mixed model (GLMM). GLMMs extend generalized linear models (e.g., logistic regression) to include both fixed and random effects.
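One plausible form of such a model is written out below. The text does not state which terms are fixed and which are random, so the specification here is an assumption: gender, role, and their interaction are treated as fixed effects, and each action receives a random intercept. The symbols $y_{a,g,r}$, $\beta$, $u_a$, and $\sigma_u^2$ are introduced only for illustration.

```latex
% y_{a,g,r}: number of portrayals of action a with a participant of
%            gender g in role r (agent or patient)
\begin{align}
  y_{a,g,r} &\sim \mathrm{Poisson}(\lambda_{a,g,r}), \\
  \log \lambda_{a,g,r} &= \beta_0
      + \beta_1\,\mathrm{gender}_g
      + \beta_2\,\mathrm{role}_r
      + \beta_3\,\mathrm{gender}_g \cdot \mathrm{role}_r
      + u_a, \\
  u_a &\sim \mathcal{N}(0, \sigma_u^2).
\end{align}
```

Under this assumed specification, a significant interaction coefficient $\beta_3$ would indicate that the effect of gender on an action's frequency depends on whether the character is the agent or the patient.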
This work presents a novel large-scale analysis of the actions taken by characters and of how these actions relate to gender biases in media.