Benjamin Ma, Timothy Greer, Dillon Knox, and Shrikanth Narayanan. A computational lens into how music characterizes genre in film. PLOS ONE, 16(4):1–14, Public Library of Science, 04 2021.
Film music varies tremendously across genre in order to bring about different responses in an audience. For instance, composers may evoke passion in a romantic scene with lush string passages or inspire fear throughout horror films with inharmonious drones. This study investigates such phenomena through a quantitative evaluation of music that is associated with different film genres. We construct supervised neural network models with various pooling mechanisms to predict a film’s genre from its soundtrack. We use these models to compare handcrafted music information retrieval (MIR) features against VGGish audio embedding features, finding similar performance with the top-performing architectures. We examine the best-performing MIR feature model through permutation feature importance (PFI), determining that mel-frequency cepstral coefficient (MFCC) and tonal features are most indicative of musical differences between genres. We investigate the interaction between musical and visual features with a cross-modal analysis, and do not find compelling evidence that music characteristic of a certain genre implies low-level visual features associated with that genre. Furthermore, we provide software code to replicate this study at https://github.com/usc-sail/mica-music-in-media. This work adds to our understanding of music’s use in multi-modal contexts and offers the potential for future inquiry into human affective experiences.
@article{BenMa-journal.pone.0249957,
  doi = {10.1371/journal.pone.0249957},
  bib2html_rescat = {mica},
  author = {Ma, Benjamin and Greer, Timothy and Knox, Dillon and Narayanan, Shrikanth},
  journal = {PLOS ONE},
  publisher = {Public Library of Science},
  title = {A computational lens into how music characterizes genre in film},
  year = {2021},
  month = {04},
  volume = {16},
  url = {https://doi.org/10.1371/journal.pone.0249957},
  pages = {1-14},
  abstract = {Film music varies tremendously across genre in order to bring about different responses in an audience. For instance, composers may evoke passion in a romantic scene with lush string passages or inspire fear throughout horror films with inharmonious drones. This study investigates such phenomena through a quantitative evaluation of music that is associated with different film genres. We construct supervised neural network models with various pooling mechanisms to predict a film’s genre from its soundtrack. We use these models to compare handcrafted music information retrieval (MIR) features against VGGish audio embedding features, finding similar performance with the top-performing architectures. We examine the best-performing MIR feature model through permutation feature importance (PFI), determining that mel-frequency cepstral coefficient (MFCC) and tonal features are most indicative of musical differences between genres. We investigate the interaction between musical and visual features with a cross-modal analysis, and do not find compelling evidence that music characteristic of a certain genre implies low-level visual features associated with that genre. Furthermore, we provide software code to replicate this study at https://github.com/usc-sail/mica-music-in-media. This work adds to our understanding of music’s use in multi-modal contexts and offers the potential for future inquiry into human affective experiences.},
  number = {4}
}