Benjamin Ma, Timothy Greer, Dillon Knox, and Shrikanth Narayanan. A computational lens into how music characterizes genre in film. PLOS ONE, 16(4):1–14, Public Library of Science, April 2021.

Abstract

Film music varies tremendously across genre in order to bring about different responses in an audience. For instance, composers may evoke passion in a romantic scene with lush string passages or inspire fear throughout horror films with inharmonious drones. This study investigates such phenomena through a quantitative evaluation of music that is associated with different film genres. We construct supervised neural network models with various pooling mechanisms to predict a film’s genre from its soundtrack. We use these models to compare handcrafted music information retrieval (MIR) features against VGGish audio embedding features, finding similar performance with the top-performing architectures. We examine the best-performing MIR feature model through permutation feature importance (PFI), determining that mel-frequency cepstral coefficient (MFCC) and tonal features are most indicative of musical differences between genres. We investigate the interaction between musical and visual features with a cross-modal analysis, and do not find compelling evidence that music characteristic of a certain genre implies low-level visual features associated with that genre. Furthermore, we provide software code to replicate this study at https://github.com/usc-sail/mica-music-in-media. This work adds to our understanding of music’s use in multi-modal contexts and offers the potential for future inquiry into human affective experiences.
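The abstract's feature analysis rests on permutation feature importance (PFI): shuffle one feature column at a time and measure how much the model's score drops. The sketch below is a generic, illustrative implementation of that idea, not the authors' code from the linked repository; the toy model, features, and labels are stand-ins.

```python
import numpy as np

def permutation_importance(score_fn, X, y, n_repeats=5, seed=0):
    """Mean drop in score when each feature column is shuffled.

    score_fn(X, y) -> float, higher is better (e.g. accuracy).
    Returns one mean score drop per feature; larger = more important.
    """
    rng = np.random.default_rng(seed)
    baseline = score_fn(X, y)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])  # break the link between feature j and y
            drops.append(baseline - score_fn(Xp, y))
        importances[j] = np.mean(drops)
    return importances

# Toy setup: a "model" that predicts the sign of feature 0; feature 1 is noise.
X = np.random.default_rng(1).normal(size=(200, 2))
y = (X[:, 0] > 0).astype(int)
acc = lambda X, y: np.mean((X[:, 0] > 0).astype(int) == y)
imp = permutation_importance(acc, X, y)
# imp[0] should be much larger than imp[1].
```

In the paper's setting the columns would be MIR features (e.g. MFCCs, tonal descriptors) and the score a genre-classification metric; the mechanics are the same.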

BibTeX Entry

@article{BenMa-journal.pone.0249957,
    doi = {10.1371/journal.pone.0249957},
    author = {Ma, Benjamin and Greer, Timothy and Knox, Dillon and Narayanan, Shrikanth},
    journal = {PLOS ONE},
    publisher = {Public Library of Science},
    title = {A computational lens into how music characterizes genre in film},
    year = {2021},
    month = {04},
    volume = {16},
    url = {https://doi.org/10.1371/journal.pone.0249957},
    pages = {1--14},
    abstract = {Film music varies tremendously across genre in order to bring about different responses in an audience. For instance, composers may evoke passion in a romantic scene with lush string passages or inspire fear throughout horror films with inharmonious drones. This study investigates such phenomena through a quantitative evaluation of music that is associated with different film genres. We construct supervised neural network models with various pooling mechanisms to predict a film’s genre from its soundtrack. We use these models to compare handcrafted music information retrieval (MIR) features against VGGish audio embedding features, finding similar performance with the top-performing architectures. We examine the best-performing MIR feature model through permutation feature importance (PFI), determining that mel-frequency cepstral coefficient (MFCC) and tonal features are most indicative of musical differences between genres. We investigate the interaction between musical and visual features with a cross-modal analysis, and do not find compelling evidence that music characteristic of a certain genre implies low-level visual features associated with that genre. Furthermore, we provide software code to replicate this study at https://github.com/usc-sail/mica-music-in-media. This work adds to our understanding of music’s use in multi-modal contexts and offers the potential for future inquiry into human affective experiences.},
    number = {4}
}

Generated by bib2html.pl (written by Patrick Riley) on Fri Oct 01, 2021 10:50:38