Matteo Ravasio

University of Auckland

Abstract

In this paper I discuss Stephen Davies’s defence of literalism about emotional descriptions of music. According to literalism, a piece of music literally possesses the expressive properties we attribute to it when we describe it as ‘sad’, ‘happy’, etc. Davies’s literalist strategy exploits the concept of polysemy: the meaning of emotion words in descriptions of expressive music is related to the meaning of those words when used in their primary psychological sense. The relation between the two meanings is identified by Davies in music’s presentation of emotion-characteristics-in-appearance. I will contend that there is a class of polysemous uses of emotion terms in descriptions of music that is not included in Davies’s characterization of the link between emotions in music and emotions as psychological states. I conclude by indicating the consequences of my claim for the phenomenology of expressive music.

1 Introduction

In this paper, I will discuss Stephen Davies’s defence of literalism about expressive properties in music. I will start with a brief outline of two opposite approaches, the metaphoricist and the literalist. As the claims defended by these two accounts are relative to a specific class of emotional descriptions of music, it is useful to specify the scope of the linguistic expressions in question. These are descriptions of music that attribute psychological properties to it, particularly properties related to the emotional sphere. Music is described as ‘sad’, ‘happy’, ‘despairing’, ‘joyful’, and so on, and these adjectives are presumably not meant to indicate the actual possession of emotional states by the music. Although agreement about these descriptions is hardly ever complete, all the evidence suggests that competent listeners reliably produce similar descriptions of the expressive qualities possessed by music. Davies’s defence of literalism is centred on the concept of polysemy. According to him, the use of emotion words in describing music bears a relation to the central use, in which these words denote psychological states, but it is not identical with it: emotion words refer in the musical case to the presentation of behavioural correlates of emotions. The use of emotion words in describing music is therefore not metaphorical, but polysemous and literal. I will argue that there is a class of polysemous uses of emotion terms in describing music that has not been analysed by Davies. I conclude by presenting the consequences of my claim for the phenomenology of music listening.

2 Metaphoricism vs. Literalism

Metaphoricists tend to couple the idea that descriptions of music as ‘sad’ or ‘happy’ are only metaphorical with a related anti-realist position concerning the metaphysics of expressive properties. For instance, Roger Scruton holds that emotional descriptions of music are ineliminable metaphors, and pairs this thesis with the claim that expressive properties are not real properties of music.[1] Nick Zangwill holds that emotional descriptions of music are metaphors that track the aesthetic properties of music and no other property.[2] Others, such as R.A. Sharpe, take emotional descriptions of music to be the layperson’s way of pointing to musical features.[3]

Against metaphoricists, literalists claim that emotion words applied to music constitute a case of literal use of language.[4] This view is normally paired with a form of aesthetic realism concerning expressive properties. In Davies’s case, the linguistic claim is linked to a metaphysical claim that construes expressive properties as response-dependent properties possessed by the music itself. Relative to a certain kind of experiencing subject (human listeners with a shared evolved nature), music can be legitimately described as possessing expressive properties.

3 The polysemy strategy

Davies’s literalist account appeals to the concept of polysemy. Polysemy is described by linguists as the association of a single phonological shape with two or more systematically interrelated meanings (e.g. ‘mole’ used to refer to both a small burrowing animal and a secret agent working undercover). It is normally contrasted with homonymy, which is the association of a phonological shape with two or more unrelated meanings (e.g. ‘bank’, which may refer to both the riverside and the financial institution). Davies contends that emotion words, when used in describing music, are not used to refer to emotions as psychological states. Instead, they refer to emotion-characteristics-in-appearance (henceforth ECA). These are behavioural correlates of emotions. Although we learn to recognize them from a concurrent actual emotion, to which they give expression, their subsequent recognition is independent from the occurrence of any actual emotion. We describe an actor’s face and posture as sad because we have learnt the sort of faces people pull when in the grip of sadness.

How does Davies explain the shift from actual emotions to ECA in the musical case? The polysemy strategy is grounded on a specific phenomenology of music listening: it is because we experience expressive music as presenting ECA that we may describe it with emotion words used in a secondary sense, related to the primary psychological one. Because this use is not an idiosyncratic use of emotion words in the psychological sense, but rather the use of a different polyseme with a related meaning, the main argument in favour of a metaphoricist strategy is rejected. This argument relies on the apparent impossibility of a literal use of emotion words in the description of inanimate entities such as pieces of music. This is taken to entail that either we are irrationally attributing psychological states to the music, or we are using emotion words metaphorically. Against this interpretation, Davies construes the expressive qualities we attribute to music as depending on the music’s presentation of behavioural correlates of emotions. This results in a use of emotion terms that, although related to the psychological use, differs from it. The emotion terms may therefore be literally applied to music and other inanimate objects (e.g. to a weeping willow presenting the typical downcast configuration of a body bent by sadness).

4 There is polysemy, and there is polysemy

The polysemous use of emotion words in music that Davies describes is based on our musical experience of ECA. According to Davies, music is experienced as resembling typical human emotional behaviour, and this is what grounds our use of emotion words in describing it. I will call this phenomenological foundation of the polysemous use of emotion terms in music the experiential basis of such use. Call the sort of polysemous use I just described ECA polysemy. My claim is that there is a kind of polysemous use of emotion words that is different from ECA polysemy. I call this secondary polysemy, with reference to Wittgenstein’s discussion of secondary meaning, to which I will briefly refer. I will present three examples of secondary polysemy. In these cases, I contend, the experiential basis cannot be traced back to the music’s presentation of ECA.

1) Underdetermined Musical Cues. To illustrate his contour theory, Peter Kivy discusses the well-known case of Monteverdi’s Lamento di Arianna.[5] One might concede that in this case the music bears a clear resemblance to the emotional prosody typical of sadness and despair. Other instances of expressive music might clearly appeal to our spatialized experience of music and recall the demeanour of people in the grip of emotions. But consider now the elated ascending glissando at the beginning of Gershwin’s Rhapsody in Blue. What kind of ECA does this glissando correspond to? Does it recall a vocal or a bodily human expression of emotion? Suppose we allow for both interpretations. The glissando could be heard as a musical presentation of either vocal or bodily behaviour. The trouble with this is that allowing ECA to be underdetermined as to their sense modality might result in an exceedingly ambiguous determination of the emotion expressed by the music. If we interpret the glissando as a rise in pitch, we might hear it as a piercing scream of pain. As a bodily behaviour, it might recall a body rising up as if freed from a burden. To go back to vocal behaviour, the glissando might also be considered analogous to the rise in pitch characteristic of happy or excited speech prosody. In contrast to the relatively determinate character of the emotional descriptions we might offer of this musical stimulus, the ECA we might associate with the glissando are unable to determine its emotional character, shifting as they do from the positive to the negative side of the emotional spectrum. These considerations suggest that the source of the glissando’s expressiveness is not to be found in its connection with the phenomenology of music listening outlined by Davies.

2) Timbre. ECA have an essential connection with emotions as psychological states: they acquire their secondary meaning of mere ‘appearances’ when they occur in isolation from those psychological states. Timbre, although expressive, often cannot be meaningfully interpreted as an ECA. A saturated, dark timbre, rich in overtones and using a low register, has a menacing character. It is natural, for the resemblance theorist, to look at resemblance to the speaking voice. Speaking in a raspy, dark voice and using a low register can sound equally threatening. A raspy, dark voice, however, is not a behavioural correlate of an emotion: when it occurs in humans, it might give what we are saying a menacing and slightly ominous feel, but (a) it does not need to occur when people are in fact being menacing, nor does it often occur in these cases; and (b) it may occur regularly in the expression of other emotions, or when we are expressing no emotion at all (as some individuals simply happen to have a dark and raspy voice). Its expressive character seems independent of its occurrence in human expressive behaviour, something that is definitely not true of things such as prosodic cues or bodily behaviour. Whereas these assume their expressive character through their behavioural association with the expression of actual emotions, timbre does not need such an association.

3) The expressive character of harmonic intervals. I believe that the distinction between the two cases of emotional polysemy lies at the heart of various difficulties faced by resemblance theories of musical expressiveness. For instance, Peter Kivy realized from the outset that it was problematic for a resemblance-based account such as his contour theory to explain cases in which an expressive character is attributed to “isolate” musical features, that is, to bits of music that do not essentially occur as part of a melodic contour. This is a self-inflicted problem for a theory that seeks to explain musical expressiveness in terms of features that require a melodic contour or movement in tonal space. This problem is particularly evident in Kivy’s discussion of major and minor chords, the expressive quality of which he explains through conventional association.[6] Rather than accepting the questionable associationist strategy, one could consider the ascription of expressive quality to these musical entities as a case of secondary polysemy, in which there is no phenomenal resemblance between the music and ECA to justify our ascription.[7] Why should we accept this suggestion? In the absence of a plausible interpretation of chords and harmonic intervals in terms of ECA, I contend that secondary polysemy offers the best candidate for an account of the expressive character possessed by these musical entities.

Another example in favour of this strategy is represented by the tritone as a harmonic interval. We speak of this interval as tense, and indeed we often use it to emphasize the yearning quality of the music when approaching a resolution. Yet the tritone’s resemblance to a tense voice (or to a tense body) is impossible to specify in terms of ECA. A tritone does not sound like a tense voice, because, for one thing, we can produce no more than one pitched sound at a time (unless appropriately trained). Neither could it resemble a tense movement, as hearing movement in music requires a succession of musical sounds, through which we hear movement in the musical space. In conclusion, what the tritone shares with a tense voice is nothing but its being tense, or rather, the fact that it seems to be aptly described as ‘tense’.

As anticipated, I believe that secondary polysemy is an instance of a sort of secondary use of words that has been described by Wittgenstein. Wittgenstein noticed that we talk about mental and physical ‘strains’. We do not find the use of the same word in the two circumstances puzzling, although we are at a loss when it comes to justifying the basis of this linguistic habit.[8] What grounds our use of the same word in these two cases? Wittgenstein denies that we can further specify the resemblance of mental and physical strains, except by pointing to the fact that we find the same word appropriate to describe both situations.[9]

In connection with these remarks, there is a passage in Davies’s most recent defence of his theory of musical expressiveness that is worth discussing.

“I think music is expressive in recalling the gait, attitude, air, carriage, posture, and comportment of the human body, just as someone who is stooped over, dragging, faltering, subdued and slow in his or her movements cuts a sad figure, so music that is slow, quiet, with heavy or thick harmonic bass textures, with underlying patterns of unresolved tension, with dark timbres, and a recurrently downward impetus sounds sad.”[10]

Here Davies mentions various “first-order” properties of music that make it possible to hear it as expressive. After mentioning melodic and rhythmic aspects (which in the passage are paired with behavioural correlates of emotions), he goes on to list ‘dark timbres’ and ‘unresolved tensions’, using a language that is characteristically ambiguous between an interpretation in terms of musical properties and one in terms of psychological properties. Music, he says, is expressive in recalling behavioural correlates of human emotions. It is clear how musical movement might recall the gait, carriage, and posture of people in the grip of emotions. Through the movements of a melodic line in tonal space, music may resemble, for instance, the figure cut by an individual who is expressing sadness. This is the experiential basis for the polysemous use of emotion words identified by ECA polysemy. But how do dark timbres and unresolved tensions secure the recalling of expressive behaviour? Dark and menacing timbres, I have argued, do not receive their expressive quality from their behavioural correlation with the expression of emotions. Unresolved tensions, as exemplified by the case of the tritone, are hard to hear in terms of vocal behaviour. Neither can they be interpreted as bodily behaviour, as their tense character is in principle independent from any “horizontal” musical movement (which seems required for the experience of movement in the tonal space).

Dark timbres and musical tensions, I contend, relate to sombre mood and psychological tension just as physical and mental strains relate to each other in the Wittgensteinian case discussed above. A view that relates musical properties to human emotion-characteristics-in-appearance in order to explain their expressiveness is bound to leave a number of basic cases unexplained, in which music is experienced as possessing expressive or psychological properties.

5 ECA polysemy and secondary polysemy: A sketch for a general distinction

I have so far pointed to a number of examples to suggest that our use of emotion words and other psychological predicates in describing music falls into two different categories: ECA polysemy and secondary polysemy. I have not yet proposed, however, a general criterion to distinguish the two cases. What is the best way to flesh out this distinction? I suppose that this could be done in a number of ways, but I wish to focus here on two aspects: justifiability and resemblance.

1) Justifiability. In the domain of emotional description of music, secondary polysemy differs from ECA polysemy because of differences in what grounds the use of emotion words in the two cases. Whereas ECA polysemy is grounded on Davies’s phenomenological analysis of expressive music, according to which “we experience music as presenting emotion characteristics in its aural appearance”,[11] secondary polysemy cannot be justified in the same way. That is, although we might be able to specify how a slow descending melody can constitute a musical appearance of a body bent and slowed down by sadness, we are not able to do so in cases such as those mentioned above. It is worth noting, in passing, that we might sometimes be able to suggest a causal explanation for cases of secondary polysemy. Suppose we discover that a timbre’s capacity to suggest a particular mood is related to its particular constitution in terms of upper harmonics and its effect on us. This might explain our related use of certain emotion words, but does not constitute a justification of the use in question.[12] Davies correctly identifies the experiential basis of ECA polysemy. This use of emotion words is based on the musical presentation of emotion-characteristics-in-appearance. Secondary polysemy, however, has a different experiential basis, and is therefore differently grounded.

2) Resemblance. Another way to describe the experiential basis of ECA polysemy is to say that music is experienced as resembling human expressive behaviour. This is how we come to hear music as presenting ECA, and it is because of this that theories of musical expressiveness such as Davies’s have been labelled ‘resemblance theories’. Secondary polysemy cannot be analysed in terms of an experience of resemblance with human expressive behaviour. More accurately, we might speak of a ‘resemblance’, or ‘recalling’, but this will have to be a different sort of resemblance from the resemblance in outline and dynamic structure that grounds ECA polysemy. This suggests that the distinction between the two cases of polysemy could be specified by developing a taxonomy of musical resemblance to extra-musical objects that distinguishes between the two cases.

6 Consequences for the phenomenology of expressive music

Davies’s polysemy strategy is linked to a particular phenomenology of music listening. This phenomenology provides the experiential basis on which the polysemous use of emotion terms in descriptions of music is grounded. My analysis of the kinds of polysemy involved in the relevant emotional descriptions of music suggests that there is a class of descriptions that does not acquire its polysemous status from the phenomenology outlined by Davies. Secondary polysemy undermines the argument according to which the literal use of emotion words in describing music is based on their reference to ECA. The experiential basis on which secondary polysemy is grounded must be of a different sort.

If all this is correct, I take it that it will not be without consequences for an account of the phenomenology of musical expressiveness. According to Davies, the expressiveness of music depends on its presentation of emotion-characteristics-in-appearance. He ties his defence of literalism to a phenomenology of musical expressiveness that entails a necessary reference to human expressive behaviour. I have suggested that a reference to human expressive behaviour is not necessary for the experience of expressive properties in music. A more liberal account of the connection between the use of emotion words in a psychological sense and their derivative musical use is therefore paired with a more liberal, thinner phenomenology of music listening. This is, I submit, a subject that deserves further investigation.

About The Author

Matteo Ravasio has a B.A. and M.A. in Philosophy from the Università degli Studi di Milano. He is currently a PhD candidate at the University of Auckland, New Zealand. His main research interests are in the philosophy of music, environmental aesthetics, and everyday aesthetics.


[1] Scruton 1997, p. 154.

[2] Zangwill 2007, p. 392.

[3] Sharpe 1982, p. 81.

[4] A sophisticated and somewhat unorthodox literalist position is the one held by Saam Trivedi (2008). As Davies points out, arousal and expression theories of musical expressiveness also tend to result in a literalist reading of emotional descriptions of music. According to these theories, words such as ‘sad’ and ‘happy’, when applied to music, refer to emotional states experienced by, respectively, the listener and the composer (Davies 2011, pp. 23–24).

[5] Kivy 1980, p. 20.

[6] Ibid., p. 77.

[7] Davies observes, however, that “chords are not merely aggregates of pitched tones” (Davies 2011b, p. 102). Rather, chords are perceived in terms of harmonic function. The tones of which the chords are composed are part of melodic lines that possess their own contour. This line of reply could allow Davies to argue in favour of a complete assimilation of the major/minor chords case to his standard explanation of musical expressiveness.

[8] “‘But why do you speak both of physical & mental ‘strain’?’—‘Because they have a certain similarity, they have a common element.’ What is this common element?’ […] ‘It is a certain tension.’ That doesn’t get us any further, for why do you talk of tension in these different cases.” (Wittgenstein, Ms 105, 14).

[9] Discussing these remarks by Wittgenstein, Michel ter Hark observes: “This does not mean that to speak of similarity here is forbidden. If the point is to emphasize a difference with word ambiguity where two items are referred to by the same word, e.g. ‘bank’, but not in virtue of any similarity, it might be even relevant. The point rather is that if the similarity cannot be specified, it cannot be cited as a justification of one’s use of words” (ter Hark 2009, p. 600).

[10] Davies 2006, p. 182.

[11] Davies 2011, p. 26.

[12] In a German formulation of the remarks discussed above, Wittgenstein considers and rejects a possible causal explanation for the use of ‘strain’ in both the physical and mental domain: “‘Nun weiß ich auch warum ich das immer “eine Spannung” nennen wollte.’ (Ich habe etwa herausgefunden, daß dabei gewisse Muskeln gespannt sind. — Aber das wußte ich eben nicht als ich geneigt war es Spannung zu nennen.)” (Wittgenstein, Ms150, p. 12) [‘I now know why I always wanted to call it “a strain’”. (I have found out that whenever that happens, certain muscles become tense.—But this is exactly what I did not know when I was inclined to call it a ‘strain’)] (my translation).