PROMETEICA - Revista de Filosofia y Ciencias. 2025, v. 32
Articles
https://doi.org/10.34024/prometeica.2025.32.19603
THE THEORY-LADENNESS OF PERCEPTION ARGUMENT CRITERION
CLASIFICANDO LAS TEORÍAS DE LA INTERPRETACIÓN PICTÓRICA
El criterio de la carga teórica de la observación
CLASSIFICANDO AS TEORIAS DA INTERPRETAÇÃO PICTÓRICA
O critério da carga teórica da observação
Juan Pablo Aguilar Martínez
(Universidad Iberoamericana Puebla, Mexico)
juanpabloaguimar@gmail.com
Received: 14/10/2024
Approved: 22/04/2025
ABSTRACT
Analytic philosophy traditionally uses two classification criteria to distinguish theories of pictorial interpretation: mimetic or illusionistic, and natural or non-natural. In this paper, I propose an alternative and previously unconsidered criterion to classify these theories, based on the type of theory-ladenness argument they employ, whether it is strong or weak. I defend that this third criterion offers four advantages over traditional approaches: (i) it allows a clear distinction between theories of image interpretation that could not be differentiated using previous criteria; (ii) it enables the association of theories previously considered distinct;
(iii) it facilitates the recognition of the epistemic and ontological commitments of the theories, allowing a more precise appreciation of their strengths and weaknesses; and (iv) it provides a basis for using literature on the architecture of the mind to argue for or against contemporary theories of image interpretation.
Keywords: image interpretation. cognitive penetration. theory-ladenness of perception. analytic aesthetics.
RESUMEN
En la filosofía analítica tradicionalmente se emplean dos tipos de clasificaciones para distinguir las teorías de la interpretación pictórica: miméticas o ilusionistas; y naturales o no naturales. En este artículo propongo un criterio alternativo y no considerado para clasificarlas, basado en el tipo de argumento de carga teórica que emplean, ya sea fuerte o débil. Defiendo que este tercer criterio ofrece cuatro ventajas sobre los enfoques tradicionales: (i) permite distinguir claramente teorías de la interpretación de imágenes que no podían diferenciarse con los criterios previos; (ii) posibilita la asociación de teorías que anteriormente se consideraban distintas; (iii) facilita el reconocimiento de los compromisos epistémicos y ontológicos de las teorías, permitiendo así una apreciación más precisa de sus
fortalezas y debilidades; y (iv) proporciona una base para utilizar la literatura sobre la arquitectura de la mente para argumentar a favor o en contra de las teorías contemporáneas de la interpretación de imágenes.
Palabras clave: interpretación de imágenes. penetración cognitiva. carga teórica de la observación. estética analítica.
RESUMO
Na filosofia analítica, tradicionalmente se empregam dois tipos de classificações para distinguir as teorias da interpretação pictórica: miméticas ou ilusionistas; e naturais ou não naturais. Neste artigo, proponho um critério alternativo e não considerado para classificá-las, baseado no tipo de argumento de carga teórica que empregam, seja ele forte ou fraco. Defendo que este terceiro critério oferece quatro vantagens sobre as abordagens tradicionais:
(i) permite distinguir claramente teorias da interpretação de imagens que não podiam ser diferenciadas com os critérios anteriores; (ii) possibilita a associação de teorias anteriormente consideradas distintas; (iii) facilita o reconhecimento dos compromissos epistêmicos e ontológicos das teorias, permitindo assim uma apreciação mais precisa de suas fortalezas e fraquezas; e (iv) fornece uma base para utilizar a literatura sobre a arquitetura da mente para argumentar a favor ou contra as teorias contemporâneas da interpretação de imagens.
Palavras-chave: interpretação de imagens. penetração cognitiva. carga teórica da observação. estética analítica.
The ambiguity of images poses a problem for philosophical theories that aim to explain pictorial interpretation. For example, a photograph of a boxer can be interpreted as representing a man in a specific posture or as representing a robust, short-haired man (cf. Wittgenstein, 1953, §22).
Throughout history, art theory and cultural studies have developed various theories of pictorial interpretation (henceforth PI) to explain why different observers opt for one interpretation over another or why they agree or differ in their interpretations. Analytic philosophy has also addressed the problem of PI by expanding the unit of analysis to include any figurative image, even those outside the artistic realm (e.g., children's or animals' drawings, photographs, or scientific diagrams), and by treating the problem of PI as an instance of the problem of 'underdetermination'. From this perspective, PI is the process of formulating a hypothesis about the image's semantic content, and explaining this process involves clarifying why one hypothesis is preferred over others that are consistent with the same data (cf. Stanford, 2021).
Various PI theories, and criteria for classifying them, have recently emerged in discussions in analytic aesthetics and philosophy. The two most orthodox classification criteria distinguish, respectively, between theories that hold that PI is a process of capturing objective resemblances and those that reject this hypothesis (Hyman, 2006), and between theories that propose that image interpretation is a "natural" process and those that do not accept this idea (Carroll, 1999, 2003).
In this article, I propose an alternative method for classifying PI theories. The criterion distinguishes them according to the type of theory-ladenness of perception argument (henceforth TLPA) they use to address the contrastive underdetermination posed by images. I will begin by describing (Section II) the two most orthodox criteria proposed in the literature on PI. In Section III, I will present my proposal, which distinguishes PI theories according to whether the TLPA they defend is weak or strong. In Section IV, I will argue that this criterion makes it possible to differentiate and group theories that, under other criteria, would not be clearly distinguished or associated, and to identify more efficiently some of the epistemic and ontological commitments of the theories, along with their strengths and weaknesses. Finally, in Section V, I suggest that this criterion helps explain why exploring other discussions, such as the architecture of the mind, is necessary to decide between specific PI theories.
John Hyman (2006) proposes the first orthodox classification criterion, distinguishing between "mimetic" theories and illusionist or psychological theories (Hyman, 2006; p. 69).
Mimetic theories hold that an image is a marked surface that imitates the visible shapes and colors of the objects it represents, and that we interpret it by capturing this resemblance. This position can be traced back to Plato, who claims that a picture represents an object by copying its color and form (Hyman, 2006, p. 61).
Up until the 19th century, art critics like John Ruskin defended this theory to explain image representation, arguing that the artist should aspire to have an "innocent eye" to capture the actual visual appearance of objects as they are, eliminating his biases (Gombrich, 1960, p. 228). Today, authors such as Kendall Walton (1984), John Kulvicki (2013), and Laura Perini (2015) still defend certain aspects of mimetic theory from the perspective of aesthetics and analytic philosophy to explain how certain types of images are represented and interpreted.
For example, according to Walton (1984), photographs are "transparent" in the sense that they allow us to see directly what they represent through them (without implying that seeing something through a picture is the same as seeing it without the photograph) (Walton, 1984, p. 211). Walton maintains that when observing a photo or video of a person or event, we see that person or event even if those people have passed away or if those events are not broadcast "live" (Walton, 1984, p. 251).
Walton considers some objections to his position. For example, if someone argues that we do not perceive objects in photographs because the events captured occurred in the past or because we perceive them through an instrument, Walton replies that the same happens with the perception of stars through a telescope (Walton, 1984, p. 252). He emphasizes that, despite this, people do not commonly deny that they perceive what they observe through such instruments. What telescopes, microscopes, mirrors, and photographs all share, according to Walton, is that whatever we perceive through them is or was in the environment; if it were not, we could not observe it through them.
Rather than focusing on criticizing or defending Walton's position, I want to highlight that his proposal can be classified as a mimetic theory, because if we define resemblance as a relationship that something can have objectively with itself, and if we concede that what we see in a photograph is the same object it represents (even though seeing an object is not the same as seeing it through a photo), then it follows that what we observe in a photographic image is objectively similar to what it represents. Therefore, if we interpret Walton as accepting that these resemblances determine the photograph's content, then his proposal would be a mimetic theory.
It is interesting to note that, despite Walton’s thesis being highly controversial from a philosophical perspective, some aesthetic or film theories defend or assume the same position (cf. Bazin, 1960a; Sontag, 1966).
Another mimetic theory recently defended in philosophy is that of Laura Perini (2015), who argues that some scientific representations are objectively similar to what they represent. Perini illustrates her case with "Lewis structures" where, although the letters referring to the atoms are arbitrary with respect to what they represent, the position of the dots is not, since under certain conditions the dots can indicate whether or not the represented elements are sharing electrons. Although understanding the conventions governing the structure's use is essential to comprehend what the Lewis diagram represents, this does not mean, according to Perini, that it is not interpreted based on the assumption that it has objective similarities with its referent. In this case, the objective resemblance between the structure and its referent lies in the number of represented elements and their organization (i.e., the "informal isomorphism" between them). We interpret the diagram by accepting this resemblance.
John Kulvicki (2013) also argues that some images are objectively similar to what they represent. Kulvicki distinguishes between the bare-bones content of an image (some of the abstract properties of the image) and the fleshed-out content (e.g., people, animals, furniture that we see in the image) (Kulvicki, 2013; p. 538). The abstract properties that Kulvicki describes as bare-bones content are those syntactic-semantic characteristics that remain in copies of the images and do not remain in the linguistic description of the image or a text (Kulvicki, 2013; p. 542). For example, in the case of a photograph of a horse and a photograph of a photograph of a horse, the fleshed-out content of both images is not the same: the first photograph is of a horse; in contrast, the fleshed-out content of the second photograph is a photograph. However, the bare-bones content of both images is the same: the same patterns of light, darkness, and color are represented in both images (Kulvicki, 2013; p. 542).
According to Kulvicki, (authentic) images are only those capable of sharing their bare-bones content with copies of themselves in the same format. Under this criterion, the pictorial representation provided by a Magnetic Resonance Image (MRI) is not an image because it does not meet this condition, since an MRI cannot be made from another MRI (Kulvicki, 2013; p. 544). However, an audio recording of a voice is.
How do we interpret a pictorial representation? It depends on whether it is an authentic image (i.e., whether it can share its bare-bones structure with a copy) and whether it is a visual image (i.e., not a non-visual image, such as a sound recording). According to Kulvicki, if it is a visual image, then we can assume that part of its bare-bones content is visual and that this content objectively shares abstract properties (shapes, relationships, links, ratios) with the object it represents (Kulvicki, 2013; p. 543). In other words, Kulvicki suggests that since we assume that visual images share bare-bones content with what they represent, we can interpret them in a way consistent with that premise, regardless of how we interpret their fleshed-out content. If this reading of Kulvicki’s position is correct, then his proposal can be characterized as a mimetic theory because it implies that pictorial interpretation consists of identifying the abstract similarities between the representation and the object represented.
Illusionist theories propose that "a picture is a marked surface that produces the experience that is normally caused by seeing an object of the kind depicted in the image” (Hyman, 2006, p. 60). These theories are closely related to modern theories of perception (e.g., Descartes’ Optics) in two interesting ways: they accept that an external stimulus triggers the constructed representation (by the mind or by the artist), and they do not commit to the existence of an objective resemblance between the representation and what is represented (cf. Caldarola, 2011, p. 30). Hyman calls these theories “illusionist” because they hold that the perceived similarities between the image and the represented object are illusions of the viewer (Hyman, 2006).
Ernst Gombrich’s (1960) proposal is one of several "illusionist" theories of PI. Gombrich compares interpreting images to the process of conjectures and refutations that Karl Popper describes in science (Gombrich, 1984, pp. 85, 219, 222). According to Gombrich, the interpreter attributes content to the image guided by a schema until encountering an obstacle, after which they seek another schema to interpret it (Gombrich, 1984, pp. 60, 219-220). In this proposal, interpretation occurs before the observation of the image. Therefore, we do not interpret the image because it objectively resembles what it represents; rather, we interpret it because it fits the schema we have about what it represents. Thus, the interpretation can vary depending on the proposed schema.
In analytic philosophy, authors such as Goodman (1968), Wollheim (2012), and Stokes (2014) could be regarded as defending illusionist positions.
The distinction between mimetic and illusionist theories groups together theories that should be more clearly distinguished. For example, although Perini’s (2015) theory is mimetic in one sense, it acknowledges that interpreters also rely on their knowledge of conventions to interpret diagrams; these conventions guide them in focusing on the properties the image shares with its referent. This claim applies equally to diagrams, figurative drawings, and black-and-white photographs (Perini, 2004, p. 41). In this sense, her theory is very different from Walton's (1984) mimetic theory of the interpretation of photographs, which suggests that we understand a photographic image without needing any rule to guide us on what we should attend to.
Furthermore, the mimetic/illusionist distinction overlooks theories that neither commit to the idea that the image resembles what it represents nor claim that the viewer's experience of the image enables its interpretation. For example, the following section describes some psychological approaches that may suggest images are interpreted through subpersonal mechanisms (outside of experience).
On the other hand, the mimetic/illusionist distinction separates theories that arguably should be grouped together. For example, under this criterion, proposals like those of Walton (1984) or Schier (1986) would be separated from Gombrich's (1960). However, the proposals of all these authors are similar in that they suggest we have a natural mechanism that determines pictorial interpretation, in contrast with proposals that claim image interpretation is fundamentally a cultural matter, such as Goodman’s (Carroll, 1999, p. 40).
The second orthodox criterion for classifying PI theories that I will describe was proposed by Noël Carroll (1999, pp. 39-50; 2003). According to this criterion, we can differentiate between 'naturalist' theories and those that compare image interpretation to language interpretation.
Naturalist theories assert that humans acquire the ability to recognize images at the same time we acquire the ability to recognize the objects they represent (Carroll, 2003, p. 19). Moreover, this perspective holds that our "natural abilities" allow us to distinguish the image as an image, so we do not identify it with the object it represents (Carroll, 2003, p. 19).
Noël Carroll (1999), Flint Schier (1986) and Susan Sontag (1966) defend or assume this position. This perspective can also be attributed to those who defend the orthodox intuition that photographic images are connected inherently (i.e., not conventionally) to what they represent, as Eastman, Stieglitz, Emerson, Bazin, and, from a philosophical standpoint, Walton (1984), Scruton (1981), and Currie (1990) did (Linares, 2018, p. 217). Furthermore, in recent decades, several studies in experimental psychology have been interpreted as supporting naturalism.
For example, in a study by Jean Dirks and Eleanor Gibson (1977), infants up to five months old were habituated to a "live" face. The authors found that the infants did not dishabituate when a color photograph of the same face was presented immediately afterwards, but did dishabituate when the image showed another person's face. The study was replicated using dolls and their photographs. The authors concluded that infants of that age could identify the content of photos without prior training.
Judith DeLoache (1998) also conducted experiments providing systematic and robust evidence that humans can identify the content of photographs or some images, even without prior experience. DeLoache assessed the performance of infants aged 8 to 18 months in communities with no experience with images and found that these infants, like middle-class American infants of the same age, manipulated images in ways that attempted to interact with the represented objects as they would with real objects. This finding suggests that they identified the content of the photos. Animal cognition and primatology researchers have conducted similar experiments (cf. Fagot, 2010).
The results of these experiments are used to support the naturalistic thesis, as they suggest that PI is not learned culturally, but rather acquired simultaneously with the ability to recognize objects.
Naturalism is also present in discussions about representation and PI outside philosophy and psychology. In film theory, André Bazin seems to assume the premise that we interpret images using the same natural resources with which we observe reality (the naturalist thesis) to argue that cinematography that employs techniques (of photography and editing) that make the sequence of images more akin to direct observation of reality will be better interpreted (cf. Bazin, 2004, pp. 30-33).
In contrast to naturalist theories, Carroll (2003, p. 14-15) describes theories that compare image and language interpretations. He distinguishes two central positions within these theories: those that suggest the "compositionality" of images and those that appeal to conventions.
Compositionalist theories compare images to language, assuming that images acquire meaning in the same way sentences do: from the parts they consist of and the order in which those parts are organized.
Some theories in the field of cinema and art have defended or implied this position. For example, Lev Kuleshov and Sergei Eisenstein argue that cinematic content is created from its parts and that the same parts can generate different meanings depending on how they are organized (cf. Kuleshov, 1929/1974, pp. 42-55; Eisenstein, 1949, pp. 52-60). Pudovkin, Kuleshov’s disciple, highlights that Kuleshov was the first to describe cinema as an "inarticulate alphabet" and the filmmaker as someone who "speaks" through images.
In art criticism, John Berger starts from the same premise, asserting that images are like language because the order in which they are presented produces different meanings (cf. Berger, 1972, p. 23-29).
The main problem with the compositionalist position lies in the asymmetries between language and images: for instance, images do not have discrete parts with defined functions, nor do they have inviolable rules for ‘forming well-formed images.’ Carroll argues that, given these differences, compositionalism has not been seriously defended and has only been taken as a metaphor (Carroll, 2003, p. 14).
However, recent proposals in philosophy have defended this position. Ben Blumson (2014), for instance, suggests that some images, such as maps, can be interpreted through a compositionalist theory according to the content stipulated for their parts. From his perspective, what a map represents depends on what its atomic regions represent, and the meaning of these atomic regions is based on what the colors and shapes that compose them represent. Thus, the colors on the map function as predicates in a sentence, and the regions are interpreted as names.
Blumson responds to the objection that no defined method determines the constituent parts of images by arguing that some of the criteria used to determine the most appropriate theories for segmenting sentences in languages can also help decide between competing theories on how to divide images into parts and reveal how these parts contribute to the complete representation of an image (Blumson, 2014, p. 100).
One of the criteria Blumson finds helpful in deciding between theories that divide images into different parts is the "structural constraint." In the context of images, this constraint suggests that the axioms of an image P imply what P represents if and only if what P represents can be rationally inferred from what the axioms represent. Blumson does not explicitly define what counts as rational inference, but from his text we can assume that it includes deductive, inductive, and analogical reasoning.
Due to space constraints, I will not elaborate further on Blumson's argument, but I want to emphasize that the "compositionalist" position has not only been used as a metaphor or superficially in film and visual arts theories but has also been proposed in philosophy with explanations that aim to be more comprehensive.
The second “non-naturalist” position distinguished by Carroll is the conventionalist one. Conventionalist theories emphasize that images acquire content or are understood through conventions, which correlate visual configurations with objects or events and can vary over time or from place to place (Carroll, 1999, p. 40). This position explains different pictorial styles and interpretations of the same images by appealing to the variability of codes for creating and interpreting images. Conventionalism can also explain why many images are understood similarly, asserting that this is due to the cultural diffusion or inculcation of specific codes and conventions worldwide (Carroll, 2003, pp. 17-18).
The conventionalist stance can be traced to art theorists such as Roman Jakobson and Leo Steinberg in the early 20th century (Hyman, 2017) and semioticians like Umberto Eco.
In analytic philosophy, Nelson Goodman (1968) advocates this position. Goodman argues that people can describe objects in as many correct ways as there are proper ways to describe them (e.g., a cluster of atoms, a complex of cells, a violinist) (Goodman, 1968, p. 6). He also explains that descriptions, whether linguistic or pictorial 'labels,' are correct or incorrect according to the system of representation to which they belong (Goodman, 1968, pp. 37-38). For Goodman, when we assert that an image is, for example, of a mermaid and not of Pegasus, we are indicating that it is included in the extension of the term ‘images of mermaids’ and not in that of ‘images of Pegasus,’ although, strictly speaking, these two images represent the same thing: the empty set (Goodman, 1968, p. 21). Goodman holds that the same applies to images representing indeterminate entities or entities that do exist: we categorize them with labels not because they belong to a natural class but because these categories are consistent with our theoretical and pragmatic interests or with our systems of beliefs, which may be generated by stipulation or habit, in the same way that furniture is classified as desks, tables, or chairs (Goodman, 1968, pp. 36-38).
Ernst Gombrich (1960) and Catherine Abell (2005) have defended and described, respectively, different conventionalist theories from Goodman’s.
Although the distinction between naturalist and non-naturalist theories can help us differentiate between theories like Goodman’s (non-naturalist) and Plato's (naturalist), it also generates some questionable groupings.
For example, according to this classification, Gombrich's and Goodman's theories would fall into the same category. Both emphasize the role of conventions in determining the pictorial content of images.
However, these theories differ significantly in aspects that a classification criterion for theories of PI should consider. While Gombrich proposes that interpretations are guided by objective factors such as the limitations inherent in the psychology of perception or the visual properties of the image, Goodman's theory suggests that the correctness of an interpretation depends on the type of (stipulated) system to which it belongs. If we consider it important to distinguish theories that assert that interpretation is constrained only by cultural factors from those in which the visual properties of the image play a role, then a third classification criterion is necessary.
The criterion I propose is to differentiate theories based on the type of TLPA they support. In general terms, TLPA makes two claims: first, that viewers of images can have different propositional attitudes, and second, that these attitudes "influence" the viewer's perception of the image. I propose classifying theories based on whether they use a weak or strong TLPA. The difference between the types of TLPA lies in how the argument describes the nature of the influence that propositional attitudes exert on the perception of images.
The idea behind the strong TLPA is that the viewer's attitudes influence the phenomenal visual experience of the image (from now on, visual experience) and, consequently, its interpretation. For example, Daniel T. Levin and Mahzarin R. Banaji (2006) designed a series of experiments whose results suggest that racial categorization can affect the perception of lighting in facial images.
In one of their experiments, they created a continuum of faces, ranging from a prototypical white face to a prototypical African American face, while matching the lighting and reflectance of all of them. When “asking” participants which stimulus was darker and which was lighter through tasks involving matching faces to adjustable colors, colors to adjustable faces, and faces to adjustable faces, participants “responded” that the images of prototypical black faces appeared darker than they were, and the images of prototypical white faces appeared lighter than they were. The bias was also reported with uniformly colored faces, suggesting that it was not due to participants focusing on certain parts of the images.
Translating the conclusion of these experiments to the issue of PI, strong TLPA argues that if it is conceded that the interpretation of an image is determined by the visual experience one has of it, and if it is also accepted that visual experience changes according to the propositional attitudes of the interpreter, then it must be accepted that interpretation changes according to the propositional attitudes of the interpreter.
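The structure of this argument can be made explicit in a short schematic. What follows is only a sketch under stated assumptions: the symbols $s$ (the image), $a$ (the interpreter's propositional attitudes), $E$ (visual experience), and $I$ (interpretation), as well as the functions $f$ and $g$, are notation introduced here for illustration and do not come from the cited literature.

```latex
% Requires the amsmath package.
% Hypothetical notation, introduced only for illustration:
%   s = the image (stimulus), a = the interpreter's propositional attitudes,
%   E = visual experience,    I = interpretation.
\begin{align*}
\text{(P1)}\quad & I = f(E)
  && \text{interpretation is determined by visual experience} \\
\text{(P2)}\quad & E = g(s, a)
  && \text{visual experience changes with propositional attitudes} \\
\text{(C)}\quad  & I = f\bigl(g(s, a)\bigr)
  && \text{interpretation changes with propositional attitudes}
\end{align*}
```

On this reading, two interpreters with attitudes $a_1 \neq a_2$ who face the same image $s$ may have different visual experiences, $g(s, a_1) \neq g(s, a_2)$, and may therefore arrive at different interpretations, provided that $f$ is sensitive to the difference between those experiences.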
It should be noted that the strong TLPA claims that propositional attitudes modify the observer's phenomenal experience, even when the observer attends to the same properties of the image or the same relationships between them. I also refer to this type of argument as the Cognitive Penetration argument.
Following Athanassios Raftopoulos (cf. 2015, pp. 89-90), I consider the Cognitive Penetration argument a type of TLPA, as it asserts that propositional attitudes influence observation. I regard it as a strong TLPA because it maintains that this influence occurs without necessarily affecting the states of the vehicles of perception, such as eye movement or attention (Raftopoulos, 2015, p. 89).
Various authors have used a strong TLPA to discuss PI in art and science. Thomas Kuhn (1962) drew on the New Look psychology literature to point out that beliefs can modify the content of what is perceived. Before Kuhn, Norwood Hanson (1958) suggested that scientists' visual experience, whether of images in scientific texts or of what they observe through microscopes and telescopes, is influenced by their propositional attitudes.
One of Hanson's arguments is that when two scientists differ in their reports of what they observe (including the observation of images), the difference often cannot be attributed to post-perceptual judgments. When one observes a Necker cube, for example, the cube is seen one way or another instantaneously (Hanson, 1958, pp. 9-10). Since, for Hanson, post-perceptual judgments take time, he concludes that there are cases where the difference in what is perceived cannot be attributed to such judgments. Therefore, from his perspective, we must accept that in some cases, differences in the image we see are due to having a different visual experience of the same stimulus (Hanson, 1958, pp. 11-12). Hanson proposes that what determines visual experience is the organization of the elements, which depends on the context and the propositional attitudes of the perceiver (cf. Hanson, 1958, pp. 15-16, 20-21).
In addition to Hanson, Kuhn, and Levin and Banaji (2006), other positions in psychology and philosophy that reflect on images use a strong TLPA to explain their interpretation (e.g., Stokes, 2012; Nanay, 2015; Witzel, 2016). These positions either assume or argue that the interpreter's attitudes modify the visual experience of the image and, consequently, its interpretation.
David Hume proposes a weaker TLPA. Read as a hypothesis about pictorial interpretation, "Of the Standard of Taste" (1909) appeals to the theory-ladenness of perception to explain why different people can interpret the same image differently. According to Hume, there are no objective criteria for judging the beauty or ugliness of a work of art, but there are certain standard practices and general principles that have proven effective in obtaining positive evaluations from most people. These principles include not allowing sympathy or antipathy for the author to affect the appreciation of the image, taking time to perceive the details, comparing the work with others of the same genre, attempting to create similar works, and considering the historical context in which the work was created. Hume also suggests that "common sense" can help us understand what artists want to communicate with their work.
Viewed as a theory of interpretation, Hume’s proposal provides a list of recommendations that help the perceiver of an image focus their attention (direct their perception) on the appropriate elements in order to interpret it correctly. For example, by recommending that we set aside our antipathy towards the author of an image, he suggests that our emotions might lead us to focus on specific visual properties of the image that could cause us to interpret it in a way inconsistent with the author's intentions. Read in this way, Hume’s thesis is a TLPA because it proposes that various factors can bias the interpreter's observation and that only through education and training can one avoid focusing on aspects irrelevant to the appreciation of the image. However, it is a weak TLPA because his position does not imply that attitudes modify visual experience without modifying attention (the direction of perception).
Kendall Walton (1970) also offers a weak TLPA in his explanation of image interpretation, as opposed to radical anti-intentionalist theories. Walton’s argument is that if we accept that the aesthetic properties of a work of art depend on its perceptual properties, and if categorizing an image (by knowledge of art) one way or another promotes attention to specific perceptual properties over others, then we must recognize that art knowledge promotes attention to some aesthetic properties over others. Walton provides the example of a static image of a human being in apparent movement: if viewed from the category of photographs, it might be thought to represent a person in motion, but it would not be interpreted this way if viewed from the category of cinema, where it would be thought to represent a human in a static posture (Walton, 1970, p. 346).
Walton’s position is a weak TLPA because it does not comprise the view that conventions modify the visual experience of what is observed without first modifying attention.
The distinctions between mimetic and subjectivist theories, and naturalist and non-naturalist theories, fail to accurately classify certain specific theories that can be characterized by the type of TLPA they employ.
For example, from a mimetic-illusionist perspective (Hyman, 2006), Gombrich’s proposal (as argued in 2.4) is not easily distinguishable from proposals like those of Goodman (1968), Hanson (1958), Kuhn (1962), or Nanay (2015). Moreover, from the naturalist/non-naturalist criterion, it is not easily classifiable, as although Gombrich acknowledges the cultural weight in the creation of the “schemata” that guide interpretation, he also gives substantial weight to the constraints inherent to perception, such as "gestalt" principles (Gombrich, 1984, p. 222).
In contrast, using the strong/weak TLPA criterion makes it easier to classify and distinguish it from similar proposals. Since Gombrich does not argue that interpreters' perception or visual experience changes according to the schema they follow but only asserts that this schema guides the interpreter's observation, his position can be classified as a weak TLPA, in contrast to those of Hanson or Goodman, which use a strong TLPA.
Perini’s (2005) approach is also not easily accommodated within the mimetic/illusionist criterion. Her theory might be classified as non-mimetic because it proposes that the perceived similarities between an image and what it represents depend on the interpreters' knowledge of conventions. However, Perini argues that some diagrams, such as Lewis structures, are objectively similar to what they represent and that we interpret them considering these similarities, so in that sense, her proposal would also be mimetic. In terms of the ‘natural’ or ‘non-natural’ classification, Perini’s proposal is non-natural because it proposes that we need to know conventions to interpret diagrams. Nevertheless, classifying it as non-natural does not allow us to differentiate it from proposals like Goodman’s, which holds that images are not objectively similar to what they represent.
In contrast to the mimetic/illusionist distinction, the weak/strong TLPA classification allows us to distinguish the category to which Perini’s approach belongs because although her proposal suggests that conventions guide the observation of the image, it does not indicate that these conventions modify the perceptual experience beyond directing attention. Furthermore, unlike the naturalist/non-naturalist distinction, classifying Perini’s proposal as a weak TLPA allows us to differentiate it easily from proposals like those of Hanson (1958), Goodman (1968), and Nanay (2015), which defend a strong TLPA.
The same applies to proposals like those of Hume (1909) and Walton (1970). According to the naturalist/non-naturalist criterion, they would be non-naturalist. Moreover, regarding the mimetic/illusionist criterion, they fall under the non-mimetic label. However, classifying them this way would place them alongside theories like those of Hanson (1958), Goodman (1968), and Nanay (2015). In contrast, classifying them using the weak TLPA criterion would distinguish them from those proposals. The same happens with Blumson’s (2014) proposal. His theory is "compositionalist" (non-naturalist) and non-mimetic, as it does not propose that images are objectively similar to what they represent. However, it is confusing to place it alongside theories like Goodman’s (1968), Hanson’s (1958), or Kuhn’s (1962) because it does not suggest that knowledge of the parts modifies the visual experience obtained from perceiving them.
Arguing that the difference between weak and strong TLPA allows us to distinguish or group theories that would not be distinguished or grouped by other classifications is not sufficiently substantive, as someone could make the same claim about other classification criteria relative to the criterion I propose. However, the virtue of classifying theories of PI according to the TLPA they use is that it reveals some of the epistemic and ontological commitments of the theories it distinguishes. Understanding these commitments allows us to better identify each theory's potential strengths and weaknesses.
Regarding epistemic commitments, theories that employ a strong TLPA reject, as a matter of principle, the idea that observation can be neutral and that what is observed can serve to decide between competing hypotheses (whether they are content hypotheses or scientific ones). In contrast, theories that advocate a weak TLPA assume that, although observation is indeed not neutral and what is observed does not serve to decide between competing theories, it could, in principle, be neutral (e.g., if attention could be modulated or if it were restricted to an early stage of the visual process).
Regarding ontological commitments, theories that support a strong TLPA hold there is no difference between observation and cognition (or that there is a continuum between them) in the architecture of the mind. In contrast, theories that support a weak TLPA do not have this commitment.
If these assumptions are correct, then to determine which type of theory best illuminates our understanding of pictorial interpretation, it is helpful to refer to literature that explores the role of perceptual evidence in deciding between theories or the architecture of the mind. In the following section, I describe some discussions about the architecture of the mind that could help us understand which type of PI theory might be better equipped to address some of the challenges posed by that literature.
Fiona MacPherson (2012) asserts that to support the thesis that propositional attitudes alter visual experience, evidence must be provided that the influence of cognition on perception occurs in low-level visual content, such as the colors, texture, and shape of stimuli. MacPherson acknowledges that appealing to an early stage of visual processing (e.g., early vision) and to post-perceptual judgments might discredit the idea that the influence of cognition on high-level properties of a stimulus (e.g., the category to which the stimulus belongs) indicates a penetration of perception by cognition, since it could be argued that in those cases the perceptual experience is not modified; rather, it is the judgments about that experience that change.
For example, one might argue that perceiving a brown stain moving on the ground is a different visual experience depending on whether one has the concept of COCKROACHES or not. However, if one appeals to an early and encapsulated process of vision (cf. Pylyshyn, 1999; cf. Fodor, 1984), it could be stated that the visual experience is the same, and what changes is the post-perceptual judgment about the experience.
According to MacPherson, if it can be demonstrated that cognition influences not high-level properties (such as the category to which a stimulus belongs) but low-level properties (such as its color or texture), then the influence of cognition on perception could not be discredited using the same strategy.
MacPherson (2012, pp. 38-48) reports that a series of experiments conducted by John L. Delk and Samuel Fillenbaum in 1965 suggests that beliefs about the color of objects affect subjects' visual experience of them. The experimental design was as follows: a solid orange sheet of paper was cut into various shapes, some of which were figures typically associated with the color red, such as a "heart" shape used in Valentine’s celebrations or an apple, while others were geometric shapes or outlines of stimuli not generally associated with the color red, such as a horse's head.
These shapes were placed on a background whose color could be changed, ranging from yellow to red, passing through the orange color of the paper used for the shapes. The participants in the experiment could adjust the background color using a button. The experiment aimed to answer whether participants would match the orange color of all shapes, making the background indistinguishable from the figure, or if there would be a shift for objects typically associated with red.
The results indicated that participants matched the color of shapes representing objects commonly associated with the color red with a background that was redder than the color selected for shapes representing figures not usually red.
According to MacPherson, this case suggests a mechanism that allows the subject's attitudes to penetrate their visual experience (MacPherson, 2012, p. 39). Moreover, she maintains that this case cannot be dismissed by claiming that it confuses a post-perceptual judgment or a shift in attention (direction of perception) with the experience of vision, for the following reasons.
Firstly, attempting to dismiss cognitive penetration by attributing a post-perceptual judgment to the participant implies attributing a peculiar mistake: experiencing visually that the paper has a different color from the background but judging (asserting) that they are the same. MacPherson claims that positing this error in the participant seems to serve only to reject cognitive penetration (MacPherson, 2012, p. 41).
Secondly, MacPherson asserts that cognitive penetration cannot be dismissed by claiming that what happens in these cases is a shift in the participant's attention toward the stimulus (direction of perception). In the case of multi-stable images (such as the "duck-rabbit"), such explanations might be appropriate, as the asymmetric properties of the figure make the experience different depending on whether one starts looking from the left side to the right or from the right side to the left. However, this explanation seems inappropriate in the case of laterally symmetric figures with uniform color (like a “heart”). Therefore, MacPherson concludes that the best explanation for this phenomenon is accepting that cognitive penetration occurred (cf. MacPherson, 2012, pp. 31-33).
Christoph Witzel (2016) conducted a series of experiments following MacPherson's argument, which supports the same conclusion. What is particularly interesting about Witzel's experiments is that he used images, allowing a direct association of the cognitive penetration thesis with the strong TLPA to explain PI.
In one of his experiments, Witzel relied on a previous study where participants were asked to adjust the color of an image of a yellow banana to gray. Participants could adjust the color of the image using knobs that allowed shifting the color along an axis towards blue or yellow and another axis towards green or red. In that experiment, participants adjusted the image to a bluish-gray rather than a neutral gray. Since blue is complementary to yellow, the researchers concluded that participants adjusted the image to blue-gray rather than neutral gray because their knowledge ("memory") of the color of bananas (i.e., what the image represented) influenced their visual experience of the color.
In a second series of experiments, Witzel used the bluish-gray banana image from the previous experiment (i.e., the "chromatically adjusted" image) and placed it next to an objectively gray banana image. He then asked participants which of the two bananas was objectively gray. The hypothesis was that if memory influences visual experience, then the participants' perception should be affected in such a way that they would perceive the bluish-gray banana as gray, and the objectively gray banana as slightly yellow. The reasoning behind this hypothesis is that the memory of the yellow color of the banana would override the perception of the bluish-gray image, causing the observer to perceive it simply as gray. Conversely, this should not occur with the objectively gray stimulus, which should be perceived as having a yellowish tone rather than a neutral gray. This was Witzel's finding: a statistically significant proportion of participants perceived the bluish-gray banana as gray. Additionally, Witzel observed that this bias did not manifest in perceiving stimuli that did not evoke the memory of a particular color, such as spherical shapes (Witzel, 2016, pp. 2-3).
Witzel's experiments are relevant for reflecting on PI because if it is accepted that visual experience is relevant to PI, the results suggest that propositional attitudes influence PI. Furthermore, these experiments specifically and directly support theories of PI that use a strong TLPA, as they indicate that the influence of propositional attitudes on perception can affect the low-level properties of the image, which makes it implausible to appeal to attention or post-perceptual judgments to discredit the strong TLPA. Therefore, theories of PI that support this type of argument may be empirically backed.
J.J. Valenti and Chaz Firestone (2019) challenge the idea that color memory influences perception. First, they argue that the effects of color memory do not seem as evident as suggested by the experiments supporting cognitive penetration and thus should be treated with skepticism and subjected to more rigorous evaluation (Valenti & Firestone, 2019, p. 3). Second, they suggest that what occurs in these cases might be an example of post-perceptual judgments being confused with direct perceptions.
Valenti and Firestone designed an experiment similar to Witzel's (2016). However, instead of asking participants to select the gray object from two options, they showed them three figures and asked them to identify the one whose color differed from the others. The display presented (i) a bluish-gray banana, (ii) a bluish-gray disk, and (iii) an objectively gray disk. If color memory influenced perception, as Witzel's experiments suggest, the figure that should appear to have a different color would be the bluish-gray disk (ii), rather than the figure that objectively has a different color, the objectively gray disk (iii).
The hypothesis is that if the effect of color memory were real, the bluish-gray banana and the objectively gray disk would both be perceived as gray, while the bluish-gray disk would continue to be perceived as bluish-gray, standing out from the other two figures. Contrary to this prediction, a statistically significant majority of participants identified the figure with a different color as (iii) the objectively gray disk, i.e., the figure that is objectively different from the other two.
Ned Block also questions the validity of the conclusions of experiments such as those by Witzel. According to Block, the so-called "color memory" primarily affects yellow and blue colors and, to a lesser extent, red and green. In one experiment, for instance, when adjusting the color of a Coca-Cola logo or a typical strawberry, there were barely noticeable effects of "memory over perception." Similarly, no such effects were recorded with objects like a ping-pong table. However, significant effects were observed with paradigmatically blue objects, such as the "Smurfs" or Nivea cans, and with paradigmatically yellow objects, like the German mailbox (Block, 2023, pp. 386-387).
These findings are consistent with other experiments cited by Block, which show that discrimination along the blue/yellow axis is more complex than along the red/green axis. The explanation is that the human perceptual system has a limited capacity to discriminate how much blue needs to be added to transform a yellow image into gray. When participants are asked to adjust a yellow object to look gray, they tend to add more blue than necessary to eliminate any yellow trace. In this case, the effects observed in experiments such as Witzel's can be explained by referring to the properties of the visual system without needing to invoke cognitive penetration (Block, 2023, p. 389).
Block describes other experiments suggesting that the effect of perceiving a gray image as yellow is not due to cognitive penetration. Studies evaluating the effect of color memory using textureless images (e.g., images of bananas without the characteristic spots on the fruit’s skin) found that the effect was considerably smaller. Since these images still had a recognizable shape, if the effect were due to the influence of propositional attitudes on visual experience, it would be expected to manifest with the same intensity (Block, 2023, p. 387).
Chaz Firestone and Brian Scholl (2015) also criticize the methodology of several experiments that suggest a continuity between vision and cognition, arguing that many commit at least one of seven methodological errors. For example, in many cases, experimental design does not adequately conceal the study's objective, which can induce biases in participants' responses. Firestone and Scholl illustrate this point by citing an experiment that concluded that participants carrying a heavy backpack perceived a slope as steeper than it was. However, when participants were provided with a false but plausible story about the reason for carrying the backpack, the effects of the biased perception disappeared.
Firestone and Scholl describe the methodological errors of several additional experiments, suggesting that the experimental literature supporting cognitive penetration should be critically reviewed.
Block, Valenti, Firestone, and Scholl's arguments are far from conclusive, but when considered from the perspective of the discussion of PI theories, they suggest that theories employing a strong TLPA are not empirically grounded. This is because one could explain the cases they are based on by appealing to an early process of vision, post-perceptual judgments, attention effects during observation, poor experimental design, and/or perceptual constraints.
The discussion on cognitive penetration is very complex, and there is a substantial amount of literature addressing it at both conceptual and empirical levels, both in support of and against it. Some arguments reject it by focusing on discerning more processes in attention (Gross, 2017), while others analyze the physical processes involved in perception more specifically, either to reject it, as Raftopoulos (2019) does, or to defend it, as Andy Clark (2023) does.
The goal of this text is not to settle this issue but to highlight that if the classification I propose to distinguish theories of pictorial interpretation is informative and not trivial, then it may reveal which other empirical and conceptual discussions we should explore to better evaluate and distinguish theories of pictorial interpretation.
To understand pictorial interpretation (PI), it is essential not only to analyze the various theories that address this phenomenon but also to compare them in order to identify their similarities, differences, strengths, and weaknesses. This comparative approach allows a more detailed and nuanced evaluation of existing theories. In addition to the criteria distinguishing between mimetic and illusionist theories or naturalist and non-naturalist theories, examining the type of theory-ladenness of perception argument (TLPA) each theory employs can provide deeper insight into its epistemic and ontological commitments. This perspective helps bring into view conceptual arguments and empirical evidence that might guide the selection of the most suitable theory for understanding the phenomenon of pictorial interpretation.
The classification I propose aims to complement the existing ones and help refine the taxonomy of PI theories. It would be beneficial to compare this classification with other taxonomies not considered here to explore their interactions. For example, one could investigate the relationship between theories that adopt a weak or strong TLPA and those that support or reject an "integrative" view of pictorial interpretation (those that argue for or against the idea that one can be simultaneously aware of the surface of the image and what it represents) (cf. Voltolini, 2024). Another area of interest could be the contrast between theories supporting a strong or weak TLPA and those considering pragmatic interpretation, where intention or use determines interpretation (cf. Abell, 2009; Barceló, 2021). Similarly, exploring the relationship between strong or weak TLPA theories and those addressing images from a perspective of local, global, or holistic underdetermination argument (Aguilar, 2022) could be fruitful. Comparing and contrasting these different classifications could more clearly reveal the weaknesses and strengths of each theory, as well as better guide us in evaluating them using tools derived from other literature, such as the mind architecture literature outlined here.
Finally, it is important to emphasize that the analysis of the different classifications and their interactions not only clarifies the weaknesses, virtues, and ontological and epistemic commitments of the theories of pictorial interpretation in the abstract but also has significant implications for theories of art and cinematography that adopt them explicitly or implicitly in concrete discussions.
Aguilar, J. (2022). Tres versiones de argumentos de subdeterminación contrastiva para explicar la competencia pictórica. Revista Xipe Totek, Año 31, Vol. II, No. 118, 85–116.
Atencia-Linares, P. (2018). Why Grey is the New Black, in The New Theory of Photography: Critical Examination and Responses. Aisthesis: Pratiche, Linguaggi e Saperi dell’Estetico, 11 (2), 207–234.
Abell, C. (2009). Canny Resemblance. Philosophical Review, 118, 183-223.
Abell, C. (2005). Pictorial Implicature. The Journal of Aesthetics and Art Criticism, Vol. 63, No. 1 (Winter, 2005), 55-66.
Barceló, A. (2022). How to Visually Represent Structure. In Giardino, V., Linker, S., Burns, R., et al. (Eds.), Diagrammatic Representation and Inference: 13th International Conference, Diagrams. Rome, September 14–16, 2022. Proceedings, Springer, pp. 218–225.
Bazin, A. (1960a). The Ontology of the Photographic Image. In What Is Cinema? University of California Press, 2004, 9-17.
Bazin, A. (1960b). The Evolution of the Language of Cinema. In What Is Cinema? University of California Press, 23-41.
Berger, J. (1972). Ways of Seeing, London, British Broadcasting Corporation and Penguin Books.
Block, N. (2023). The Border Between Seeing and Thinking. Oxford, Oxford University Press. Retrieved from https://global.oup.com/academic/product/the-border-between-seeing-and-thinking-9780197622223?cc=mx&lang=en&
Carroll, N. (1999). Philosophy of Art: A Contemporary Introduction, London, Routledge.
Carroll, N. (2003). Engaging the Moving Image, New Haven, Yale University Press.
Caldarola, E. (2011). Pictorial Representation and Abstract Pictures. Scuola di dottorato di ricerca in filosofia teoretica e pratica, Ciclo XXII. Direttore della Scuola: Ch.mo Prof. Giovanni Fiaschi. Padova, Università degli Studi di Padova.
Clark, A. (2023). The Experience Machine: How Our Minds Predict and Shape Reality, New York, Pantheon.
Currie, G. (1999). Visible Traces: Documentary and the Contents of Photographs. Journal of Aesthetics and Art Criticism, 57 (3), 285-297.
Dirks, J.R., & Gibson, E. (1977). Infants’ perception of similarity between live people and their photographs. Child Development, 48, 124–130.
Eisenstein, S. (1949). “A Dialectic Approach to Film”. In Film Form: Essays in Film Theory, Harcourt, Brace and Company, New York. Retrieved from http://antigo.casaruibarbosa.gov.br/arquivos/file/A_Dialectic_Approach_to%20_Film_Form_SergeiEisenstein.pdf
Fagot, J. (2010). How to read a picture: Lessons from nonhuman primates. PNAS. January 8, 2010 107 (2) 519-520.
Fodor, J. (1984). Observation reconsidered. Philosophy of Science, 51 (March), 23-43.
Pylyshyn, Z. (1999). Is vision continuous with cognition? The case for cognitive impenetrability of visual perception. Behavioral and Brain Sciences, 22, 341-365.
Gombrich, E. (1960). Art and Illusion, Londres, Phaidon Press.
Goodman, N. (1968). Languages of Art: An Approach to a Theory of Symbols. Indianapolis, The Bobbs-Merrill Company, Inc.
Gross, S. (2017). Cognitive Penetration and Attention. Frontiers in Psychology (Hypothesis and Theory). doi: 10.3389/fpsyg.2017.00221 https://pubmed.ncbi.nlm.nih.gov/28275358/
Hyman, J. (2006). The Objective Eye, Chicago, The University of Chicago Press.
Hanson, N. (1958). Patterns of Discovery, Cambridge, Cambridge University Press.
Hyman, J. & Bantinaki, K. (2021). Depiction. The Stanford Encyclopedia of Philosophy (Fall 2021 Edition), Edward N. Zalta (ed.), URL = <https://plato.stanford.edu/archives/fall2021/entries/depiction/>.
Hume, D. (1909). Of the Standard of Taste (1760). Harvard Classics, 1909; Modern History Sourcebook. Retrieved from https://homepages.uc.edu/~martinj/Taste%20Food%20&%20Wine/Aesthetics_of_Food_&_Drink/Hume%20-%20Of%20the%20Standard%20of%20Taste%201760.pdf
Kuhn, T. (1962/1970). The Structure of Scientific Revolutions. International Encyclopedia of Unified Science. Volumes I and II. Foundations of the Unity of Science Volume II. Number 2. Chicago. The University of Chicago.
Kulvicki, J. (2013). Pictorial Representation. Philosophy Compass 1/6 (2006): 535–546.
Kuleshov, L. (1974). “Montage as Foundation of Cinematography”. In Kuleshov on Film: Film Writings by Lev Kuleshov, pp. 42-45. Retrieved from https://opencourses.ionio.gr/modules/document/file.php/DAVA108/kuleshov.pdf
Levin & Banaji (2006). Distortions in the Perceived Lightness of Faces: The Role of Race Categories. Journal of Experimental Psychology. American Psychological Association. 2006, Vol. 135, No. 4, 501–512.
MacPherson, F. (2012). Cognitive Penetration of Colour Experience: Rethinking the Issue in Light of an Indirect Mechanism. Philosophy and Phenomenological Research, Vol. 84, No. 1 January 2012, 24– 62.
Nanay, B. (2015). Cognitive penetration and the gallery of indiscernibles. Frontiers in Psychology, 08 January 2015, Sec. Consciousness Research, Volume 5 - 2014. Retrieved from https://www.frontiersin.org/articles/10.3389/fpsyg.2014.01527/full
Perini, L. (2005). Convention, resemblance, and isomorphism. Studies in Multidisciplinarity, Volume 2, Elsevier B.V., 37-47.
Raftopoulos, A. (2019). Cognitive Penetrability and the Epistemic Role of Perception. London, Palgrave Macmillan.
Sontag, S. (1966). Against Interpretation. In Against Interpretation and Other Essays. Picador.
Scruton, R. (1981). Photography and Representation. Critical Inquiry, 7 (3), 577-603.
Stokes, D. (2014). Cognitive Penetration and the Perception of Art. Dialectica, John Wiley & Sons Ltd., New York, vol. 68, No. 1, April 5, 2014, 1–34.
Walton, K. (1970). Categories of Art. The Philosophical Review, Vol. 79, No. 3, 334-367. Duke University Press. https://philpapers.org/rec/WALCOA-2
Walton, K. (1984). Transparent Pictures: On the Nature of Photographic Realism. Critical Inquiry. Vol. 11. No. 2. pp. 246-277.
Stanford, K. (2023). "Underdetermination of Scientific Theory", The Stanford Encyclopedia of Philosophy (Summer 2023 Edition), Edward N. Zalta & Uri Nodelman (eds.), URL = <https://plato.stanford.edu/archives/sum2023/entries/scientific-underdetermination/>.
Voltolini, A. (2024). Is What We See in the Picture the Same as What the Picture Presents. Journal of Comparative Literature and Aesthetics, Volume 47, 145-155.
Wittgenstein, L. (2017). Investigaciones filosóficas, Instituto de Investigaciones Filosóficas. Ciudad de México, Universidad Nacional Autónoma de México.
Wollheim, R. (2012). Danto's Gallery of Indiscernibles. In Ernest Lepore & Mark Rollins (eds.), Danto and his Critics. Oxford, UK: Wiley‐Blackwell. 30–39.