
Cognitive Neurology

Disseminating the Elements of Pre-Attentive Processing:
Understanding Visual Perception through Psychophysics and the Neural Processing Mechanisms

October, 2020

Summary: The first interaction a person has with a new environment is visual, and it takes place within about 250 msec, i.e. the duration of a standard eye fixation. Within this short duration, elements are picked up by the eye and patterns are assimilated in the brain; this is defined as pre-attentive processing. Only some elements from the visual scene are processed by the human brain in this short period. Studies in neuroscience attribute this selective response of neural receptive cells to object properties of shape, color, form (through edge and region identification), spatial positioning and motion. This paper focuses on the neural mechanisms that contribute to the distinctiveness of a visualization, through form, color, depth, motion and parallel processing, and connects them to the learnings from theoretical models provided by psychophysics, to understand how humans perceive their surroundings through pattern-seeking behaviour. The paper then underlines the attributes that ‘pop out’, based on the understanding derived from neuroscience and the findings of psychophysical experiments.

Methodology

This paper is divided into two sections. The first section focuses on the pathway of human visual search: it starts with the eye movements that scan a visual scene for stimuli, the formation of an image on the retina, and the transmission of that information through the optic nerve to the brain, followed by the neuro-physiological processes leading from the eye to the visual cortex and the characteristics of the neural cells that process this information. The second section turns to the theoretical models of visual interpretation offered by psychophysics.

 

Part I: Neuro-physiological organization responsible for visual perception

(a) Saccadic Eye Movements: In a visual search, the human eye scans the scene in quick jumps separated by fixations; a standard fixation lasts between 200–400 msec, during which the visual stimulus is imaged on the retina. This image is formed on the central region of the retina, called the fovea, the region of high resolution or acuity. The rapid eye movements are called ‘saccades’ (Kruthiventi et al., 2017); they take place within a duration of 20–180 msec and encompass an angle of 2–5 degrees on the retina. These eye movements are ‘ballistic’ (Ware, 2019), i.e. once a saccadic movement starts it cannot be suppressed mid-way; until the eye re-fixates, visual input is suppressed (‘saccadic suppression’), and hence any moving objects in the observer’s visual field during this period are missed.

(b) Optic Nerve: It carries the decomposed signal from the photoreceptors of the retina to the brain; its fibres originate in retinal ganglion cells whose concentric receptive fields distinguish between red-green, yellow-blue and dark-light signals (Wandell, 1995).

(c) Lateral Geniculate Nucleus (LGN): The LGN receives an already decomposed signal. It contains relay cells (Barlow, 2009) with distinctive concentric receptive fields (Livingstone & Hubel, 1988). These cells are tuned to patterns of black and white; their ‘tuned filters’ respond selectively to a certain kind of stimuli (Nassi & Callaway, 2009).

(d) Visual Cortex (V1 and V2): V1 and V2 receive signals from the optic nerves of the two eyes; about 40% of visual neural processing happens in this zone (Lennie, 1998; Kravitz et al., 2013). The neural architecture in V1 and V2 is characterized by tuned receptive fields (Livingstone & Hubel, 1988): cells have selective receptive properties, being excited or inhibited by contrasting patterns of black surrounded by white or white surrounded by black, and they respond selectively to elements of form, i.e. properties of size and orientation (Wolfe & Gancarz, 1997; Obermayer & Blasdel, 1993), with respect to luminance, through edge and region identification, spatial positioning, color and motion (Monier et al., 2003).

 

The cortex has simple, complex and hypercomplex cells (Hubel & Wiesel, 1968). The simple cells have receptive fields with distinct ‘on’ and ‘off’ areas, activated by the stimulus qualities of size, shape and position, forming concentric receptive fields. Some cells respond to opponent-color properties, exhibiting excitation at some wavelengths while being inhibited at others. Complex cells are responsive to ‘slits and edges’, whereas hypercomplex cells display direction selectivity, i.e. they respond to a stimulus when it moves in one direction but do not display a similar response when it moves in the opposite direction.

 

Thus a region of the cortex analyses a visual field in terms of the direction of light and dark contours: it detects the movement of contours, the types of contours (light against dark or vice versa), and changes in the direction or curvature of contours. This information acts as a semi-dependent but ‘interconnected system of layers’ forming ordered regions, which are ‘mapped in sets of superimposed but independent’ patterns (Hubel & Wiesel, 1968). The regions of V1 and V2 receive parallel inputs from both eyes, and this information is processed simultaneously (Nassi & Callaway, 2009). The information from the visual field is segregated into components of shape, stereoscopic depth, motion and color, formed at dissociable regions of the cortex. The mapping of brain activity shows patterns of ‘iso-orientation contours’, depicting that the orientation preferences of neurons are arranged in a highly regular and organized pattern (Wandell, 1995). The cortical regions of V1 and V2 process this information as feature maps for each element.

(e) Visual Channel Theory: In a holistic perspective, information is received from the external world through the visual or auditory fields. It is processed in V1 and V2 as separate channels, divided into visual and acoustic information. The visual channel is itself divided into two:

(a) a field of luminance, and

(b) color, and the two are processed in isolation.

 

The luminance channel is responsible for perceiving objects depending on the qualities of form and texture (spatial frequency, orientation) and motion (phase and direction); it is processed devoid of color. The second channel processes color differences through red-green and yellow-blue channels (Trobe, 2001; DeYoe & Van Essen, 1988). Experiments from neuroscience and psychophysics show, via the Gabor function, that these neurons filter points in the visual field for shape (via contour) and texture perception, responding strongly to edges of a certain orientation (Barlow, 1972; Cope et al., 2009), and this information is processed as parallel feature maps, one for each feature such as color or vertical and horizontal orientation.
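To make the idea of parallel feature maps concrete, the minimal Python sketch below (assumed here purely for illustration, using NumPy/SciPy) builds one orientation-tuned Gabor filter for the luminance channel and two opponent-color maps. The channel weights and filter parameters are simplifying assumptions, not values from the cited studies.

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(theta, wavelength=8.0, sigma=4.0, gamma=0.5, size=21):
    """Even-symmetric Gabor patch tuned to orientation `theta` (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)       # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / wavelength)
    return envelope * carrier

def feature_maps(rgb):
    """Decompose an RGB image (H x W x 3, values 0..1) into parallel maps:
    orientation maps computed on luminance plus two opponent-color maps."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    luminance = (r + g + b) / 3.0            # simplified luminance channel
    maps = {
        "red_green": r - g,                  # simplified R-G opponent channel
        "yellow_blue": (r + g) / 2.0 - b,    # simplified Y-B opponent channel
    }
    for deg in (0, 45, 90, 135):             # a small bank of orientations
        k = gabor_kernel(np.deg2rad(deg))
        maps[f"orientation_{deg}"] = np.abs(
            convolve2d(luminance, k, mode="same", boundary="symm"))
    return maps

# Usage: a single white vertical bar drives the theta = 0 filter most strongly
# under this parameterization (its luminance varies along the x axis).
img = np.zeros((64, 64, 3))
img[20:44, 30:34, :] = 1.0
responses = {name: m.max() for name, m in feature_maps(img).items()}
print(max(responses, key=responses.get))
```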

In conclusion, this neurological understanding determines, during the early stage of visual processing, which elements will pop out from their surroundings. The forthcoming section focuses on the theoretical models provided by experiments with human observers in psychophysics, which aid our understanding of how humans encode information in pre-attentive processing as they perceive it with just-noticeable differences.

Part II: Theoretical models of visual interpretation: Psychophysics

Psychophysics bases its understanding of pre-attentive processes on experiments with human subjects, “varying the properties of a stimulus and studying their reaction time” (Bruce et al., 2003). The prominent contributions to pre-attentive research were made by Neisser, Marr, Biederman, Treisman, Gancarz, Wolfe and Kahneman. In 1967, Neisser determined that pre-attentive processes ‘could perform tasks across an entire visual field in parallel, and segregate regions of a scene into figures and ground, which are then identified as objects in a later stage. This idea was applied by Treisman to visual search through the Feature Integration theory’ (Wolfe & Gancarz, 1997).

(1) Feature Integration theory

Feature Integration theory distinguishes between two tasks in visual search: the first, pre-attentive stage of visual processing involves a search for basic features of shape, color and movement, and the second stage involves combining this information to perceive the scene as a whole, also referred to as conjunction search. The visual system begins by ‘coding simple and useful properties into a stack of maps’ (Treisman, 1986). Treisman stated that ‘pre-attentive feature processes’ could perform searches in parallel, while attention moving from item to item was required for all other searches (Wolfe & Gancarz, 1997; Cavanagh & Chase, 1971; Mounts, 2012).

Feature Integration theory (Treisman & Gelade, 1980) suggested that ‘attention must be directed serially to each stimulus’ when conjunctions of more than one separable feature have to be distinguished. The guiding attributes were identified by conducting experiments that measured the reaction time in identifying objects in a visual search, and were defined as properties of orientation, color, size and motion that are mapped in different regions of the brain. Physical attributes are noted not only in an absolute state but also relatively, i.e. with respect to other objects in proximity and symmetry. When these images are processed in too short an interval, illusory conjunctions (Kanizsa, 1976) are observed, as the stimuli are processed as combinations of features. It was concluded that the properties of ‘figure ground’ and ‘texture segregation’ are pre-attentively processed.
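As a rough illustration of this distinction, the hypothetical Python sketch below encodes each item as a set of feature values and predicts pop-out when the target carries a feature value that no distractor shares (a single feature map suffices), versus serial conjunction search when the target is unique only in its combination of features. The feature vocabulary and function name are assumptions for illustration, not part of Treisman and Gelade’s formal model.

```python
# Minimal sketch of the Feature Integration Theory prediction:
# a target "pops out" if one of its feature values is absent from every
# distractor; otherwise it is unique only as a conjunction, and the theory
# predicts serial, attention-demanding search.

def predicts_pop_out(target, distractors):
    """Return True if any single feature value of the target is unique."""
    for feature, value in target.items():
        if all(d.get(feature) != value for d in distractors):
            return True          # one feature map alone can flag the target
    return False                 # features must be conjoined: serial search

# Feature search: a red vertical bar among green vertical bars.
feature_display = [{"color": "green", "orientation": "vertical"}] * 11
print(predicts_pop_out({"color": "red", "orientation": "vertical"},
                       feature_display))      # True -> parallel pop-out

# Conjunction search: a red vertical bar among red horizontals and green verticals.
conjunction_display = ([{"color": "red", "orientation": "horizontal"}] * 6 +
                       [{"color": "green", "orientation": "vertical"}] * 6)
print(predicts_pop_out({"color": "red", "orientation": "vertical"},
                       conjunction_display))  # False -> serial search
```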

(2) Conjunction-matching model

Treisman’s model proposes that ‘early vision encodes properties of a scene as feature maps’ which preserve spatial relations to the visual world. Subjects perceive information of color, orientation, size and stereoscopic distance to create an ‘attention spotlight’, and these are stored as descriptions of objects (Treisman, 1986).

 

(3) Attributes that guide pre-attentive processing

In pre-attentive processing, when multiple stimuli are projected they are processed in parallel, and some are suppressed in favour of the stronger stimuli (Wang & Theeuwes, 2020). In a visual field with competition between targets, the factors that help an object stand out are the degree of difference of the target from the non-targets (Duncan & Humphreys, 1989) and the degree of difference of the non-targets from each other. Attributes that can stand out in visual search (Wolfe & Gancarz, 1997) are orientation, size, motion and color (opponent channel) (Wolfe & Horowitz, 2004). The probable attributes (Wolfe & Horowitz, 2004) are ‘luminance onset (flicker), luminance polarity, stereoscopic depth and tilt, convexity, concavity, line termination, closure and curvature.’
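One crude way to operationalize these two factors, shown below purely as an illustrative Python sketch and not as Duncan and Humphreys’ quantitative model, is to score a display by the average target/non-target difference minus the average non-target/non-target difference over simple numeric feature vectors; the feature encoding and the subtraction rule are assumptions.

```python
import numpy as np

def salience_score(target, distractors):
    """Heuristic pop-out score: large when the target differs strongly from
    the distractors and the distractors are homogeneous among themselves."""
    target = np.asarray(target, dtype=float)
    distractors = np.asarray(distractors, dtype=float)
    # Mean target / non-target difference (higher helps the target stand out).
    tn_diff = np.mean(np.linalg.norm(distractors - target, axis=1))
    # Mean non-target / non-target difference (heterogeneous distractors hurt).
    pairs = [np.linalg.norm(a - b)
             for i, a in enumerate(distractors) for b in distractors[i + 1:]]
    nn_diff = np.mean(pairs) if pairs else 0.0
    return tn_diff - nn_diff

# Features encoded as (orientation / 90 degrees, normalized size).
homogeneous = [(0.0, 0.5)] * 8                  # identical vertical distractors
heterogeneous = [(0.0, 0.5), (0.3, 0.9), (0.7, 0.2), (1.0, 0.6)] * 2
target = (1.0, 0.5)                             # tilted target, same size

print(salience_score(target, homogeneous))      # larger: easy, efficient search
print(salience_score(target, heterogeneous))    # smaller: harder search
```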

(4) Theory of Integral and Separable dimensions

Garner’s theory (1974) addresses aspects of visualization that are important for representing multidimensional information. Observations from this theory can be correlated with the findings of visual channel theory. It explores the perception of two elements, such as proportion and color, used to represent a single variable. In ‘integral display dimensions’, two or more attributes of a visual object are perceived holistically and not independently. For example, for data values to be read on different dimensions of a graph, mapping one element to color and another to size will make them understandable. In ‘separable dimensions’, observers tend to make separable judgements about each graphical dimension, such as size, orientation and color. These understandings are used in data visualization to represent varying quantities in maps, i.e. simple visual properties will stand out in visual search more readily than a complex combination of attributes.

(5) Grouping: Principles of Pattern Perception

The principles of pattern grouping can be explained using a set of experiments by Field, Hayes and Hess (1993), which investigated the rules behind integration in early-stage perceptual grouping by exploring the perception of contours as evidence of continuity. In one experiment, subjects were asked to identify elements distributed along a path using Gabor patches. It was observed that subjects could identify the elements along the path even when the distance between the elements was greater than the size of an individual element.

 

The experiment identified the alignment of Gabor patches as playing a major role in the ability to unify elements through integration: the ability to perceive contours was stronger when the patches were aligned and weaker when they were non-aligned. The ability of a line to ‘pop out’ was characterized by the spacing of dots (proximity) and their collinearity; this allowed the dots to combine and be read as a line and allowed the edge to be seen as a whole. Perceived continuity can be correlated to the ‘selective array of cells for orientation and spatial frequency’ (Field et al., 1993).
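The role of alignment can be sketched in code. The hypothetical Python snippet below scores a chain of Gabor elements by how closely each element’s orientation matches the direction of the path segment joining it to the next element: a small mean deviation models an aligned, ‘pop-out’ contour, while a large one models the weaker non-aligned case. The scoring rule and example values are illustrative assumptions, not the stimulus parameters used by Field, Hayes and Hess.

```python
import math

def contour_alignment(positions, orientations):
    """Mean angular deviation (degrees) between each element's orientation
    and the local direction of the path through the element positions."""
    deviations = []
    for (x0, y0), (x1, y1), theta in zip(positions, positions[1:], orientations):
        path_angle = math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180
        diff = abs(path_angle - theta % 180)
        deviations.append(min(diff, 180 - diff))   # orientation is axial (0-180)
    return sum(deviations) / len(deviations)

# A roughly horizontal path of five elements, spaced wider than an element.
path = [(0, 0), (10, 1), (20, 0), (30, -1), (40, 0)]

aligned = [6, -6, -6, 6]          # orientations close to the path direction
misaligned = [80, 95, 85, 100]    # orientations nearly orthogonal to the path

print(contour_alignment(path, aligned))      # small deviation -> contour pops out
print(contour_alignment(path, misaligned))   # large deviation -> weak grouping
```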

  • Pattern perception is governed by the principle of spatial proximity: things that are placed close together are grouped together (a minimal sketch of this idea follows this list).

  • The similarity principle suggests that individual elements having the same size, color or shape are grouped together.

  • Symmetrical patterns attuned to foveal and parafoveal vision are more readily perceived.

  • Alignment, or edge continuity, indicates that elements distributed along a horizontal or vertical alignment in a space signal a relationship between the aligned items.

  • Common region: a continuous closed contour determines a region, and the perceptual tendency is to divide it into ‘inside’ and ‘outside’ (Palmer, 1981).

  • Figure and ground: an object that is being perceived will be read as the foreground, and the rest of the surface on which it is perceived will be the background (Wandell, 1995).

Information that is depicted using these principles is readily perceived.
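As a concrete illustration of the proximity principle mentioned above, the small Python sketch below (a hypothetical helper, not taken from the cited literature) groups points into clusters whenever they lie within a chosen distance threshold of some other member of the cluster, mimicking how closely spaced elements are read as a single unit.

```python
def group_by_proximity(points, threshold):
    """Group 2D points: two points share a group if a chain of points
    connects them with every link shorter than `threshold`."""
    groups = []
    for p in points:
        linked = [g for g in groups
                  if any((p[0] - q[0])**2 + (p[1] - q[1])**2 <= threshold**2
                         for q in g)]
        merged = [p] + [q for g in linked for q in g]
        groups = [g for g in groups if g not in linked] + [merged]
    return groups

# Two tight clusters far apart are perceived, and grouped, as two units.
dots = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
print(len(group_by_proximity(dots, threshold=2)))   # 2
```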

Conclusion

The learnings from neurology and psychophysics indicate that the attributes detected in the pre-attentive processing stage are shape, color, texture and motion. As these properties are processed in separate channels, they will be read separately, and like elements will be grouped together based on the principles of alignment, proximity, similarity, symmetry and region.

Visual noise is perceived when objects from varying layers of information are placed in the same range of frequencies (Greenwood et al., 2009). This is an important consideration for information displays, as it determines which elements can be read at a glance; with knowledge of these attributes, information can be displayed such that it can be identified instantaneously.

BIBLIOGRAPHY

  • Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation (Vol. 2, pp. 89–195). Oxford, England: Academic.

  • Barlow, H. B. (2009). Single units and sensation: A neuron doctrine for perceptual psychology? Perception, 38(6), 371–394. https://doi.org/10.1068/pmkbar

  • Cavanagh & Chase (1971). The equivalence of target and non-target processing in visual search.

  • Duncan, J., & Humphreys, G. W. (1989). Visual Search and Stimulus Similarity. Psychological Review, 96(3), 433–458. https://doi.org/10.1037/0033-295X.96.3.433

  • Fang, Y., Wang, J., Narwaria, M., Le Callet, P., & Lin, W. (2014). Saliency detection for stereoscopic images. IEEE Transactions on Image Processing, 23(6), 2625–2636.

  • Field, D. J., Hayes, A., & Hess, R. F. (1993). Contour integration by the human visual system: Evidence for a local “association field.” Vision Research, 33(2), 173–193. https://doi.org/10.1016/0042-6989(93)90156-Q

  • Hubel, D. H., & Wiesel, T. N. (1968). Receptive fields and functional architecture of monkey striate cortex. The Journal of Physiology, 195(1), 215–243. https://doi.org/10.1113/jphysiol.1968.sp008455

  • Kanizsa, G. (1976). Subjective Contours. Scientific American, 234(4), 48–53. http://www.jstor.org/stable/24950327

  • Kravitz, D. J., Saleem, K. S., Baker, C. I., Ungerleider, L. G., & Mishkin, M. (2013). The ventral visual pathway: an expanded neural framework for the processing of object quality. Trends in Cognitive Sciences, 17(1), 26–49. https://doi.org/10.1016/j.tics.2012.10.011

  • Kruthiventi, S. S. S., Ayush, K., & Babu, R. V. (2017). DeepFix: A Fully Convolutional Neural Network for Predicting Human Eye Fixations. IEEE Transactions on Image Processing, 26(9), 4446–4456. https://doi.org/10.1109/TIP.2017.2710620

  • Le Meur, O., Le Callet, P., Barba, D., & Thoreau, D. (2006). A coherent computational approach to model bottom-up visual attention. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(5), 802–817. https://doi.org/10.1109/TPAMI.2006.86

  • Livingstone, M., & Hubel, D. (1988). Segregation of form, color, movement, and depth: Anatomy, physiology, and perception. Science, 240(4853), 740–749. https://doi.org/10.1126/science.3283936

  • Monier, C., Chavane, F., Baudot, P., Graham, L. J., & Frégnac, Y. (2003). Orientation and Direction Selectivity of Synaptic Inputs in Visual Cortical Neurons. Neuron, 37(4), 663–680. https://doi.org/10.1016/s0896-6273(03)00064-3

  • Mounts, J. R. W. (2012). From classic to current: A look back on attention research in the American journal of psychology. American Journal of Psychology, 125(4), 423–434. https://doi.org/10.5406/amerjpsyc.125.4.0423

  • Nassi, J. J., & Callaway, E. M. (2009). Parallel processing strategies of the primate visual system. In Nature Reviews Neuroscience (Vol. 10, Issue 5, pp. 360–372). https://doi.org/10.1038/nrn2619

  • Obermayer, K., & Blasdel, G. G. (1993). Geometry of orientation and ocular dominance columns in monkey striate cortex. Journal of Neuroscience, 13(10), 4114–4129. https://doi.org/10.1523/jneurosci.13-10-04114.1993

  • Read, J. C. A. (2015). The place of human psychophysics in modern neuroscience. In Neuroscience (Vol. 296, pp. 116–129). https://doi.org/10.1016/j.neuroscience.2014.05.036

  • Treisman, A. (1986). Features and Objects in Visual Processing. Scientific American, 255(5), 114–125. https://doi.org/10.1038/scientificamerican1186-114b

  • Wang, B., & Theeuwes, J. (2020). Salience Determines Attentional Orienting in Visual Selection. Journal of Experimental Psychology: Human Perception and Performance, 46(10), 1051–1057. https://doi.org/10.1037/xhp0000796

  • Wolfe, J. M., & Gancarz, G. (1997). Guided Search 3.0: A model of visual search catches up with Jay Enoch 40 years later. Basic and Clinical Applications of Vision Science, 60, 189–192.

  • Wolfe, J. M., & Horowitz, T. S. (2004). What attributes guide the deployment of visual attention and how do they do it? Nature Reviews Neuroscience, 5(6), 495–501.

  • Palmer, S. E., Rosch, E., & Chase, P. (1981). Canonical perspective and the perception of objects. In J. Long & A. Baddeley (Eds.), Attention and Performance (pp. 135–151).

  • DeYoe, E. A., & Van Essen, D. C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neurosciences, 11(5), 219–226.

  • Trobe, J. D. (2001). The neurology of vision. Oxford University Press.

  • Bruce, V., Green, P. R., & Georgeson, M. (2003). Visual perception: Physiology, psychology and ecology (4th ed.). Psychology Press.

  • Garner, W. R. (1974). The Processing of Information and Structure. Psychology Press, United Kingdom: Taylor & Francis.

  • Wandell, B. A. (1995). Foundations of vision (Chapter 6). Sunderland, MA: Sinauer Associates.

  • Ware, C. (2019). Information visualization: perception for design. Morgan Kaufmann.
