• A new framework for vision: encoding, selection, and decoding.
• A saliency map in primate V1 guides attentional selection exogenously.
• Massive loss of input information, i.e., the attentional bottleneck, starts at V1.
• Peripheral vision is mainly for looking (selection); central vision is mainly for seeing (decoding).
• Feedback from higher visual areas to V1 is mainly directed at the fovea for decoding.
Visual attention selects only a tiny fraction of visual input information for further processing. Selection starts in the primary visual cortex (V1), which creates a bottom-up saliency map to guide the fovea to selected visual locations via gaze shifts. This motivates a new framework that views vision as consisting of encoding, selection, and decoding stages, placing selection at center stage. The framework suggests a massive loss of non-selected information from V1 downstream along the visual pathway. Hence, feedback from downstream visual cortical areas to V1 for better decoding (recognition), through analysis-by-synthesis, should query for additional information and be mainly directed at the foveal region. Accordingly, non-foveal vision is not only poorer in spatial resolution, but also more susceptible to many illusions.