PLOS Computational Biology

Objects influence where people look in moving real-life scenes

Abstract

A model using object-based attention may better represent human gaze behavior in dynamic scenes.

  • Eye movements may not be driven primarily by distinct features in the environment, but rather by objects within the scene.
  • A new computational framework simulates realistic eye movement patterns, including saccades and smooth pursuit.
  • Five different models were created to analyze gaze behavior: two based on spatial saliency, two on object-based attention, and one mixed model.
  • The best-performing model combined object-based attention with saliency to prioritize objects during gaze shifts.
  • This suggests that attention is guided more by the objects in a scene than by the locations of salient features.

Key figures

Fig 1
Space-based model components and process for predicting gaze positions in dynamic scenes
Frames how spatial saliency, visual sensitivity, and inhibition combine to guide gaze shifts in dynamic scenes
pcbi.1011512.g001
  • Panels a (I-III)
    Scene features (I) show saliency maps split between low-level (left) and high-level (right) saliency, with color indicating relative saliency; visual sensitivity (II) is a bright Gaussian centered on the current gaze position (green cross); inhibition of return (III) shows dark areas around previous gaze points, with inhibition decreasing over time (inset graph)
  • Panel b (IV)
    Decision making map shows evidence accumulation for potential targets with red indicating higher decision variable values
  • Panel c (V)
    Gaze update map displays optical flow with color-coded direction and velocity, and the current gaze position marked by a green cross
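The three maps in Fig 1a can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the exponential decay of inhibition, and the multiplicative way the maps are combined are all assumptions chosen for illustration, and the grid and parameter values are arbitrary.

```python
import math

def gaussian_sensitivity(height, width, gaze, sigma):
    """Gaussian sensitivity map that peaks at the current gaze position (row, col)."""
    gr, gc = gaze
    return [
        [math.exp(-((r - gr) ** 2 + (c - gc) ** 2) / (2 * sigma ** 2))
         for c in range(width)]
        for r in range(height)
    ]

def inhibition_map(height, width, past_gazes, now, sigma, tau):
    """Inhibition around previous gaze points.

    Assumption: inhibition strength decays exponentially with the time
    elapsed since each gaze point was visited (time constant tau).
    """
    inhib = [[0.0] * width for _ in range(height)]
    for (pr, pc), visited_at in past_gazes:
        strength = math.exp(-(now - visited_at) / tau)
        for r in range(height):
            for c in range(width):
                bump = strength * math.exp(
                    -((r - pr) ** 2 + (c - pc) ** 2) / (2 * sigma ** 2)
                )
                inhib[r][c] = max(inhib[r][c], bump)
    return inhib

def priority_map(saliency, sensitivity, inhibition):
    """Assumed combination: saliency * sensitivity * (1 - inhibition)."""
    return [
        [s * v * (1.0 - i) for s, v, i in zip(s_row, v_row, i_row)]
        for s_row, v_row, i_row in zip(saliency, sensitivity, inhibition)
    ]

# Toy 5x5 scene with uniform saliency, gaze at the center,
# and one previously visited location at (2, 4).
saliency = [[1.0] * 5 for _ in range(5)]
sens = gaussian_sensitivity(5, 5, gaze=(2, 2), sigma=2.0)
inhib = inhibition_map(5, 5, past_gazes=[((2, 4), 0.0)], now=1.0, sigma=1.0, tau=2.0)
prio = priority_map(saliency, sens, inhib)
```

In this toy setup, the locations (2, 0) and (2, 4) are equally far from the gaze position, but the recently visited one ends up with lower priority because of inhibition.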
Fig 2
Object-based model components for predicting gaze behavior in dynamic scenes
Highlights how object-based attention and inhibition shape gaze shifts and tracking in dynamic scenes
pcbi.1011512.g002
  • Panels a (I-III)
    Module (I) shows the scene feature maps; module (II) shows visual sensitivity focused on the currently foveated object (marked by a green cross); module (III) shows inhibition applied to previously foveated object masks, with inhibition decreasing over time after gaze leaves the object
  • Panel a (Object masks)
    Object masks segment the scene into objects, including two persons
  • Panel b (IV)
    Decision making module accumulates evidence for saccadic target selection for each object over time, shown as decision variable traces with a threshold line
  • Panel c (V)
    Gaze update module tracks the movement of the foveated object mask, showing gaze position moving from a previous location (dashed circle) to a new location (green cross)
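The per-object evidence accumulation described for Fig 2b can be sketched as a simple race between noisy accumulators. This is an illustrative assumption, not the paper's code: the function name, the drift rates, the noise model, and all parameter values are hypothetical.

```python
import math
import random

def select_saccade_target(drift_rates, threshold=1.0, noise=0.1,
                          dt=0.01, max_steps=10_000, seed=0):
    """Accumulate evidence for each object until one crosses the threshold.

    drift_rates maps a (hypothetical) object name to its rate of evidence
    accumulation. Returns (winning object, decision time), or (None, elapsed
    time) if no object reaches the threshold within max_steps.
    """
    rng = random.Random(seed)
    evidence = {name: 0.0 for name in drift_rates}
    for step in range(1, max_steps + 1):
        for name, rate in drift_rates.items():
            # Deterministic drift plus Gaussian noise scaled by sqrt(dt).
            evidence[name] += rate * dt + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        winner = max(evidence, key=evidence.get)
        if evidence[winner] >= threshold:
            return winner, step * dt
    return None, max_steps * dt

# Toy example: three objects with made-up drift rates.
target, decision_time = select_saccade_target(
    {"person_1": 2.0, "person_2": 1.0, "background": 0.5}
)
```

With the noise set to zero, the object with the highest drift rate always wins, and the decision time is roughly the threshold divided by that rate. Adding noise makes both the chosen target and the timing variable, which is what produces a distribution of foveation durations.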
Fig 3
Software architecture and example use cases of the ScanDy gaze behavior framework
Anchors how modular software supports simulating and comparing human gaze models with increasing complexity.
pcbi.1011512.g003
  • Panel a
    Class diagram showing core classes: Dataset provides human data; Model base class supports derived LocationModel and ObjectModel classes with methods for loading parameters, updating features, and simulating scanpaths; ObjectFile supports object information integration; gray classes (BoxSearch, ParameterSpace, Evolution) from neurolib enable parameter exploration and optimization.
  • Panel b
    Workflow diagram illustrating use cases: loading dataset, initializing models, specifying parameters, running single video simulations, visualizing scanpaths, qualitative comparison, overwriting model methods, implementing fitness functions, specifying parameter space, running evolutionary optimization, and functional evaluation; colored boxes group steps by complexity and function.
Fig 4
Human vs simulated eye movement timing and distance statistics during video viewing
Highlights how object-based models better match human eye movement timing and distance than space-based or other models
pcbi.1011512.g004
  • Panel a
    Distribution of human foveation durations (time spent looking at one spot) with a fitted log-normal curve
  • Panel b
    Distribution of human saccade amplitudes (eye movement distances) with a fitted Gamma curve
  • Panels c
    Cumulative distribution functions of foveation durations comparing human data (green) to five model types; human data and test set lines are opaque, training set lines are transparent
  • Panels d
    Cumulative distribution functions of saccade amplitudes comparing human data (green) to five model types; human data and test set lines are opaque, training set lines are transparent; center bias only model (pink) appears to have higher saccade amplitudes
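The kind of comparison shown in Fig 4 can be sketched with the standard library alone. The example below fits a log-normal distribution by maximum likelihood and compares two samples through their empirical cumulative distribution functions. The maximum CDF gap (Kolmogorov-Smirnov distance) is used here as one common similarity measure; whether the study used this exact measure is an assumption. All data in the example are synthetic.

```python
import math
import random
from bisect import bisect_right

def fit_lognormal(samples):
    """Maximum-likelihood log-normal fit: mean and std of the log-values."""
    logs = [math.log(x) for x in samples]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma

def ecdf(samples):
    """Return the empirical CDF of the samples as a function of x."""
    ordered = sorted(samples)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n

def ks_distance(a, b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    cdf_a, cdf_b = ecdf(a), ecdf(b)
    return max(abs(cdf_a(x) - cdf_b(x)) for x in sorted(set(a) | set(b)))

# Synthetic stand-ins for "human" and "simulated" foveation durations (ms).
rng = random.Random(0)
human = [rng.lognormvariate(5.5, 0.6) for _ in range(5000)]
simulated = [rng.lognormvariate(5.7, 0.6) for _ in range(5000)]

mu, sigma = fit_lognormal(human)
distance = ks_distance(human, simulated)
```

A distance of 0 means the two empirical distributions coincide, and 1 means they do not overlap at all, so a smaller value indicates simulated statistics closer to the reference sample.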
Fig 5
Human data vs five models: proportions and timing of four gaze categories
Highlights how models vary in capturing the timing and proportion of gaze behaviors compared to human data, especially foveations.
pcbi.1011512.g005
  • Panel a
    Percentage of foveation events over time for the four gaze categories, Background (maroon) and three object-related categories (orange, yellow, and khaki), in human data and five models; human data shows a visibly higher and more sustained Background proportion than most models.
  • Panel b
    Average proportion of time spent in each foveation category for human data and five models; Background occupies the largest proportion in human data and models, with Detection and Inspection varying across models.

What this is

  • This research investigates how objects influence human gaze behavior in dynamic real-world scenes.
  • It presents a computational framework for simulating eye movements based on attentional mechanisms.
  • The study compares various models of gaze behavior, emphasizing the role of object-based attention.

Essence

  • Object-based attention significantly guides human gaze behavior in dynamic scenes. A model incorporating object-based mechanisms outperforms traditional space-based models, demonstrating the importance of objects in attentional selection.

Key takeaways

  • Object-based models lead to more human-like gaze behavior compared to space-based models. The best-performing model prioritizes objects based on low-level saliency, effectively simulating human exploration patterns.
  • The study's framework allows for systematic testing of hypotheses regarding attentional mechanisms. It integrates psychophysical insights into a modular design that can adapt to various attentional models.
  • Functional analysis reveals that human gaze behavior is influenced by the sequential nature of eye movements. The models effectively capture the dynamics of gaze transitions and object interactions.

Caveats

  • The framework simplifies complex attentional mechanisms, which may limit its ability to capture all nuances of human gaze behavior. Future work should explore additional factors influencing attention.
  • The model's parameters were optimized based on general gaze statistics, which may not fully account for individual differences in exploration behavior. More diverse datasets could enhance its applicability.

Definitions

  • scanpath: The sequence of eye movements made during visual exploration, reflecting where attention is allocated.
  • saccade: A rapid eye movement between fixations, crucial for shifting gaze to new visual targets.
  • fixation: A period during which the gaze remains stationary on a specific location, allowing for detailed visual processing.
