Left: Vision-and-Language Navigation task. The agent uses verbal instructions and images to navigate through a 3D environment. Right: Performance of the adversarial path sampler (APS), which is based on human-inspired counterfactual reasoning, compared against a data augmentation technique based on random sampling (rand).
The overarching goal of this project is to explore the fundamental differences between how deep neural networks (DNNs) and humans “see,” and how machines can achieve human-level linguistic capabilities. These studies aim to clarify the computational differences between humans and machines, their respective strengths and limitations, and how their interactions influence cognitive functions, leading to new AI applications for human/robot team engineering. Specifically, the aims are to:
Specify visuo-linguistic tasks on which human performance is unmatched by AI. These tasks will include complex visual search in clutter and language-guided search, situational awareness and inferences about intentions from scenes and their mapping to language, and visuo-linguistic navigation.
Characterize differences in human and AI capabilities, decision error types, the internal representations of task information and stimulus space, and robustness to perturbations in the information.
Use functional magnetic resonance imaging (fMRI) in conjunction with computational techniques (encoding models, multivariate pattern classifiers, and representational similarity analysis) to identify the brain areas, large-scale neural architecture, and neural representations that mediate visual commonsense reasoning.
Investigate how DNNs can learn augmented knowledge structures about the visuo-linguistic stimulus space using self-supervised/representation learning. Assess the effect of self-supervised learning on DNN performance in challenging visuo-linguistic tasks.
Assess the influence of DNNs on human perceptual and cognitive decisions. Explore methods to best integrate humans and AI for complex visuo-linguistic tasks. Investigate the strengths and weaknesses of humans and DNNs for scenarios that introduce bottlenecks to human perceptual and cognitive systems and diminish human performance.