Physical and virtual objects in immersive virtual reality: are there differences in memory associations?


It has long been established that people memorise words associated with objects (e.g. nouns) better when they are exposed to the physical objects rather than to a representation of them (Kirkpatrick, 1894). Studies over the following century have repeatedly demonstrated that word memories are best formed with objects, followed by high-resolution representations (colour pictures), then lower-resolution representations (monochrome pictures), and finally spoken or written representations (Moore, 1919; Bousfield, Esterson, and Whitmarsh, 1957; Cole, Frankel and Sharp, 1969; Deregowski).

There is surprisingly little literature, however, on how the memorisation of nouns using digital or virtual objects compares with their physical counterparts or with other types of existing representation (images, words), or on how aspects of the virtual medium can be leveraged to potentially provide better-than-reality noun memorisation outcomes. This is an interesting gap, as the use of realia (real, culturally correct artefacts) is well established in second language classrooms, and the need for virtual realia was established over two decades ago (Smith, 1997).

Away from nouns, the majority of the literature directly comparing the impact of physical and virtual representations on learning is concerned with both representation and interaction. Studies comparing representation of, and interaction with, physical and virtual objects have reported a range of outcomes. Virtual interactions have been shown to be slower and less accurate for manipulating objects (Chen), equally effective for polyomino manipulation (???) and for learning physics using manipulatives (???, ???), faster for model assembly (???), and to promote distinct learning outcomes for understanding pulley system dynamics (Gire).

There is also evidence of different psychophysical effects of direct interaction with physical and virtual artefacts. Figural After Effects (a change in understanding of the physical size of an object after a period of exposure to an object of a different size) have been noted when using physical artefacts, but not when manipulating virtual ones.

Research from the simulator space could also be useful here, as simulators with high physical fidelity have often proven more effective than those with low physical fidelity (???), which suggests that physical affordances play an important role in learning even when outcomes are digitally mediated.

Against a backdrop of increasingly immersive computing interfaces, and an increase in the use of immersive virtual environments for language education, we believe it is important to understand and quantify any differences between physical and virtual objects in noun memorisation. We also propose to explore one of the affordances of using virtual reality for learning by enhancing the novelty of a virtual-object condition. We hope to understand whether strange, unrealistic presentations of objects in immersive virtual realities create stronger memorisations than realistic virtual reproductions or the real-world physical objects.

Our experiment is a 3×2 within-subject design, comparing participant memorisation of Japanese language words for real-world physical objects, high-quality 3D scans of the objects (experienced through an immersive virtual environment), and scanned objects scaled to unrealistic sizes (also experienced through an immersive virtual environment). In all conditions, we will also vary whether or not participants can interact with the objects. We will monitor pre- and post-exposure scores to understand immediate learning from the experiences.
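As an illustration only, the six cells of this design can be enumerated with a short Python sketch. The condition labels are paraphrased from the description above, and the rotation-based ordering and the gain score (post-exposure minus pre-exposure) are assumptions made for the example; the text does not specify a counterbalancing scheme or a scoring formula.

```python
from itertools import product

# Factor levels paraphrased from the design description; the labels are illustrative.
OBJECT_TYPES = ("physical", "virtual_scan", "virtual_scaled")
INTERACTION = ("interactive", "non_interactive")


def all_conditions():
    """Return the six cells of the 3x2 design as (object_type, interaction) pairs."""
    return list(product(OBJECT_TYPES, INTERACTION))


def condition_order(participant_index):
    """Rotate the condition list by participant index.

    This cyclic rotation is an assumed, simple way to vary presentation
    order across participants; it is not taken from the study itself.
    """
    conditions = all_conditions()
    shift = participant_index % len(conditions)
    return conditions[shift:] + conditions[:shift]


def learning_gain(pre_score, post_score):
    """Assumed immediate-learning measure: post-exposure minus pre-exposure score."""
    return post_score - pre_score
```

Because the design is within-subject, every participant meets all six cells; under this sketch only the order in which they are met differs between participants.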

Literature Review

Physical objects and noun memorisation

The observed relationship between objects and memorisation goes back a long time (Kirkpatrick, 1894). It has repeatedly been demonstrated that humans form stronger memories around physical objects than around representations such as written words or images (Calkins, 1898; Moore, 1919; Cole, Frankel and Sharp, 1969; Deregowski, 2007). There is also evidence for a hierarchy in object memorisation, with more accurate or detailed imagery producing stronger memorisation outcomes than less accurate or detailed imagery (Bousfield, Esterson and Whitmarsh, 1957).

Why objects create better memorisation outcomes is still unclear. Paivio (1968) suggests the cause could be either some physical stimulus characteristic (such as concreteness) or a coding process around the stimulus (such as embodied cognition). Two physical stimulus characteristics strongly associated with word memorisation are concreteness and imageability. Concreteness refers to the tangibility of a word; highly concrete words often denote physical objects (e.g. piano, chair, bat), while words low in concreteness denote intangibles (e.g. justice, honour). Words that are considered more concrete are easier to recall in memory tests (Paivio, 1967). For memorising nouns, concrete representations (e.g. physical objects) may therefore have better learning outcomes than abstracted representations (e.g. pictures).

Imageability is the ease with which a concept evokes a mental image (Paivio, Yuille & Madigan, 1968). While this is highly correlated with concreteness, there are plenty of examples of highly imageable but non-concrete concepts, such as having fun or dancing (D’Angiulli, 2003–2004; Paivio et al., 1968). However, memorisation benefits from imageability may only relate to abstract items (Richardson, 1975), and may therefore have limited application to object-based learning.

A difference in coding process between objects and representations could be explained by embodied cognition. If cognition is considered a means for action (Wilson, 2002), and therefore rooted in motor control (Sejnowski, 1994), then objects provide a much richer potential for physical engagement. This reflects Glenberg's view (1997) that memory “evolved in service of perception and action in a three-dimensional environment”.

There is evidence for this perspective in neural imaging studies. In studies of monkeys, motor neurons involved in tool use also respond when the monkeys see, but do not use, the tools (Grafton, Fadiga, Arbib, & Rizzolatti, 1997; Murata et al., 1997). Differences have also been recorded between the cognitive pathways that process information related to the identity of objects and those that process action-relevant properties of objects (Mishkin, Ungerleider, & Macko, 1983; Milner & Goodale, 1995; Shmuelof & Zohary, 2005). Therefore, if an object is the potential target of a motor action, its location, orientation and other action-relevant properties get special attention. Objects, then, can be seen as a better method for “encoding of patterns of possible physical interaction with a three-dimensional world” (Jeannerod, 2006), which could explain the better memorisation.

This approach is also supported by linguistic body–object interaction studies, a field concerned with understanding the effects of sensorimotor experience. Research has demonstrated that words for objects that are considered easier to interact with at a human level (e.g. a mask, rather than a ship) elicit faster response rates (Siakaluk, Pexman, Aguilera, Owen and Sears, 2008). This further suggests that it is the potential for human agency over an object that encourages better memory associations.

If Jeannerod's assertion is correct, and memorisation benefits from the “possible physical interaction with a three-dimensional world”, then we should see similar learning benefits for both physical objects and objects presented in immersive virtual environments, where gesture controllers are used for interaction. If not, then there may be something else that causes us to create stronger memories with physical objects.

Virtual objects, immersive virtual objects

Whether it is a physical stimulus characteristic or the coding process around the stimulus that causes enhanced object memorisation, there have been few studies of how virtual objects are perceived and perform compared with physical ones. There is little research comparing noun acquisition between the two conditions, and related investigations often involve an interaction-led learning process, focussing on understanding processes or concepts, in desktop-based environments. In these studies, there is evidence of positive cognitive outcomes for virtual objects compared with physical ones. Kealy (2006) found that, for model assembly tasks, students studied virtual models for significantly longer but assembled them faster than real ones. Finkelstein et al. (2005) noted that students who learned in a virtual condition could build physical circuits more quickly than students who had previously used the physical manipulatives (these students also provided better explanations and scored better when questioned).

Research has also demonstrated distinct, rather than better, cognitive outcomes between the two conditions. In studying pulleys, students obtained a better understanding of the concept of work (as measured by a conceptual assessment) with the computer simulation, and a better understanding of effort force with the real pulleys (Gire, 2010). In studying direct interaction with physical and virtual objects, Alzayat, Hancock & Nacenta (2014) found that Kinesthetic Figural After Effects (a change in understanding of the physical size of an object after a period of exposure to an object of a different size) occurred with physical artefacts, but not with virtual ones.

Others, however, have found no significant difference between physical and virtual manipulation learning (Klahr, 2007; Zacharia, 2008; Zacharia, 2011; Yuan, Lee & Wang, 2010). In related work, Chen found that, for a virtual object-location task, targeting times and errors were higher than for a similar physical object-finding task (Chen, 2014).

Findings from the simulator space, however, suggest that virtual objects are worse for training purposes than physical ones, at least as far as input hardware manipulation is concerned. Better learning outcomes have frequently been linked with higher physical simulator fidelity, such as more realistic controls or conditions for the user (Klauer, 1997; Allen, 2010; Park, Allen and Rosenthal, 2005). This suggests that tactile, realistic physical control aids skill acquisition compared with more abstracted interactions, and it is reflected in simulator designers' aim of making their training simulators replicate the real world as closely as possible (Liu, Blickensderfer, Macchiarella, & Vincenzi, 2009; Proteau, Marteniuk, & Lévesque, 1992).

If we consider the potential for human agency as the most impactful aspect of objects for cognition, then there are large implications for objects in immersive virtual environments (IVEs), which are virtual representations but also offer the potential for rich embodied human agency. Comparative studies between physical objects and those in immersive virtual realities are limited, although there have been some related experiments on interaction versus non-interaction with objects, and on verb and noun memorisation.

Fuhrman et al. (2020) found that allowing hand-controlled manipulation of objects in an IVE provided almost no difference in noun memorisation over the not-handling condition. Ratcliffe (2020) found that, when an interaction condition was added, enhanced memorisation occurred only for verbs, not nouns. Comparing these, it seems that for the memorisation of nouns, it does not matter whether objects in IVEs are interacted with or not. Whether this means that the potential for movement, rather than actual movement, is what is important remains to be understood.

There is evidence that adding stereoscopic 3D to an image, as IVEs do compared with monitor displays, does not aid memorisation (Kaplan-Rakowski, 2016), which suggests that mere 3D presentation does not afford much of a benefit. If IVEs are strongly embodying, and thus enabling cognition, then according to the viewpoints of Sejnowski (1994) and Glenberg (???), we should see results either similar to the real world, or enhanced through other beneficial properties of IVEs, such as motivation, presence and distraction reduction (Howard, 2019).


Virtual environments provide us with the opportunity not just to reproduce our normal environment, but to extend it in ways that could prove more beneficial for memorisation than would be feasible in the physical world. One potential vector for this is novelty, as there is a well-evidenced link between novelty and memorisation (Bunzeck & Düzel, 2006).

Within studies of novelty and memorisation, there is much discussion about designing to engage the benefits of novelty without leading to distraction or alienation. There is some evidence that experiences that share a commonality with past experiences ('common novelty') provide the best learning outcomes (Duszkiewicz, McNamara, Takeuchi, 2019). The benefit of commonality has also been noted when learning in IVEs, with objects needing to be consistent with a learning 'schema' (e.g. the environment they are presented in) for memorisation to be more efficient (Mania, Robinson, Brandt, 2005).

The benefits of novelty for learning also extend beyond the time of encoding. Ballarini, Martínez, Perez and Moncada (2013) found that, as long as a novel experience happens within a few hours of a learning occasion, memorisation is improved.

By incorporating novelty into this experiment, we hope to help answer one of the ultimate questions of IVE learning: how do we improve on real-world tuition?