  • Review Article
  • Open access
  • Published: 25 October 2021

Augmented reality and virtual reality displays: emerging technologies and future perspectives

  • Jianghao Xiong 1,
  • En-Lin Hsiang 1,
  • Ziqian He 1,
  • Tao Zhan (ORCID: orcid.org/0000-0001-5511-6666) 1 &
  • Shin-Tson Wu (ORCID: orcid.org/0000-0002-0943-0440) 1

Light: Science & Applications volume 10, Article number: 216 (2021)


  • Liquid crystals

With rapid advances in high-speed communication and computation, augmented reality (AR) and virtual reality (VR) are emerging as next-generation display platforms for deeper human-digital interactions. Nonetheless, simultaneously matching the exceptional performance of human vision and keeping the near-eye display module compact and lightweight imposes unprecedented challenges on optical engineering. Fortunately, recent progress in holographic optical elements (HOEs) and lithography-enabled devices provides innovative ways to tackle these obstacles in AR and VR that are otherwise difficult with traditional optics. In this review, we begin by introducing the basic structures of AR and VR headsets, and then describe the operation principles of various HOEs and lithography-enabled devices. Their properties are analyzed in detail, including the strong wavelength and incident-angle selectivity and multiplexing ability of volume HOEs, the polarization dependency and active switching of liquid crystal HOEs, the fabrication and properties of micro-LEDs (light-emitting diodes), and the large design freedom of metasurfaces. Afterwards, we discuss how these devices help enhance AR and VR performance, with detailed description and analysis of some state-of-the-art architectures. Finally, we offer a perspective on potential developments and research directions of these photonic devices for future AR and VR displays.


Introduction

Recent advances in high-speed communication and miniature mobile computing platforms have created a strong demand for deeper human-digital interactions beyond traditional flat panel displays. Augmented reality (AR) and virtual reality (VR) headsets 1 , 2 are emerging as next-generation interactive displays with the ability to provide vivid three-dimensional (3D) visual experiences. Their applications include education, healthcare, engineering, and gaming, to name just a few 3 , 4 , 5 . VR offers a fully immersive experience, while AR promotes the interaction between the user, digital content, and the real world, and therefore displays virtual images while retaining see-through capability. In terms of display performance, AR and VR face several common challenges in satisfying demanding human vision requirements, including field of view (FoV), eyebox, angular resolution, dynamic range, and correct depth cue. Another pressing demand, although not directly related to optical performance, is ergonomics. To provide a user-friendly wearing experience, AR and VR devices should be lightweight and ideally have a compact, glasses-like form factor. These requirements, nonetheless, often trade off against one another, which makes the design of high-performance AR/VR glasses and headsets particularly challenging.

In the 1990s, AR/VR experienced its first boom, which quickly subsided due to the lack of suitable hardware and digital content 6 . Over the past decade, the concept of immersive displays has been revisited and has received a new round of excitement. Emerging technologies like holography and lithography have greatly reshaped AR/VR display systems. In this article, we first review the basic requirements of AR/VR displays and their associated challenges. Then, we briefly describe the properties of two emerging technologies: holographic optical elements (HOEs) and lithography-based devices (Fig. 1). Next, we introduce VR and AR systems separately because of their different device structures and requirements. For the immersive VR system, we discuss the major challenges and how these emerging technologies help mitigate them. For the see-through AR system, we first review the present status of light engines and then introduce some architectures for the optical combiners. Performance summaries of microdisplay light engines and optical combiners are provided, which serve as a comprehensive overview of current AR display systems.

Fig. 1. The left side illustrates HOEs and lithography-based devices. The right side shows the challenges in VR and architectures in AR, and how the emerging technologies can be applied.

Key parameters of AR and VR displays

AR and VR displays face several common challenges to satisfy the demanding human vision requirements, such as FoV, eyebox, angular resolution, dynamic range, and correct depth cue, etc. These requirements often exhibit tradeoffs with one another. Before diving into detailed relations, it is beneficial to review the basic definitions of the above-mentioned display parameters.

Definition of parameters

Take a VR system (Fig. 2a) as an example. The light emitted from the display module is projected to a FoV, which can be translated to the size of the image perceived by the viewer. For reference, human vision’s horizontal FoV can be as large as 160° for monocular vision and 120° for overlapped binocular vision 6 . The intersection area of ray bundles forms the exit pupil, which is usually correlated with another parameter called eyebox. The eyebox defines the region within which the whole image FoV can be viewed without vignetting. It therefore generally manifests a 3D geometry 7 , whose volume is strongly dependent on the exit pupil size. A larger eyebox offers more tolerance to accommodate the user’s diversified interpupillary distance (IPD) and wiggling of the headset when in use. Angular resolution is defined by dividing the total resolution of the display panel by the FoV, and it measures the sharpness of a perceived image. For reference, a human visual acuity of 20/20 amounts to 1 arcmin angular resolution, or 60 pixels per degree (PPD), which is considered a common goal for AR and VR displays. Another important feature of a 3D display is the depth cue. A depth cue can be induced by displaying two separate images to the left eye and the right eye, which forms the vergence cue. But the fixed depth of the displayed image often mismatches the actual depth of the intended 3D image, which leads to incorrect accommodation cues. This mismatch causes the so-called vergence-accommodation conflict (VAC), which will be discussed in detail later. One important observation is that the VAC issue may be more serious in AR than in VR, because the image in an AR display is directly superimposed onto the real world, which has correct depth cues. The image contrast depends on the display panel and stray light. To achieve a high dynamic range, the display panel should exhibit high brightness, a low dark level, and more than 10 bits of gray levels. Nowadays, the display brightness of a typical VR headset is about 150–200 cd/m² (or nits).

Fig. 2. a Schematic of a VR display defining FoV, exit pupil, eyebox, angular resolution, and accommodation cue mismatch. b Sketch of an AR display illustrating ACR.

Figure 2b depicts a generic structure of an AR display. The definitions of the above parameters remain the same. One major difference is the influence of ambient light on the image contrast. For a see-through AR display, ambient contrast ratio (ACR) 8 is commonly used to quantify the image contrast:

$$\mathrm{ACR} = \frac{L_{on} + L_{am} \cdot T}{L_{off} + L_{am} \cdot T}$$

where L_on (L_off) represents the on (off)-state luminance (unit: nit), L_am is the ambient luminance, and T is the see-through transmittance. In general, ambient light is measured in illuminance (lux). For the convenience of comparison, we convert illuminance to luminance by dividing by a factor of π, assuming the emission profile is Lambertian. In a normal living room, the illuminance is about 100 lux (i.e., L_am ≈ 30 nits), while in a typical office lighting condition, L_am ≈ 150 nits. For outdoors, L_am ≈ 300 nits on an overcast day and L_am ≈ 3000 nits on a sunny day. For AR displays, the minimum ACR should be 3:1 for recognizable images, 5:1 for adequate readability, and ≥10:1 for outstanding readability. As a simple estimate that ignores all optical losses, achieving ACR = 10:1 on a sunny day (~3000 nits) requires the display to deliver a brightness of at least 30,000 nits. This imposes big challenges in finding a high-brightness microdisplay and designing a low-loss optical combiner.
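The sunny-day estimate can be checked with a short calculation. The sketch below is illustrative only: it takes the commonly used definition ACR = (L_on + L_am·T)/(L_off + L_am·T), and the function names, the ideal transmittance T = 1, and the perfectly dark off-state (L_off = 0) are assumptions made for this example rather than values from the article.

```python
import math


def lux_to_nits(illuminance_lux):
    """Convert illuminance (lux) to luminance (nits), assuming a Lambertian profile."""
    return illuminance_lux / math.pi


def ambient_contrast_ratio(l_on, l_off, l_am, transmittance):
    """ACR = (L_on + L_am*T) / (L_off + L_am*T); all luminances in nits."""
    background = l_am * transmittance
    return (l_on + background) / (l_off + background)


def required_on_luminance(target_acr, l_off, l_am, transmittance):
    """Solve the ACR expression for the on-state luminance L_on."""
    background = l_am * transmittance
    return target_acr * (l_off + background) - background


# Assumed idealized case: T = 1 and L_off = 0, sunny day with L_am = 3000 nits.
needed = required_on_luminance(target_acr=10, l_off=0.0, l_am=3000.0, transmittance=1.0)
print(f"L_on needed for ACR 10:1 at 3000 nits: {needed:.0f} nits")
print(f"100 lux is about {lux_to_nits(100):.1f} nits")
```

Under these idealized assumptions the required on-state luminance comes out at 27,000 nits, the same order as the roughly 30,000 nits quoted in the text. Any combiner loss or a lower transmittance would change the number.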

Tradeoffs and potential solutions

Next, let us briefly review the tradeoff relations mentioned earlier. To begin with, a larger FoV leads to a lower angular resolution for a given display resolution. In theory, overcoming this tradeoff only requires a high-resolution display source, along with high-quality optics to support the corresponding modulation transfer function (MTF). Attaining 60 PPD across a 100° FoV requires 6K resolution for each eye. This may be realizable in VR headsets because a large display panel, say 2–3 inches, can still accommodate a high resolution at an acceptable manufacturing cost. However, for a glasses-like wearable AR display, the conflict between small display size and high resolution becomes obvious, as further shrinking the pixel size of a microdisplay is challenging.
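The 6K figure follows directly from the definition of angular resolution as pixel count divided by FoV. A minimal sketch, assuming pixels are spread uniformly over the FoV (lens distortion is ignored) and using hypothetical function names:

```python
def required_pixels(ppd, fov_deg):
    """Pixels needed along one direction for a target pixels-per-degree over a FoV."""
    return ppd * fov_deg


def angular_resolution_ppd(pixels, fov_deg):
    """Pixels per degree delivered by a panel with `pixels` along one direction."""
    return pixels / fov_deg


print(required_pixels(60, 100))           # pixel count for 60 PPD over 100 degrees
print(angular_resolution_ppd(2000, 100))  # PPD of a 2K panel spread over 100 degrees
```

The second call shows the other side of the tradeoff: a 2K panel stretched over the same 100° delivers only 20 PPD.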

To circumvent this issue, the concept of the foveated display has been proposed 9 , 10 , 11 , 12 , 13 . The idea is based on the fact that the human eye only has high visual acuity in the central fovea region, which accounts for about 10° of the FoV. If the high-resolution image is only projected to the fovea while the peripheral image remains low resolution, then a microdisplay with 2K resolution can satisfy the need. Regarding the implementation of a foveated display, a straightforward way is to optically combine two display sources 9 , 10 , 11 : one for the foveal and one for the peripheral FoV. This approach can be regarded as spatial multiplexing of displays. Alternatively, time-multiplexing can also be adopted, by temporally changing the optical path to produce different magnification factors for the corresponding FoV 12 . Finally, another approach without multiplexing is to use a specially designed lens with intended distortion to achieve non-uniform resolution density 13 . Aside from the implementation of foveation, another great challenge is to dynamically steer the foveated region as the viewer’s eye moves. This task is strongly related to pupil steering, which will be discussed in detail later.

A larger eyebox or FoV usually decreases the image brightness, which often lowers the ACR. This is exactly the case for a waveguide AR system with exit pupil expansion (EPE) while operating under a strong ambient light. To improve ACR, one approach is to dynamically adjust the transmittance with a tunable dimmer 14 , 15 . Another solution is to directly boost the image brightness with a high luminance microdisplay and an efficient combiner optics. Details of this topic will be discussed in the light engine section.

Another tradeoff between FoV and eyebox in geometric optical systems results from the conservation of etendue (or optical invariant). Increasing the system etendue requires larger optics, which in turn compromises the form factor. Finally, to address the VAC issue, the display system needs to generate a proper accommodation cue, which often requires the modulation of image depth or wavefront, neither of which can be easily achieved in a traditional geometric optical system. While remarkable progress has been made in adopting freeform surfaces 16 , 17 , 18 , further advancing AR and VR systems requires additional novel optics with a higher degree of freedom in structure design and light modulation. Moreover, the employed optics should be thin and lightweight. To mitigate the above-mentioned challenges, diffractive optics is a strong contender. Unlike geometric optics, which relies on curved surfaces to refract or reflect light, diffractive optics only requires a thin layer of several micrometers to establish efficient light diffraction. Two major types of diffractive optics are HOEs based on wavefront recording and manually written devices like surface relief gratings (SRGs) based on lithography. While SRGs have large design freedom in local grating geometry, a recent publication 19 indicates that the combination of HOEs and freeform optics can also offer great potential for arbitrary wavefront generation. Furthermore, advances in lithography have also enabled optical metasurfaces beyond diffractive and refractive optics, as well as miniature display panels like micro-LED (light-emitting diode) displays. These devices hold the potential to boost the performance of current AR/VR displays while keeping a lightweight and compact form factor.

Formation and properties of HOEs

HOE generally refers to a recorded hologram that reproduces the original light wavefront. The concept of holography was proposed by Dennis Gabor 20 , and refers to the process of recording a wavefront in a medium (hologram) and later reconstructing it with a reference beam. Early holography used intensity-sensitive recording materials like silver halide emulsion, dichromated gelatin, and photopolymer 21 . Among them, photopolymer stands out due to its easy fabrication and ability to capture high-fidelity patterns 22 , 23 . It has therefore found extensive applications like holographic data storage 23 and display 24 , 25 . Photopolymer HOEs (PPHOEs) have a relatively small refractive index modulation and therefore exhibit a strong selectivity on the wavelength and incident angle. Another feature of PPHOEs is that several holograms can be recorded into a photopolymer film by consecutive exposures. Later, liquid-crystal holographic optical elements (LCHOEs) based on photoalignment polarization holography were also developed 25 , 26 . Due to the inherent anisotropic property of liquid crystals, LCHOEs are extremely sensitive to the polarization state of the input light. This feature, combined with the polarization modulation ability of liquid crystal devices, offers a new possibility for dynamic wavefront modulation in display systems.

The formation of PPHOE is illustrated in Fig. 3a. When exposed to an interfering field with high-and-low intensity fringes, monomers tend to move toward bright fringes due to the higher local monomer-consumption rate. As a result, the density and refractive index are slightly larger in bright regions. Note the index modulation δ n here is defined as the difference between the maximum and minimum refractive indices, which may be twice the value in other definitions 27 . The index modulation δ n is typically in the range of 0–0.06. To understand the optical properties of PPHOE, we simulate a transmissive grating and a reflective grating using rigorous coupled-wave analysis (RCWA) 28 , 29 and plot the results in Fig. 3b. Details of the grating configuration can be found in Table S1. Only gratings are simulated here because, for a general HOE, each local region can be treated as a grating. The observations on gratings can therefore offer general insight into HOEs. For a transmissive grating, its angular bandwidth (efficiency > 80%) is around 5° ( λ  = 550 nm), while the spectral band is relatively broad, with bandwidth around 175 nm (7° incidence). For a reflective grating, its spectral band is narrow, with bandwidth around 10 nm. The angular bandwidth varies with the wavelength, ranging from 2° to 20°. The strong selectivity of PPHOE on wavelength and incident angle is directly related to its small δ n , which can be adjusted by controlling the exposure dosage.
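The link between a small δn and the film thickness needed for high efficiency can be illustrated with Kogelnik's two-wave coupled-wave theory, a simpler analytic approximation that is swapped in here and is not the RCWA used for Fig. 3b. The sketch assumes a lossless, unslanted grating probed exactly at Bragg incidence, with theta_deg taken as the propagation angle inside the medium. Because δn is defined in the text as maximum minus minimum index, the sinusoidal modulation amplitude used by Kogelnik is δn/2. The value δn = 0.03 and the function names are illustrative assumptions, not entries from Table S1.

```python
import math


def _coupling_strength(delta_n, thickness_um, wavelength_um, theta_deg):
    """Kogelnik coupling parameter nu = pi * n1 * d / (lambda * cos(theta)), n1 = delta_n / 2."""
    n1 = delta_n / 2.0  # amplitude of the sinusoidal index modulation
    return math.pi * n1 * thickness_um / (wavelength_um * math.cos(math.radians(theta_deg)))


def transmission_efficiency(delta_n, thickness_um, wavelength_um, theta_deg=0.0):
    """On-Bragg efficiency of a lossless unslanted transmission grating: sin^2(nu)."""
    return math.sin(_coupling_strength(delta_n, thickness_um, wavelength_um, theta_deg)) ** 2


def reflection_efficiency(delta_n, thickness_um, wavelength_um, theta_deg=0.0):
    """On-Bragg efficiency of a lossless unslanted reflection grating: tanh^2(nu)."""
    return math.tanh(_coupling_strength(delta_n, thickness_um, wavelength_um, theta_deg)) ** 2


def thickness_for_full_transmission(delta_n, wavelength_um, theta_deg=0.0):
    """Smallest thickness giving nu = pi/2, i.e. 100 % transmission-grating efficiency."""
    return wavelength_um * math.cos(math.radians(theta_deg)) / delta_n


d = thickness_for_full_transmission(delta_n=0.03, wavelength_um=0.55)
print(f"thickness for full efficiency: {d:.1f} um")
print(f"transmission efficiency: {transmission_efficiency(0.03, d, 0.55):.3f}")
print(f"reflection efficiency at same thickness: {reflection_efficiency(0.03, d, 0.55):.3f}")
```

With a modulation this weak, the film has to be many wavelengths thick before the efficiency saturates, which is consistent with the narrow angular and spectral response described above. The bandwidths themselves require the off-Bragg terms or a full RCWA calculation and are not modeled here.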

Fig. 3. a Schematic of the formation of PPHOE. Simulated efficiency plots for b1 transmissive and b2 reflective PPHOEs. c Working principle of multiplexed PPHOE. d Formation and molecular configurations of LCHOEs. Simulated efficiency plots for e1 transmissive and e2 reflective LCHOEs. f Illustration of polarization dependency of LCHOEs.

A distinctive feature of PPHOE is the ability to multiplex several holograms into one film sample. If the exposure dosage of a recording process is controlled so that the monomers are not completely depleted in the first exposure, the remaining monomers can continue to form another hologram in the following recording process. Because the total amount of monomer is fixed, there is usually an efficiency tradeoff between multiplexed holograms. The final film sample would exhibit the wavefront modulation functions of multiple holograms (Fig. 3c ).

Liquid crystals have also been used to form HOEs. LCHOEs can generally be categorized into volume-recording type and surface-alignment type. Volume-recording type LCHOEs are either based on early polarization holography recordings with azo-polymer 30 , 31 , or holographic polymer-dispersed liquid crystals (HPDLCs) 32 , 33 formed by liquid-crystal-doped photopolymer. Surface-alignment type LCHOEs are based on photoalignment polarization holography (PAPH) 34 . The first step is to record the desired polarization pattern in a thin photoalignment layer, and the second step is to use it to align the bulk liquid crystal 25 , 35 . Due to the simple fabrication process, high efficiency, and low scattering from liquid crystal’s self-assembly nature, surface-alignment type LCHOEs based on PAPH have recently attracted increasing interest in applications like near-eye displays. Here, we shall focus on this type of surface-alignment LCHOE and refer to it as LCHOE hereafter for simplicity.

The formation of LCHOEs is illustrated in Fig. 3d. The information of the wavefront and the local diffraction pattern is recorded in a thin photoalignment layer. The bulk liquid crystal deposited on the photoalignment layer forms a transmissive or a reflective LCHOE, depending on whether it is a nematic liquid crystal or a cholesteric liquid crystal (CLC). In a transmissive LCHOE, the bulk nematic liquid crystal molecules generally follow the pattern of the bottom alignment layer. The smallest allowable pattern period is governed by the liquid crystal distortion free-energy model, which predicts that the pattern period should generally be larger than the sample thickness 36 , 37 . This results in a maximum diffraction angle under 20°. On the other hand, in a reflective LCHOE 38 , 39 , the bulk CLC molecules form a stable helical structure, which is tilted to match the k -vector of the bottom pattern. The structure exhibits a very low distortion free energy 40 , 41 and can accommodate a pattern period that is small enough to diffract light into total internal reflection (TIR) in a glass substrate.
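A rough estimate shows how the period-versus-thickness rule leads to a diffraction angle below 20°. The sketch below rests on several assumptions that are not stated in the text: the transmissive LCHOE is taken to be a half-wave film (thickness d = λ/(2Δn)), the period is set equal to that thickness as the limiting case, light arrives at normal incidence from air, and the first-order grating equation sin θ = λ/Λ applies. The birefringence Δn = 0.15, the wavelength of 550 nm, the glass index of 1.5, and all function names are illustrative.

```python
import math


def half_wave_thickness_um(wavelength_um, birefringence):
    """Film thickness giving half-wave retardation: d = lambda / (2 * delta_n)."""
    return wavelength_um / (2.0 * birefringence)


def first_order_angle_deg(wavelength_um, period_um, n_out=1.0):
    """First-order diffraction angle at normal incidence: n_out * sin(theta) = lambda / period."""
    s = wavelength_um / (n_out * period_um)
    if s > 1.0:
        raise ValueError("first order is evanescent for this period")
    return math.degrees(math.asin(s))


def tir_period_window_um(wavelength_um, n_glass):
    """Period range that couples normally incident light into TIR inside a glass slab.

    Propagating in glass requires period > lambda / n_glass;
    exceeding the glass-air critical angle requires period < lambda.
    """
    return wavelength_um / n_glass, wavelength_um


d = half_wave_thickness_um(0.55, 0.15)
print(f"half-wave thickness: {d:.2f} um")
print(f"diffraction angle with period = thickness: {first_order_angle_deg(0.55, d):.1f} deg")
print("TIR period window (um):", tir_period_window_um(0.55, 1.5))
```

Under these assumptions the limiting angle is about 17.5°, in line with the "under 20°" stated above, while coupling into TIR needs a sub-wavelength period of the kind that the reflective CLC structure can accommodate.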

The diffraction property of LCHOEs is shown in Fig. 3e . The maximum refractive index modulation of LCHOE is equal to the liquid crystal birefringence (Δ n ), which may vary from 0.04 to 0.5, depending on the molecular conjugation 42 , 43 . The birefringence used in our simulation is Δ n  = 0.15. Compared to PPHOEs, the angular and spectral bandwidths are significantly larger for both transmissive and reflective LCHOEs. For a transmissive LCHOE, its angular bandwidth is around 20° ( λ  = 550 nm), while the spectral bandwidth is around 300 nm (7° incidence). For a reflective LCHOE, its spectral bandwidth is around 80 nm and angular bandwidth could vary from 15° to 50°, depending on the wavelength.

The anisotropic nature of liquid crystal leads to LCHOE’s unique polarization-dependent response to an incident light. As depicted in Fig. 3f , for a transmissive LCHOE the accumulated phase is opposite for the conjugated left-handed circular polarization (LCP) and right-handed circular polarization (RCP) states, leading to reversed diffraction directions. For a reflective LCHOE, the polarization dependency is similar to that of a normal CLC. For the circular polarization with the same handedness as the helical structure of CLC, the diffraction is strong. For the opposite circular polarization, the diffraction is negligible.
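The opposite phase seen by the two circular polarizations can be verified with Jones calculus. The sketch below models a local patch of a transmissive LCHOE as an ideal half-wave plate whose optic axis is rotated by an angle phi, which is an idealization assumed for this example. The two circular states are written as (1, ±i)/√2; which one is labeled LCP or RCP depends on the sign convention, so the code refers to them only by the sign. Function names are hypothetical.

```python
import cmath
import math


def half_wave_jones(phi):
    """Jones matrix of an ideal half-wave plate with its axis at angle phi (global phase dropped)."""
    c, s = math.cos(2 * phi), math.sin(2 * phi)
    return ((c, s), (s, -c))


def apply(jones, vec):
    """Multiply a 2x2 Jones matrix by a 2-component Jones vector."""
    (a, b), (c, d) = jones
    x, y = vec
    return (a * x + b * y, c * x + d * y)


def circular_state(sign):
    """Circular polarization (1, sign * i) / sqrt(2), with sign = +1 or -1."""
    return (1 / math.sqrt(2), sign * 1j / math.sqrt(2))


def geometric_phase(phi, sign):
    """Phase picked up by a circular input; the output is the opposite circular state."""
    out = apply(half_wave_jones(phi), circular_state(sign))
    return cmath.phase(out[0])


phi = math.pi / 8
print(geometric_phase(phi, +1))  # +2 * phi
print(geometric_phase(phi, -1))  # -2 * phi
```

Because the acquired phase is +2φ for one handedness and −2φ for the other, a linearly varying axis angle φ(x) produces phase ramps of opposite slope, which is why the two circular states are diffracted in opposite directions.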

Another distinctive property of liquid crystal is its dynamic response to an external voltage. The LC reorientation can be controlled with a relatively low voltage (<10 V_rms) and the response time is on the order of milliseconds, depending mainly on the LC viscosity and layer thickness. Methods to dynamically control LCHOEs can be categorized as active addressing and passive addressing, which can be achieved by either directly switching the LCHOE or modulating the polarization state with an active waveplate. Detailed addressing methods will be described in the VAC section.

Lithography-enabled devices

Lithography technologies are used to create arbitrary patterns on wafers, which lays the foundation of the modern integrated circuit industry 44 . Photolithography is suitable for mass production while electron/ion beam lithography is usually used to create photomask for photolithography or to write structures with nanometer-scale feature size. Recent advances in lithography have enabled engineered structures like optical metasurfaces 45 , SRGs 46 , as well as micro-LED displays 47 . Metasurfaces exhibit a remarkable design freedom by varying the shape of meta-atoms, which can be utilized to achieve novel functions like achromatic focus 48 and beam steering 49 . Similarly, SRGs also offer a large design freedom by manipulating the geometry of local grating regions to realize desired optical properties. On the other hand, micro-LED exhibits several unique features, such as ultrahigh peak brightness, small aperture ratio, excellent stability, and nanosecond response time, etc. As a result, micro-LED is a promising candidate for AR and VR systems for achieving high ACR and high frame rate for suppressing motion image blurs. In the following section, we will briefly review the fabrication and properties of micro-LEDs and optical modulators like metasurfaces and SRGs.

Fabrication and properties of micro-LEDs

LEDs with a chip size larger than 300 μm have been widely used in solid-state lighting and public information displays. Recently, micro-LEDs with chip sizes <5 μm have been demonstrated 50 . The first micro-LED disc with a diameter of about 12 µm was demonstrated in 2000 51 . After that, a single color (blue or green) LED microdisplay was demonstrated in 2012 52 . The high peak brightness, fast response time, true dark state, and long lifetime of micro-LEDs are attractive for display applications. Therefore, many companies have since released their micro-LED prototypes or products, ranging from large-size TVs to small-size microdisplays for AR/VR applications 53 , 54 . Here, we focus on micro-LEDs for near-eye display applications. Regarding the fabrication of micro-LEDs, the metal-organic chemical vapor deposition (MOCVD) method is used to grow the AlGaInP epitaxial layer on a GaAs substrate for red LEDs, and GaN epitaxial layers on a sapphire substrate for green and blue LEDs. Next, a photolithography process is applied to define the mesa and deposit electrodes. To drive the LED array, the fabricated micro-LEDs are transferred to a CMOS (complementary metal oxide semiconductor) driver board. For a small-size (<2 inches) microdisplay used in AR or VR, the precision of the pick-and-place transfer process can hardly meet the high-resolution-density (>1000 pixels per inch) requirement. Thus, the main approach to assemble LED chips with driving circuits is flip-chip bonding 50 , 55 , 56 , 57 , as Fig. 4a depicts. In flip-chip bonding, the mesa and electrode pads should be defined and deposited before the transfer process, while metal bonding balls should be preprocessed on the CMOS substrate. After that, a thermal-compression method is used to bond the two wafers together. However, due to the thermal mismatch between the LED chip and the driving board, the misalignment between the LED chip and the metal bonding ball on the CMOS substrate becomes serious as the pixel size decreases. In addition, the common n-GaN layer may cause optical crosstalk between pixels, which degrades the image quality. To overcome these issues, the LED epitaxial layer can first be metal-bonded to the silicon driver board, followed by the photolithography process to define the LED mesas and electrodes. Without the need for an alignment process, the pixel size can be reduced to <5 µm 50 .

Fig. 4. a Illustration of flip-chip bonding technology. b Simulated IQE-LED size relations for red and blue LEDs based on ABC model. c Comparison of EQE of different LED sizes with and without KOH and ALD side wall treatment. d Angular emission profiles of LEDs with different sizes. Metasurfaces based on e resonance-tuning, f non-resonance tuning and g combination of both. h Replication master and i replicated SRG based on nanoimprint lithography. Reproduced from a ref. 55 with permission from AIP Publishing, b ref. 61 with permission from PNAS, c ref. 66 with permission from IOP Publishing, d ref. 67 with permission from AIP Publishing, e ref. 69 with permission from OSA Publishing, f ref. 48 with permission from AAAS, g ref. 70 with permission from AAAS, and h, i ref. 85 with permission from OSA Publishing.

In addition to the manufacturing process, the electrical and optical characteristics of an LED also depend on the chip size. Generally, due to Shockley-Read-Hall (SRH) non-radiative recombination on the sidewall of the active area, a smaller LED chip size results in a lower internal quantum efficiency (IQE), and the peak IQE driving point moves toward a higher current density because of the increased ratio of sidewall surface to active volume 58 , 59 , 60 . In addition, compared to the GaN-based green and blue LEDs, the AlGaInP-based red LEDs, with a larger surface recombination and carrier diffusion length, suffer a more severe efficiency drop 61 , 62 . Figure 4b shows the simulated IQE drop as a function of the LED chip size for blue and red LEDs based on the ABC model 63 . To alleviate the efficiency drop caused by sidewall defects, depositing passivation materials by atomic layer deposition (ALD) or plasma enhanced chemical vapor deposition (PECVD) has proven helpful for both GaN and AlGaInP based LEDs 64 , 65 . In addition, applying potassium hydroxide (KOH) treatment after ALD can further reduce the EQE drop of micro-LEDs 66 (Fig. 4c). Small-size LEDs also exhibit some advantages, such as higher light extraction efficiency (LEE). Compared to a 100-µm LED, the LEE of a 2-µm LED increases from 12.2 to 25.1% 67 . Moreover, the radiation pattern of a micro-LED is more directional than that of a large-size LED (Fig. 4d). This helps to improve the lens collection efficiency in AR/VR display systems.
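The size dependence described above can be reproduced qualitatively with the ABC model, in which IQE = Bn²/(An + Bn² + Cn³) for carrier density n. The sketch below is not the simulation behind Fig. 4b. It assumes a square mesa whose SRH coefficient grows with the perimeter-to-area ratio as A = A_bulk + 4·v_s/L, a common simplification, together with perfect carrier injection. All coefficient values, the active-layer thickness, and the function names are hypothetical and chosen only to show the trend.

```python
import math

Q = 1.602176634e-19  # elementary charge in coulombs


def srh_coefficient(a_bulk, surface_velocity_cm_s, side_cm):
    """Effective SRH coefficient of a square mesa: bulk term plus sidewall term 4 * v_s / L."""
    return a_bulk + 4.0 * surface_velocity_cm_s / side_cm


def iqe(n, a, b, c):
    """ABC-model internal quantum efficiency at carrier density n (cm^-3)."""
    return b * n * n / (a * n + b * n * n + c * n ** 3)


def peak_carrier_density(a, c):
    """Carrier density maximizing the IQE: n* = sqrt(A / C)."""
    return math.sqrt(a / c)


def peak_iqe(a, b, c):
    """Maximum IQE of the ABC model: B / (B + 2 * sqrt(A * C))."""
    return b / (b + 2.0 * math.sqrt(a * c))


def current_density(n, a, b, c, active_thickness_cm):
    """Current density (A/cm^2) that sustains carrier density n, assuming ideal injection."""
    return Q * active_thickness_cm * (a * n + b * n * n + c * n ** 3)


# Hypothetical coefficients for illustration only.
A_BULK, B, C = 1e7, 1e-11, 1e-30  # s^-1, cm^3/s, cm^6/s
V_S, THICKNESS = 1e4, 3e-7        # cm/s, cm

for side_um in (100.0, 2.0):
    a = srh_coefficient(A_BULK, V_S, side_um * 1e-4)
    n_star = peak_carrier_density(a, C)
    j_star = current_density(n_star, a, B, C, THICKNESS)
    print(f"{side_um:5.0f} um: peak IQE = {peak_iqe(a, B, C):.2f} at J = {j_star:.1f} A/cm^2")
```

With these placeholder numbers the 2-µm chip shows both a lower peak IQE and a peak located at a higher current density than the 100-µm chip, matching the two trends stated in the text.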

Metasurfaces and SRGs

Thanks to the advances in lithography technology, low-loss dielectric metasurfaces working in the visible band have recently emerged as a platform for wavefront shaping 45 , 48 , 68 . They consist of an array of subwavelength-spaced structures with individually engineered wavelength-dependent polarization/phase/amplitude response. In general, the light modulation mechanisms can be classified into resonant tuning 69 (Fig. 4e), non-resonant tuning 48 (Fig. 4f), and a combination of both 70 (Fig. 4g). In comparison with non-resonant tuning (based on geometric phase and/or dynamic propagation phase), resonant tuning (such as Fabry–Pérot resonance, Mie resonance, etc.) is usually associated with a narrower operating bandwidth and a smaller out-of-plane aspect ratio (height/width) of nanostructures. As a result, resonant-tuning structures are easier to fabricate but more sensitive to fabrication tolerances. For both types, materials with a higher refractive index and lower absorption loss are beneficial for reducing the aspect ratio of the nanostructures and improving the device efficiency. To this end, titanium dioxide (TiO₂) and gallium nitride (GaN) are the major choices for operating in the entire visible band 68 , 71 . While small-sized metasurfaces (diameter <1 mm) are usually fabricated via electron-beam lithography or focused ion beam milling in the lab, the ability to mass-produce them is the key to their practical adoption. Deep ultraviolet (UV) photolithography has proven feasible for reproducing centimeter-size metalenses with decent imaging performance, although it requires multiple etching steps 72 . Interestingly, the recently developed UV nanoimprint lithography based on a high-index nanocomposite takes only a single step and can obtain an aspect ratio larger than 10, which shows great promise for high-volume production 73 .

The arbitrary wavefront shaping capability and the thinness of the metasurfaces have aroused strong research interests in the development of novel AR/VR prototypes with improved performance. Lee et al. employed nanoimprint lithography to fabricate a centimeter-size, geometric-phase metalens eyepiece for full-color AR displays 74 . Through tailoring its polarization conversion efficiency and stacking with a circular polarizer, the virtual image can be superimposed with the surrounding scene. The large numerical aperture (NA~0.5) of the metalens eyepiece enables a wide FoV (>76°) that conventional optics are difficult to obtain. However, the geometric phase metalens is intrinsically a diffractive lens that also suffers from strong chromatic aberrations. To overcome this issue, an achromatic lens can be designed via simultaneously engineering the group delay and the group delay dispersion 75 , 76 , which will be described in detail later. Other novel and/or improved near-eye display architectures include metasurface-based contact lens-type AR 77 , achromatic metalens array enabled integral-imaging light field displays 78 , wide FoV lightguide AR with polarization-dependent metagratings 79 , and off-axis projection-type AR with an aberration-corrected metasurface combiner 80 , 81 , 82 . Nevertheless, from the existing AR/VR prototypes, metasurfaces still face a strong tradeoff between numerical aperture (for metalenses), chromatic aberration, monochromatic aberration, efficiency, aperture size, and fabrication complexity.

On the other hand, SRGs are diffractive gratings that have been researched for decades as input/output couplers of waveguides 83 , 84 . Their surface is composed of corrugated microstructures, and different shapes including binary, blazed, slanted, and even analogue can be designed. The parameters of the corrugated microstructures are determined by the target diffraction order, operation spectral bandwidth, and angular bandwidth. Compared to metasurfaces, SRGs have a much larger feature size and thus can be fabricated via UV photolithography and subsequent etching. They are usually replicated by nanoimprint lithography with appropriate heating and surface treatment. According to a report published a decade ago, SRGs with a height of 300 nm and a slant angle of up to 50° can be faithfully replicated with high yield and reproducibility 85 (Fig. 4h, i).

Challenges and solutions of VR displays

The fully immersive nature of VR headsets leads to a relatively fixed configuration in which the display panel is placed in front of the viewer’s eye with imaging optics in between. Regarding system performance, although some current VR headsets still have inadequate angular resolution, improvements in display panel resolution from advanced fabrication processes are expected to solve this issue progressively. Therefore, in the following discussion, we mainly focus on two major challenges: form factor and 3D cue generation.

Form factor

Compact and lightweight near-eye displays are essential for a comfortable user experience and are therefore highly desirable in VR headsets. Current mainstream VR headsets usually have a considerably larger volume than eyeglasses, and most of that volume is empty. This is because a certain distance is required between the display panel and the viewing optics, which is usually close to the focal length of the lens system, as illustrated in Fig. 5a . Conventional VR headsets employ a transmissive lens with ~4 cm focal length to offer a large FoV and eyebox. Fresnel lenses are thinner than conventional ones, but the distance required between the lens and the panel does not change significantly. In addition, the diffraction artifacts and stray light caused by the Fresnel grooves can degrade the image quality, as measured by the MTF. Although the resolution density of current VR headsets, quantified in pixels per inch (PPI), is still limited, the Fresnel lens will eventually cease to be an ideal solution once high-PPI displays become available. The strong chromatic aberration of a Fresnel singlet should also be compensated if a high-quality imaging system is desired.

figure 5

a Schematic of a basic VR optical configuration. b Achromatic metalens used as VR eyepiece. c VR based on curved display and lenslet array. d Basic working principle of a VR display based on pancake optics. e VR with pancake optics and Fresnel lens array. f VR with pancake optics based on purely HOEs. Reprinted from b ref. 87 under the Creative Commons Attribution 4.0 License. Adapted from c ref. 88 with permission from IEEE, e ref. 91 and f ref. 92 under the Creative Commons Attribution 4.0 License

It is tempting to replace the refractive elements with a single thin diffractive lens like a transmissive LCHOE. However, the diffractive nature of such a lens will result in serious color aberrations. Interestingly, metalenses can fulfil this objective without color issues. To understand how metalenses achieve achromatic focus, let us first take a glance at the general lens phase profile \(\Phi (\omega ,r)\) expanded as a Taylor series 75 :

where \(\varphi _0(\omega )\) is the phase at the lens center, \(F\left( \omega \right)\) is the focal length as a function of frequency ω , r is the radial coordinate, and \(\omega _0\) is the central operation frequency. To realize achromatic focus, \(\partial F{{{\mathrm{/}}}}\partial \omega\) should be zero. With a designed focal length, the group delay \(\partial \Phi (\omega ,r){{{\mathrm{/}}}}\partial \omega\) and the group delay dispersion \(\partial ^2\Phi (\omega ,r){{{\mathrm{/}}}}\partial \omega ^2\) can be determined, and \(\varphi _0(\omega )\) is an auxiliary degree of freedom of the phase profile design. In the design of an achromatic metalens, the group delay is a function of the radial coordinate and monotonically increases with the metalens radius. Many designs have proven that the group delay has a limited variation range 75 , 76 , 78 , 86 . According to Shrestha et al. 86 , there is an inevitable tradeoff between the maximum radius of the metalens, NA, and operation bandwidth. Thus, the reported achromatic metalenses at visible usually have limited lens aperture (e.g., diameter < 250 μm) and NA (e.g., <0.2). Such a tradeoff is undesirable in VR displays, as the eyepiece favors a large clear aperture (inch size) and a reasonably high NA (>0.3) to maintain a wide FoV and a reasonable eye relief 74 .
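For reference, a phase profile consistent with the symbols defined above can be sketched as follows. This is an assumed form rather than a quotation: it takes the standard hyperbolic focusing phase, introduces \(c\) (the speed of light in vacuum, which the text does not name), and adopts one particular sign convention.

\[
\Phi(\omega, r) = \varphi_0(\omega) - \frac{\omega}{c}\left(\sqrt{r^2 + F(\omega)^2} - F(\omega)\right)
\]

\[
\Phi(\omega, r) \approx \Phi(\omega_0, r) + \left.\frac{\partial \Phi(\omega, r)}{\partial \omega}\right|_{\omega_0}(\omega - \omega_0) + \frac{1}{2}\left.\frac{\partial^2 \Phi(\omega, r)}{\partial \omega^2}\right|_{\omega_0}(\omega - \omega_0)^2 + \cdots
\]

Under this assumed form, setting \(\partial F/\partial \omega = 0\) leaves an \(r\)-dependent part of the first-order coefficient equal to \(-\left(\sqrt{r^2 + F^2} - F\right)/c\), whose magnitude grows with \(r\). This is why the group delay range that the nanostructures must cover increases with the lens radius.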

To overcome this limitation, Li et al. 87 proposed a novel zone lens method. Unlike the traditional phase Fresnel lens, where the zones are determined by the phase reset, the new approach divides the zones by the group delay reset. In this way, the lens aperture and NA can be greatly enlarged, and the group delay limit is bypassed. A notable side effect of this design is the phase discontinuity at zone boundaries, which contributes to higher-order focusing. Therefore, considerable effort has been devoted to finding the optimal zone transition locations and minimizing the phase discontinuities. Using this method, they demonstrated a 2-mm-diameter metalens with NA = 0.7 and nearly diffraction-limited focusing at the designed wavelengths (488, 532, 658 nm) (Fig. 5b ). This metalens consists of 681 zones and works across the visible band from 470 to 670 nm, though the focusing efficiency is on the order of 10%. This is a promising starting point for employing achromatic metalenses as compact, chromatic-aberration-free eyepieces in near-eye displays. The remaining challenges are to further increase the aperture size, correct the off-axis aberrations, and improve the optical efficiency.

Besides replacing the refractive lens with an achromatic metalens, another way to reduce system focal length without decreasing NA is to use a lenslet array 88 . As depicted in Fig. 5c , both the lenslet array and display panel adopt a curved structure. With the latest flexible OLED panel, the display can be easily curved in one dimension. The system exhibits a large diagonal FoV of 180° with an eyebox of 19 by 12 mm. The geometry of each lenslet is optimized separately to achieve an overall performance with high image quality and reduced distortions.

Aside from shortening the system focal length, another way to reduce the total track is to fold the optical path. Recently, polarization-based folded lenses, also known as pancake optics, have been under active development for VR applications 89 , 90 . Figure 5d depicts the structure of an exemplary singlet pancake VR lens system. Pancake lenses can offer better imaging performance in a compact form factor because there are more degrees of freedom in the design and the actual light path is folded thrice. By using a reflective surface with positive power, the field curvature of positive refractive lenses can be compensated. Also, the reflective surface has no chromatic aberration and contributes considerable optical power to the system. Therefore, the optical power of the refractive lenses can be smaller, resulting in even weaker chromatic aberration. Compared to Fresnel lenses, pancake lenses have smooth surfaces and produce far fewer diffraction artifacts and much less stray light. However, the pancake lens design is not perfect either; its major shortcoming is low light efficiency. With two incidences of light on the half mirror, the maximum system efficiency is limited to 25% for a polarized input and 12.5% for an unpolarized input. Moreover, because the system contains multiple surfaces, stray light caused by surface reflections and polarization leakage may lead to visible ghost images. As a result, a catadioptric pancake VR headset usually exhibits darker imagery and lower contrast than the corresponding dioptric VR headset.
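The 25% and 12.5% ceilings follow from simple bookkeeping, sketched below in Python. The function name and the idealization (lossless polarizers, waveplates, and reflective polarizer, so that only the half mirror and the input polarization step cost light) are assumptions made for illustration.

```python
def pancake_max_efficiency(polarized_input: bool,
                           mirror_transmittance: float = 0.5) -> float:
    """Upper bound on pancake-lens throughput with ideal, lossless components."""
    mirror_reflectance = 1.0 - mirror_transmittance
    # The signal meets the half mirror twice: it is transmitted once
    # and reflected once along the folded path.
    folded_path = mirror_transmittance * mirror_reflectance
    # Unpolarized light loses half its power when it is first polarized.
    polarization_factor = 1.0 if polarized_input else 0.5
    return polarization_factor * folded_path


print(pancake_max_efficiency(polarized_input=True))   # 0.25
print(pancake_max_efficiency(polarized_input=False))  # 0.125
```

Because the product T(1 − T) peaks at T = 0.5, changing the mirror splitting ratio cannot raise the ceiling above 25% in this idealized model.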

Interestingly, the lenslet and pancake optics can be combined to further reduce the system form. Bang et al. 91 demonstrated a compact VR system with a pancake optics and a Fresnel lenslet array. The pancake optics serves to fold the optical path between the display panel and the lenslet array (Fig. 5e ). Another Fresnel lens is used to collect the light from the lenslet array. The system has a decent horizontal FoV of 102° and an eyebox of 8 mm. However, a certain degree of image discontinuity and crosstalk are still present, which can be improved with further optimizations on the Fresnel lens and the lenslet array.

One step further, replacing all conventional optics in catadioptric VR headset with holographic optics can make the whole system even thinner. Maimone and Wang demonstrated such a lightweight, high-resolution, and ultra-compact VR optical system using purely HOEs 92 . This holographic VR optics was made possible by combining several innovative optical components, including a reflective PPHOE, a reflective LCHOE, and a PPHOE-based directional backlight with laser illumination, as shown in Fig. 5f . Since all the optical power is provided by the HOEs with negligible weight and volume, the total physical thickness can be reduced to <10 mm. Also, unlike conventional bulk optics, the optical power of a HOE is independent of its thickness, only subject to the recording process. Another advantage of using holographic optical devices is that they can be engineered to offer distinct phase profiles for different wavelengths and angles of incidence, adding extra degrees of freedom in optical designs for better imaging performance. Although only a single-color backlight has been demonstrated, such a PPHOE has the potential to achieve full-color laser backlight with multiplexing ability. The PPHOE and LCHOE in the pancake optics can also be optimized at different wavelengths for achieving high-quality full-color images.

Vergence-accommodation conflict

Conventional VR displays suffer from VAC, which is a common issue for stereoscopic 3D displays 93 . In current VR display modules, the distance between the display panel and the viewing optics is fixed, which means the VR imagery is displayed at a single depth. However, the image contents are generated by parallax rendering in three dimensions, offering distinct images to the two eyes. This approach provides a proper stimulus to vergence but completely ignores the accommodation cue, which leads to the well-known VAC and can cause an uncomfortable user experience. Since the beginning of this century, numerous methods have been proposed to solve this critical issue. Methods that produce an accommodation cue include multifocal/varifocal displays 94 , holographic displays 95 , and integral imaging displays 96 . Alternatively, eliminating the accommodation cue with a Maxwellian-view display 93 also helps to mitigate the VAC. However, holographic displays and Maxwellian-view displays generally require a totally different optical architecture from that of current VR systems. They are therefore more suitable for AR displays, which will be discussed later. Integral imaging, on the other hand, has an inherent tradeoff between view number and resolution. For current VR headsets pursuing high resolution to match human visual acuity, it may not be an appealing solution. Therefore, multifocal/varifocal displays that rely on depth modulation are a relatively practical and effective solution for VR headsets. Regarding the working mechanism, multifocal displays present multiple images at different depths to imitate the original 3D scene. Varifocal displays, in contrast, show only one image in each time frame, with the image depth matching the viewer’s vergence depth. However, knowing the viewer’s vergence depth in advance requires an additional eye-tracking module. Despite the different operation principles, a varifocal display can often be converted to a multifocal display as long as the varifocal module has enough modulation bandwidth to support multiple depths in a time frame.

To achieve depth modulation in a VR system, traditional liquid lenses 97 , 98 with tunable focus suffer from small apertures and large aberrations. The Alvarez lens 99 is another tunable-focus solution, but it requires mechanical adjustment, which adds to system volume and complexity. In comparison, transmissive LCHOEs with polarization dependency can achieve focus adjustment through electronic driving. Their ultra-thin profile also satisfies the requirement of a small form factor in VR headsets. The diffractive behavior of transmissive LCHOEs is often interpreted in terms of the Pancharatnam-Berry phase (also known as the geometric phase) 100 . They are therefore often called Pancharatnam-Berry optical elements (PBOEs). The corresponding lens component is referred to as a Pancharatnam-Berry lens (PBL).

Two main approaches are used to switch the focus of a PBL: active addressing and passive addressing. In active addressing, the PBL itself (made of LC) is switched by an applied voltage (Fig. 6a ). The optical power of the liquid crystal PBL can be turned on and off by controlling the voltage. Stacking multiple active PBLs can produce \(2^N\) depths, where N is the number of PBLs. The drawback of using active PBLs, however, is the limited spectral bandwidth, since their diffraction efficiency is usually optimized at a single wavelength. In passive addressing, the depth modulation is achieved by changing the polarization state of the input light with a switchable half-wave plate (HWP) (Fig. 6b ). The focal length can therefore be switched thanks to the polarization sensitivity of PBLs. Although this approach has a slightly more complicated structure, the overall performance can be better than that of the active one, because PBLs made of liquid crystal polymer can be designed to exhibit high efficiency across the entire visible spectrum 101 , 102 .
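The depth count for stacked active PBLs can be illustrated with a short enumeration. This is a deliberately simplified sketch: each lens is modeled as adding a fixed optical power when enabled and nothing when disabled, the polarization-handedness sign flips of real PBLs are ignored, and the diopter values are hypothetical.

```python
from itertools import product


def reachable_powers(lens_powers_diopters):
    """Return the sorted distinct total powers over all on/off combinations."""
    totals = set()
    for states in product((0, 1), repeat=len(lens_powers_diopters)):
        total = sum(p for p, on in zip(lens_powers_diopters, states) if on)
        totals.add(round(total, 6))
    return sorted(totals)


# Hypothetical binary-weighted lens powers, so every combination is distinct.
two_lenses = reachable_powers([0.5, 1.0])
six_lenses = reachable_powers([0.5 * 2**k for k in range(6)])

print(two_lenses)       # [0, 0.5, 1.0, 1.5]
print(len(six_lenses))  # 64
```

The binary weighting matters: if two lenses had equal power, some combinations would coincide and fewer than the full set of distinct depths would be available.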

figure 6

Working principles of a depth switching PBL module based on a active addressing and b passive addressing. c A four-depth multifocal display based on time multiplexing. d A two-depth multifocal display based on polarization multiplexing. Reproduced from c ref. 103 with permission from OSA Publishing and d ref. 104 with permission from OSA Publishing

With the PBL module, multifocal displays can be built using the time-multiplexing technique. Zhan et al. 103 demonstrated a four-depth multifocal display using two actively switchable liquid crystal PBLs (Fig. 6c ). The display is synchronized with the PBL module, which divides the effective frame rate by the number of depths. Alternatively, multifocal displays can also be achieved by polarization multiplexing, as demonstrated by Tan et al. 104 . The basic principle is to adjust the polarization state of local pixels so that the image content on the two focal planes of a PBL can be arbitrarily controlled (Fig. 6d ). The advantage of polarization multiplexing is that it does not sacrifice the frame rate, but it can support only two planes because only two orthogonal polarization states are available. Still, it can be combined with time multiplexing to halve the frame rate sacrifice. Naturally, varifocal displays can also be built with a PBL module. A fast-response 64-depth varifocal module with six PBLs has been demonstrated 105 .
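The frame-rate cost of the two multiplexing schemes can be sketched as follows. The function name and the 240 Hz panel refresh rate are hypothetical values chosen for illustration; the model simply assumes that each sub-frame shows one plane in pure time multiplexing and two planes when polarization multiplexing is added.

```python
import math


def volume_rate_hz(panel_rate_hz: float, depths: int,
                   polarization_multiplexed: bool = False) -> float:
    """Rate at which a full set of focal planes is refreshed."""
    planes_per_subframe = 2 if polarization_multiplexed else 1
    subframes = math.ceil(depths / planes_per_subframe)
    return panel_rate_hz / subframes


# Four focal planes on a hypothetical 240 Hz panel.
print(volume_rate_hz(240, 4))                                 # 60.0
print(volume_rate_hz(240, 4, polarization_multiplexed=True))  # 120.0
```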

The compact structure of the PBL module makes it natural to integrate with the above-mentioned pancake optics. A compact VR headset with dynamic depth modulation to solve the VAC is therefore possible in practice. Still, due to the inherent diffractive nature of the PBL, the PBL module faces the issue of chromatic dispersion of the focal length. Compensating for the different focal depths of the RGB colors may require additional digital corrections in image rendering.
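To give a feel for the size of this dispersion, the sketch below applies the textbook scaling of an ideal diffractive lens, whose focal length is inversely proportional to wavelength (so its optical power is proportional to wavelength). The design power, design wavelength, and RGB wavelengths are hypothetical, and wavelength-dependent efficiency is ignored.

```python
def diffractive_power_diopters(design_power_d: float,
                               design_wavelength_nm: float,
                               wavelength_nm: float) -> float:
    """Optical power of an ideal diffractive lens at another wavelength."""
    # f(lambda) = f0 * lambda0 / lambda, hence P(lambda) = P0 * lambda / lambda0.
    return design_power_d * wavelength_nm / design_wavelength_nm


# Hypothetical 1-diopter lens designed at 550 nm.
for color, wavelength_nm in (("blue", 450.0), ("green", 550.0), ("red", 630.0)):
    power = diffractive_power_diopters(1.0, 550.0, wavelength_nm)
    print(f"{color}: {power:.3f} D")
```

Under these assumptions the red and blue image planes sit at visibly different diopter values from the green one, which is the offset that per-color rendering corrections would need to absorb.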

Architectures of AR displays

Unlike VR displays with a relatively fixed optical configuration, there exist a vast number of architectures in AR displays. Therefore, instead of following the narrative of tackling different challenges, a more appropriate way to review AR displays is to separately introduce each architecture and discuss its associated engineering challenges. An AR display usually consists of a light engine and an optical combiner. The light engine serves as display image source, while the combiner delivers the displayed images to viewer’s eye and in the meantime transmits the environment light. Some performance parameters like frame rate and power consumption are mainly determined by the light engine. Parameters like FoV, eyebox and MTF are primarily dependent on the combiner optics. Moreover, attributes like image brightness, overall efficiency, and form factor are influenced by both light engine and combiner. In this section, we will firstly discuss the light engine, where the latest advances in micro-LED on chip are reviewed and compared with existing microdisplay systems. Then, we will introduce two main types of combiners: free-space combiner and waveguide combiner.

Light engine

The light engine determines several essential properties of the AR system like image brightness, power consumption, frame rate, and basic etendue. Several types of microdisplays have been used in AR, including micro-LED, micro-organic-light-emitting-diodes (micro-OLED), liquid-crystal-on-silicon (LCoS), digital micromirror device (DMD), and laser beam scanning (LBS) based on micro-electromechanical system (MEMS). We will firstly describe the working principles of these devices and then analyze their performance. For those who are more interested in final performance parameters than details, Table 1 provides a comprehensive summary.

Working principles

Micro-LED and micro-OLED are self-emissive display devices. They are usually more compact than LCoS and DMD because no illumination optics is required. The fundamentally different material systems of LED and OLED lead to different approaches to achieve full-color displays. Due to the “green gap” in LEDs, red LEDs are manufactured on a different semiconductor material from green and blue LEDs. Therefore, how to achieve full-color display in high-resolution density microdisplays is quite a challenge for micro-LEDs. Among several solutions under research are two main approaches. The first is to combine three separate red, green and blue (RGB) micro-LED microdisplay panels 106 . Three single-color micro-LED microdisplays are manufactured separately through flip-chip transfer technology. Then, the projected images from three microdisplay panels are integrated by a trichroic prism (Fig. 7a ).

figure 7

a RGB micro-LED microdisplays combined by a trichroic prism. b QD-based micro-LED microdisplay. c Micro-OLED display with 4032 PPI. Working principles of d LCoS, e DMD, and f MEMS-LBS display modules. Reprinted from a ref. 106 with permission from IEEE, b ref. 108 with permission from Chinese Laser Press, c ref. 121 with permission from John Wiley and Sons, d ref. 124 with permission from Springer Nature, e ref. 126 with permission from Springer and f ref. 128 under the Creative Commons Attribution 4.0 License

Another solution is to assemble color-conversion materials like quantum dot (QD) on top of blue or ultraviolet (UV) micro-LEDs 107 , 108 , 109 (Fig. 7b ). The quantum dot color filter (QDCF) on top of the micro-LED array is mainly fabricated by inkjet printing or photolithography 110 , 111 . However, the display performance of color-conversion micro-LED displays is restricted by the low color-conversion efficiency, blue light leakage, and color crosstalk. Extensive efforts have been conducted to improve the QD-micro-LED performance. To boost QD conversion efficiency, structure designs like nanoring 112 and nanohole 113 , 114 have been proposed, which utilize the Förster resonance energy transfer mechanism to transfer excessive excitons in the LED active region to QD. To prevent blue light leakage, methods using color filters or reflectors like distributed Bragg reflector (DBR) 115 and CLC film 116 on top of QDCF are proposed. Compared to color filters that absorb blue light, DBR and CLC film help recycle the leaked blue light to further excite QDs. Other methods to achieve full-color micro-LED display like vertically stacked RGB micro-LED array 61 , 117 , 118 and monolithic wavelength tunable nanowire LED 119 are also under investigation.

Micro-OLED displays can be generally categorized into RGB OLED and white OLED (WOLED). RGB OLED displays have separate sub-pixel structures and optical cavities, which resonate at the desirable wavelength in RGB channels, respectively. To deposit organic materials onto the separated RGB sub-pixels, a fine metal mask (FMM) that defines the deposition area is required. However, high-resolution RGB OLED microdisplays still face challenges due to the shadow effect during the deposition process through FMM. In order to break the limitation, a silicon nitride film with small shadow has been proposed as a mask for high-resolution deposition above 2000 PPI (9.3 µm) 120 .

WOLED displays use color filters to generate color images. Without the process of depositing patterned organic materials, a high-resolution density up to 4000 PPI has been achieved 121 (Fig. 7c ). However, compared to RGB OLED, the color filters in WOLED absorb about 70% of the emitted light, which limits the maximum brightness of the microdisplay. To improve the efficiency and peak brightness of WOLED microdisplays, in 2019 Sony proposed to apply newly designed cathodes (InZnO) and microlens arrays on OLED microdisplays, which increased the peak brightness from 1600 nits to 5000 nits 120 . In addition, OLEDWORKs has proposed a multi-stacked OLED 122 with optimized microcavities whose emission spectra match the transmission bands of the color filters. The multi-stacked OLED shows a higher luminous efficiency (cd/A), but also requires a higher driving voltage. Recently, by using meta-mirrors as bottom reflective anodes, patterned microcavities with more than 10,000 PPI have been obtained 123 . The high-resolution meta-mirrors generate different reflection phases in the RGB sub-pixels to achieve desirable resonant wavelengths. The narrow emission spectra from the microcavity help to reduce the loss from color filters or even eliminate the need of color filters.
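Resolution densities quoted in PPI translate directly into pixel pitch, since one inch is 25,400 µm. The helper below is an illustrative sketch (the function name is an assumption) that converts the densities mentioned above.

```python
MICROMETERS_PER_INCH = 25_400.0


def pixel_pitch_um(pixels_per_inch: float) -> float:
    """Center-to-center pixel pitch in micrometers for a given PPI."""
    return MICROMETERS_PER_INCH / pixels_per_inch


for ppi in (4000, 10_000):
    print(f"{ppi} PPI -> {pixel_pitch_um(ppi):.2f} um pitch")
```

At 10,000 PPI the pitch is only a few micrometers, which gives a sense of why patterning through a fine metal mask becomes difficult at these densities.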

LCoS and DMD are light-modulating displays that generate images by controlling the reflection of each pixel. For LCoS, the light modulation is achieved by manipulating the polarization state of output light through independently controlling the liquid crystal reorientation in each pixel 124 , 125 (Fig. 7d ). Both phase-only and amplitude modulators have been employed. DMD is an amplitude modulation device. The modulation is achieved through controlling the tilt angle of bi-stable micromirrors 126 (Fig. 7e ). To generate an image, both LCoS and DMD rely on the light illumination systems, with LED or laser as light source. For LCoS, the generation of color image can be realized either by RGB color filters on LCoS (with white LEDs) or color-sequential addressing (with RGB LEDs or lasers). However, LCoS requires a linearly polarized light source. For an unpolarized LED light source, usually, a polarization recycling system 127 is implemented to improve the optical efficiency. For a single-panel DMD, the color image is mainly obtained through color-sequential addressing. In addition, DMD does not require a polarized light so that it generally exhibits a higher efficiency than LCoS if an unpolarized light source is employed.

MEMS-based LBS 128 , 129 utilizes micromirrors to directly scan RGB laser beams to form two-dimensional (2D) images (Fig. 7f ). Different gray levels are achieved by pulse width modulation (PWM) of the employed laser diodes. In practice, 2D scanning can be achieved either through a 2D scanning mirror or two 1D scanning mirrors with an additional focusing lens after the first mirror. The small size of MEMS mirror offers a very attractive form factor. At the same time, the output image has a large depth-of-focus (DoF), which is ideal for projection displays. One shortcoming, though, is that the small system etendue often hinders its applications in some traditional display systems.

Comparison of light engine performance

There are several important parameters for a light engine, including image resolution, brightness, frame rate, contrast ratio, and form factor. The resolution requirement (>2K) is similar for all types of light engines, and improvements in resolution are usually accomplished through the manufacturing process. Thus, here we focus on the other four parameters.

Image brightness usually refers to the measured luminance of a light-emitting object. This measurement, however, may not be accurate for a light engine because the light from the engine only forms an intermediate image, which is not directly viewed by the user. Moreover, focusing solely on the brightness of a light engine could be misleading for a wearable display system like AR. Data projectors with thousands of lumens are available nowadays, but their power consumption is too high for a battery-powered wearable AR display. Therefore, a more appropriate way to evaluate a light engine’s brightness is to use luminous efficacy (lm/W), obtained by dividing the final output luminous flux (lm) by the input electric power (W). For a self-emissive device like micro-LED or micro-OLED, the luminous efficacy is directly determined by the device itself. However, for LCoS and DMD, the overall luminous efficacy should take into consideration the light source luminous efficacy, the efficiency of the illumination optics, and the efficiency of the employed spatial light modulator (SLM). For a MEMS LBS engine, the efficiency of the MEMS mirror can be considered as unity, so the luminous efficacy is essentially equal to that of the employed laser sources.

As mentioned earlier, each light engine has a different scheme for generating color images. Therefore, we separately list luminous efficacy of each scheme for a more inclusive comparison. For micro-LEDs, the situation is more complicated because the EQE depends on the chip size. Based on previous studies 130 , 131 , 132 , 133 , we separately calculate the luminous efficacy for RGB micro-LEDs with chip size ≈ 20 µm. For the scheme of direct combination of RGB micro-LEDs, the luminous efficacy is around 5 lm/W. For QD-conversion with blue micro-LEDs, the luminous efficacy is around 10 lm/W with the assumption of 100% color conversion efficiency, which has been demonstrated using structure engineering 114 . For micro-OLEDs, the calculated luminous efficacy is about 4–8 lm/W 120 , 122 . However, the lifetime and EQE of blue OLED materials depend on the driving current. To continuously display an image with brightness higher than 10,000 nits may dramatically shorten the device lifetime. The reason we compare the light engine at 10,000 nits is that it is highly desirable to obtain 1000 nits for the displayed image in order to keep ACR>3:1 with a typical AR combiner whose optical efficiency is lower than 10%.
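The 10,000-nit comparison point follows from a one-line budget, sketched below. The function name is hypothetical, and the model simply scales luminance by the combiner efficiency, as the argument above does, taking 10% as the efficiency bound.

```python
def required_engine_luminance_nits(target_image_nits: float,
                                   combiner_efficiency: float) -> float:
    """Engine luminance needed to reach a target image luminance."""
    if not 0.0 < combiner_efficiency <= 1.0:
        raise ValueError("combiner_efficiency must be in (0, 1]")
    return target_image_nits / combiner_efficiency


# 1000 nits at the eye through a combiner with 10% efficiency.
print(required_engine_luminance_nits(1000.0, 0.10))
```

A combiner less efficient than 10% pushes the requirement above 10,000 nits, which is why lifetime at high drive current becomes a concern for micro-OLEDs.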

For an LCoS engine using a white LED as light source, the typical optical efficiency of the whole engine is around 10% 127 , 134 . Then the engine luminous efficacy is estimated to be 12 lm/W with a 120 lm/W white LED source. For a color sequential LCoS using RGB LEDs, the absorption loss from color filters is eliminated, but the luminous efficacy of RGB LED source is also decreased to about 30 lm/W due to lower efficiency of red and green LEDs and higher driving current 135 . Therefore, the final luminous efficacy of the color sequential LCoS engine is also around 10 lm/W. If RGB linearly polarized lasers are employed instead of LEDs, then the LCoS engine efficiency can be quite high due to the high degree of collimation. The luminous efficacy of RGB laser source is around 40 lm/W 136 . Therefore, the laser-based LCoS engine is estimated to have a luminous efficacy of 32 lm/W, assuming the engine optical efficiency is 80%. For a DMD engine with RGB LEDs as light source, the optical efficiency is around 50% 137 , 138 , which leads to a luminous efficacy of 15 lm/W. By switching to laser light sources, the situation is similar to LCoS, with the luminous efficacy of about 32 lm/W. Finally, for MEMS-based LBS engine, there is basically no loss from the optics so that the final luminous efficacy is 40 lm/W. Detailed calculations of luminous efficacy can be found in Supplementary Information .
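The estimates in this paragraph are products of the source luminous efficacy and the engine optical efficiency. The sketch below reproduces them in Python; the dictionary layout and function name are illustrative, the 80% efficiency for the laser-based DMD engine is an assumption taken from the statement that its situation is similar to LCoS, and the color-sequential LCoS case with RGB LEDs is omitted because its optical efficiency is not stated explicitly.

```python
# (source efficacy in lm/W, engine optical efficiency) for each light engine.
ENGINES = {
    "LCoS, white LED with color filters": (120.0, 0.10),
    "LCoS, RGB lasers": (40.0, 0.80),
    "DMD, RGB LEDs": (30.0, 0.50),
    "DMD, RGB lasers": (40.0, 0.80),  # assumed equal to the laser LCoS case
    "MEMS LBS, RGB lasers": (40.0, 1.00),
}


def engine_efficacy_lm_per_w(source_lm_per_w: float,
                             optical_efficiency: float) -> float:
    """Overall luminous efficacy of a light engine."""
    return source_lm_per_w * optical_efficiency


for name, (source, efficiency) in ENGINES.items():
    print(f"{name}: {engine_efficacy_lm_per_w(source, efficiency):.0f} lm/W")
```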

Another aspect of a light engine is the frame rate, which determines the volume of information it can deliver in a unit time. A high volume of information is vital for the construction of a 3D light field to solve the VAC issue. For micro-LEDs, the device response time is around several nanoseconds, which allows for visible light communication with bandwidth up to 1.5 Gbit/s 139 . For an OLED microdisplay, a fast OLED with ~200 MHz bandwidth has been demonstrated 140 . Therefore, the limitation of frame rate is on the driving circuits for both micro-LED and OLED. Another fact concerning driving circuit is the tradeoff between resolution and frame rate as a higher resolution panel means more scanning lines in each frame. So far, an OLED display with 480 Hz frame rate has been demonstrated 141 . For an LCoS, the frame rate is mainly limited by the LC response time. Depending on the LC material used, the response time is around 1 ms for nematic LC or 200 µs for ferroelectric LC (FLC) 125 . Nematic LC allows analog driving, which accommodates gray levels, typically with 8-bit depth. FLC is bistable so that PWM is used to generate gray levels. DMD is also a binary device. The frame rate can reach 30 kHz, which is mainly constrained by the response time of micromirrors. For MEMS-based LBS, the frame rate is limited by the scanning frequency of MEMS mirrors. A frame rate of 60 Hz with around 1 K resolution already requires a resonance frequency of around 50 kHz, with a Q-factor up to 145,000 128 . A higher frame rate or resolution requires a higher Q-factor and larger laser modulation bandwidth, which may be challenging.

Form factor is another crucial aspect for the light engines of near-eye displays. For self-emissive displays, both micro-OLEDs and QD-based micro-LEDs can achieve full color with a single panel. Thus, they are quite compact. A micro-LED display with separate RGB panels naturally has a larger form factor. In applications requiring a direct-view full-color panel, the extra combining optics may also increase the volume. It should be pointed out, however, that the combining optics may not be necessary for some applications like waveguide displays, because the EPE process makes the system insensitive to the spatial positions of the input RGB images. Therefore, the form factor of using three RGB micro-LED panels is medium. For LCoS and DMD with RGB LEDs as the light source, the form factor would be larger due to the illumination optics. Still, if a lower luminous efficacy can be accepted, a smaller form factor can be achieved by using simpler optics 142 . If RGB lasers are used, the collimation optics can be eliminated, which greatly reduces the form factor 143 . For MEMS-LBS, the form factor can be extremely compact due to the tiny size of the MEMS mirror and laser module.

Finally, contrast ratio (CR) also plays an important role in the observed image quality 8 . Micro-LEDs and micro-OLEDs are self-emissive, so their CR can be >\(10^6\):1. For a laser beam scanner, the CR can also reach \(10^6\):1 because the laser can be turned off completely in the dark state. On the other hand, LCoS and DMD are reflective displays, and their CR is around 2000:1 to 5000:1 144 , 145 . It is worth pointing out that the CR of a display engine plays a significant role only in dark ambient conditions. As the ambient brightness increases, the ACR is mainly governed by the display’s peak brightness, as previously discussed.

The performance parameters of different light engines are summarized in Table 1 . Micro-LEDs and micro-OLEDs have similar levels of luminous efficacy. However, micro-OLEDs still face burn-in and lifetime issues when driven at a high current, which hinders their use as a high-brightness image source to some extent. Micro-LEDs are still under active development, and improvements in luminous efficacy can be expected as the fabrication process matures. Both devices have nanosecond response time and can potentially achieve a high frame rate with a well-designed integrated circuit. The frame rate of the driving circuit ultimately determines the motion picture response time 146 . Their self-emissive feature also leads to a small form factor and high contrast ratio. LCoS and DMD engines have similar performance in luminous efficacy, form factor, and contrast ratio. In terms of light modulation, DMD can provide a higher 1-bit frame rate, while LCoS can offer both phase and amplitude modulations. MEMS-based LBS exhibits the highest luminous efficacy so far. It also exhibits an excellent form factor and contrast ratio, but the presently demonstrated 60-Hz frame rate (limited by the MEMS mirrors) could cause image flickering.

Free-space combiners

The term ‘free-space’ generally refers to the case when light is freely propagating in space, as opposed to a waveguide that traps light into TIRs. Regarding the combiner, it can be a partial mirror, as commonly used in AR systems based on traditional geometric optics. Alternatively, the combiner can also be a reflective HOE. The strong chromatic dispersion of HOE necessitates the use of a laser source, which usually leads to a Maxwellian-type system.

Traditional geometric designs

Several systems based on geometric optics are illustrated in Fig. 8 . The simplest design uses a single freeform half-mirror 6 , 147 to directly collimate the displayed images to the viewer’s eye (Fig. 8a ). This design can achieve a large FoV (up to 90°) 147 , but the limited design freedom with a single freeform surface leads to image distortions, also called pupil swim 6 . The placement of half-mirror also results in a relatively bulky form factor. Another design using so-called birdbath optics 6 , 148 is shown in Fig. 8b . Compared to the single-combiner design, birdbath design has an extra optics on the display side, which provides space for aberration correction. The integration of beam splitter provides a folded optical path, which reduces the form factor to some extent. Another way to fold optical path is to use a TIR-prism. Cheng et al. 149 designed a freeform TIR-prism combiner (Fig. 8c ) offering a diagonal FoV of 54° and exit pupil diameter of 8 mm. All the surfaces are freeform, which offer an excellent image quality. To cancel the optical power for the transmitted environmental light, a compensator is added to the TIR prism. The whole system has a well-balanced performance between FoV, eyebox, and form factor. To release the space in front of viewer’s eye, relay optics can be used to form an intermediate image near the combiner 150 , 151 , as illustrated in Fig. 8d . Although the design offers more optical surfaces for aberration correction, the extra lenses also add to system weight and form factor.

figure 8

a Single freeform surface as the combiner. b Birdbath optics with a beam splitter and a half mirror. c Freeform TIR prism with a compensator. d Relay optics with a half mirror. Adapted from c ref. 149 with permission from OSA Publishing and d ref. 151 with permission from OSA Publishing

Regarding the approaches to solve the VAC issue, the most straightforward way is to integrate a tunable lens into the optical path, like a liquid lens 152 or Alvarez lens 99 , to form a varifocal system. Alternatively, integral imaging 153 , 154 can also be used, by replacing the original display panel with the central depth plane of an integral imaging module. The integral imaging can also be combined with varifocal approach to overcome the tradeoff between resolution and depth of field (DoF) 155 , 156 , 157 . However, the inherent tradeoff between resolution and view number still exists in this case.

Overall, AR displays based on traditional geometric optics have a relatively simple design with a decent FoV (~60°) and eyebox (8 mm) 158 . They also exhibit a reasonable efficiency. An appropriate measure of the efficiency of an AR combiner is the output luminance (unit: nit) divided by the input luminous flux (unit: lm), which we denote as the combiner efficiency. For a fixed input luminous flux, the output luminance, or image brightness, is related to the FoV and exit pupil of the combiner system. If we assume no light is wasted in the combiner system, then the maximum combiner efficiency for a typical diagonal FoV of 60° and exit pupil (10 mm square) is around 17,000 nit/lm (Eq. S2 ). To estimate the combiner efficiency of geometric combiners, we assume a half-mirror transmittance of 50% and an efficiency of 50% for the other optics. Then the final combiner efficiency is about 4200 nit/lm, which is a high value in comparison with waveguide combiners. Nonetheless, further shrinking the system size or improving system performance ultimately runs into the etendue conservation issue. In addition, it is hard for AR systems with traditional geometric optics to achieve a configuration resembling normal flat glasses because the half-mirror has to be tilted to some extent.
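
The combiner-efficiency estimate above can be sketched numerically. Eq. S2 is in the supplementary material and is not reproduced here, so the form used below is an assumption: lossless luminance per unit flux is taken as 1/(A·Ω), with A the exit pupil area and Ω the solid angle of a rectangular FoV. The square FoV aspect ratio and all function names are also illustrative assumptions.

```python
import math

def rect_fov_solid_angle(h_deg, v_deg):
    """Solid angle (sr) of a rectangular pyramid with full apex angles h and v."""
    sh = math.sin(math.radians(h_deg) / 2)
    sv = math.sin(math.radians(v_deg) / 2)
    return 4.0 * math.asin(sh * sv)

def square_fov_side_from_diagonal(diag_deg):
    """Full side angle (deg) of a square FoV with the given diagonal FoV (assumed square)."""
    half_side = math.tan(math.radians(diag_deg) / 2) / math.sqrt(2)
    return 2 * math.degrees(math.atan(half_side))

def max_combiner_efficiency(pupil_area_m2, h_deg, v_deg):
    """Assumed lossless bound: luminance / flux = 1 / (area * solid angle), in nit/lm."""
    return 1.0 / (pupil_area_m2 * rect_fov_solid_angle(h_deg, v_deg))

# 60 deg diagonal FoV (assumed square) and a 10 mm x 10 mm exit pupil
side = square_fov_side_from_diagonal(60.0)
ideal = max_combiner_efficiency(10e-3 ** 2, side, side)

# 50% half-mirror and 50% for the remaining optics, as assumed in the text
geometric = ideal * 0.5 * 0.5
print(round(ideal), round(geometric))
```

Under these assumptions the lossless bound comes out at roughly 1.7 × 10^4 nit/lm, and the geometric combiner at about a quarter of that, in line with the values quoted above.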

Maxwellian-type systems

The Maxwellian view, proposed by James Clerk Maxwell (1860), refers to imaging a point light source in the eye pupil 159 . If the light beam is modulated in the imaging process, a corresponding image can be formed on the retina (Fig. 9a ). Because the point source is much smaller than the eye pupil, the image is always-in-focus on the retina irrespective of the eye lens’ focus. For applications in AR display, the point source is usually a laser with narrow angular and spectral bandwidths. LED light sources can also build a Maxwellian system, by adding an angular filtering module 160 . Regarding the combiner, although in theory a half-mirror can also be used, HOEs are generally preferred because they offer the off-axis configuration that places combiner in a similar position like eyeglasses. In addition, HOEs have a lower reflection of environment light, which provides a more natural appearance of the user behind the display.

figure 9

a Schematic of the working principle of Maxwellian displays. Maxwellian displays based on b SLM and laser diode light source and c MEMS-LBS with a steering mirror as additional modulation method. Generation of depth cues by d computational digital holography and e scanning of steering mirror to produce multiple views. Adapted from b, d ref. 143 and c, e ref. 167 under the Creative Commons Attribution 4.0 License

To modulate the light, a SLM like LCoS or DMD can be placed in the light path, as shown in Fig. 9b . Alternatively, LBS system can also be used (Fig. 9c ), where the intensity modulation occurs in the laser diode itself. Besides the operation in a normal Maxwellian-view, both implementations offer additional degrees of freedom for light modulation.

For a SLM-based system, there are several options to arrange the SLM pixels 143 , 161 . Maimone et al. 143 demonstrated a Maxwellian AR display with two modes to offer a large-DoF Maxwellian-view, or a holographic view (Fig. 9d ), which is often referred as computer-generated holography (CGH) 162 . To show an always-in-focus image with a large DoF, the image can be directly displayed on an amplitude SLM, or using amplitude encoding for a phase-only SLM 163 . Alternatively, if a 3D scene with correct depth cues is to be presented, then optimization algorithms for CGH can be used to generate a hologram for the SLM. The generated holographic image exhibits the natural focus-and-blur effect like a real 3D object (Fig. 9d ). To better understand this feature, we need to again exploit the concept of etendue. The laser light source can be considered to have a very small etendue due to its excellent collimation. Therefore, the system etendue is provided by the SLM. The micron-sized pixel-pitch of SLM offers a certain maximum diffraction angle, which, multiplied by the SLM size, equals system etendue. By varying the display content on SLM, the final exit pupil size can be changed accordingly. In the case of a large-DoF Maxwellian view, the exit pupil size is small, accompanied by a large FoV. For the holographic display mode, the reduced DoF requires a larger exit pupil with dimension close to the eye pupil. But the FoV is reduced accordingly due to etendue conservation. Another commonly concerned issue with CGH is the computation time. To achieve a real-time CGH rendering flow with an excellent image quality is quite a challenge. Fortunately, with recent advances in algorithm 164 and the introduction of convolutional neural network (CNN) 165 , 166 , this issue is gradually solved with an encouraging pace. Lately, Liang et al. 166 demonstrated a real-time CGH synthesis pipeline with a high image quality. 
The pipeline comprises an efficient CNN model to generate a complex hologram from a 3D scene and an improved encoding algorithm to convert the complex hologram to a phase-only one. An impressive frame rate of 60 Hz has been achieved on a desktop computing unit.
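
The etendue argument above (maximum diffraction angle times SLM size, traded between exit pupil and FoV) can be illustrated with a one-dimensional sketch. The model assumes a 1D etendue of width × 2 sin(half-angle), a maximum diffraction half-angle of asin(λ/2p) for pixel pitch p, and ideal etendue conservation through the combiner. The pixel pitch, pixel count, and wavelength are hypothetical values, not taken from the cited work.

```python
import math

def slm_etendue_1d(pixel_pitch_m, n_pixels, wavelength_m):
    """1D etendue (m) of an SLM: width times 2*sin(max diffraction half-angle)."""
    theta_max = math.asin(wavelength_m / (2 * pixel_pitch_m))
    width = pixel_pitch_m * n_pixels
    return width * 2 * math.sin(theta_max)  # reduces to n_pixels * wavelength

def fov_for_exit_pupil(etendue_1d_m, exit_pupil_m):
    """FoV (deg) left for a given exit pupil size if the 1D etendue is conserved."""
    s = etendue_1d_m / (2 * exit_pupil_m)
    if s > 1.0:
        raise ValueError("exit pupil too small for this etendue in this simple model")
    return 2 * math.degrees(math.asin(s))

# Hypothetical SLM: 4 um pitch, 1920 pixels across, green laser at 532 nm
etendue = slm_etendue_1d(4e-6, 1920, 532e-9)

for pupil_mm in (1.0, 4.0):
    fov = fov_for_exit_pupil(etendue, pupil_mm * 1e-3)
    print(f"exit pupil {pupil_mm} mm -> FoV {fov:.1f} deg")
```

In this simplified picture, a small Maxwellian-like exit pupil leaves a wide FoV, while an exit pupil close to the eye pupil size, as in the holographic mode, leaves a much narrower one.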

For an LBS-based system, the additional modulation can be achieved by integrating a steering module, as demonstrated by Jang et al. 167 . The steering mirror can shift the focal point (viewpoint) within the eye pupil, therefore effectively expanding the system etendue. When the steering process is fast and the image content is updated simultaneously, correct 3D cues can be generated, as shown in Fig. 9e . However, there exists a tradeoff between the number of viewpoints and the final image frame rate, because the total frames are divided equally among the viewpoints. Boosting the frame rate of MEMS-LBS systems by the number of views (e.g., 3 by 3) may be challenging.
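
The tradeoff between viewpoint count and frame rate is simple arithmetic, sketched below. It assumes frames are shared equally among viewpoints; the function names are hypothetical, and the 60 Hz engine rate is the MEMS-LBS figure mentioned earlier in this review.

```python
def per_view_frame_rate(engine_rate_hz, views_x, views_y):
    """Frame rate seen at each viewpoint when frames are split equally among views."""
    return engine_rate_hz / (views_x * views_y)

def required_engine_rate(target_rate_hz, views_x, views_y):
    """Engine frame rate needed to give every viewpoint the target rate."""
    return target_rate_hz * views_x * views_y

# A 60 Hz scanner shared among a 3-by-3 viewpoint array
print(per_view_frame_rate(60.0, 3, 3))

# Engine rate needed to keep 60 Hz per viewpoint with the same array
print(required_engine_rate(60.0, 3, 3))
```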

Maxwellian-type systems offer several advantages. The system efficiency is usually very high because nearly all the light is delivered into the viewer’s eye. The system FoV is determined by the f /# of the combiner, and a large FoV (~80° in horizontal) can be achieved 143 . The issue of VAC can be mitigated with an infinite-DoF image that removes the accommodation cue, or completely solved by generating a true-3D scene as discussed above. Despite these advantages, one major weakness of Maxwellian-type systems is the tiny exit pupil, or eyebox. A small deviation of the eye pupil location from the viewpoint results in the complete disappearance of the image. Therefore, expanding the eyebox is considered one of the most important challenges in Maxwellian-type systems.

Pupil duplication and steering

Methods to expand the eyebox can be generally categorized into pupil duplication 168 , 169 , 170 , 171 , 172 and pupil steering 9 , 13 , 167 , 173 . Pupil duplication simply generates multiple viewpoints to cover a large area. In contrast, pupil steering dynamically shifts the viewpoint position, depending on the pupil location. Before reviewing detailed implementations of these two methods, it is worth discussing some of their general features. In pupil duplication, the total light intensity is usually divided equally among the viewpoints. In each time frame, however, it is preferable that only one viewpoint enters the user’s eye pupil to avoid ghost images. This requirement therefore reduces the total light efficiency, while also requiring the viewpoint separation to be larger than the pupil diameter. At the same time, the separation should not be too large, to avoid gaps between viewpoints. Considering that the human pupil diameter changes in response to environment illuminance, the design of viewpoint separation needs special attention. Pupil steering, on the other hand, only produces one viewpoint at each time frame. It is therefore more light-efficient and free from ghost images. But determining the viewpoint position requires the eye pupil location, which demands a real-time eye-tracking module 9 . Another observation is that pupil steering can accommodate multiple viewpoints by its nature. Therefore, a pupil steering system can often be easily converted to a pupil duplication system by simultaneously generating all available viewpoints.
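
The competing constraints on viewpoint separation can be made concrete with a minimal one-dimensional sketch. It assumes an idealized infinite row of point-like viewpoints at multiples of the spacing and counts how many fall inside the pupil: two or more means a ghost image, zero means a gap. The 3 mm spacing and the 2 mm and 4 mm pupil diameters are illustrative assumptions.

```python
import math

def viewpoints_in_pupil(pupil_center_mm, pupil_diameter_mm, spacing_mm):
    """Count viewpoints located at k * spacing (k integer) that fall inside the pupil."""
    lo = pupil_center_mm - pupil_diameter_mm / 2
    hi = pupil_center_mm + pupil_diameter_mm / 2
    return math.floor(hi / spacing_mm) - math.ceil(lo / spacing_mm) + 1

spacing = 3.0   # mm, assumed viewpoint separation
center = 1.5    # mm, pupil sitting midway between two viewpoints

# Dilated pupil (4 mm): two viewpoints enter at once, producing a ghost image
print(viewpoints_in_pupil(center, 4.0, spacing))

# Constricted pupil (2 mm): no viewpoint enters, producing a gap
print(viewpoints_in_pupil(center, 2.0, spacing))
```

The same spacing thus yields a ghost for a dilated pupil and a gap for a constricted one, which is why the changing pupil diameter makes the choice of separation delicate.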

To generate multiple viewpoints, one can focus on modulating the incident light or the combiner. Recall that the viewpoint is the image of the light source. Duplicating or shifting the light source therefore achieves pupil duplication or steering, respectively, as illustrated in Fig. 10a . Several schemes of light modulation are depicted in Fig. 10b–e . An array of light sources can be generated with multiple laser diodes (Fig. 10b ). Turning on all of the sources, or only one of them, achieves pupil duplication or steering, respectively. A light source array can also be produced by projecting light on an array-type PPHOE 168 (Fig. 10c ). Apart from direct adjustment of light sources, modulating light on the path can also effectively steer/duplicate the light sources. Using a mechanical steering mirror, the beam can be deflected 167 (Fig. 10d ), which is equivalent to shifting the light source position. Other devices like a grating or beam splitter can also serve as ray deflector/splitter 170 , 171 (Fig. 10e ).

figure 10

a Schematic of duplicating (or shift) viewpoint by modulation of incident light. Light modulation by b multiple laser diodes, c HOE lens array, d steering mirror and e grating or beam splitters. f Pupil duplication with multiplexed PPHOE. g Pupil steering with LCHOE. Reproduced from c ref. 168 under the Creative Commons Attribution 4.0 License, e ref. 169 with permission from OSA Publishing, f ref. 171 with permission from OSA Publishing and g ref. 173 with permission from OSA Publishing

Nonetheless, one problem of the light source duplication/shifting methods for pupil duplication/steering is that the aberrations in peripheral viewpoints are often serious 168 , 173 . The HOE combiner is usually recorded at one incident angle. For other incident angles with large deviations, considerable aberrations will occur, especially in an off-axis configuration. To solve this problem, the modulation can be focused on the combiner instead. While the mechanical shifting of the combiner 9 can achieve continuous pupil steering, its integration into an AR display with a small form factor remains a challenge. Alternatively, the versatile functions of HOEs offer possible solutions for combiner modulation. Kim and Park 169 demonstrated a pupil duplication system with multiplexed PPHOE (Fig. 10f ). Wavefronts of several viewpoints can be recorded into one PPHOE sample. Three viewpoints with a separation of 3 mm were achieved. However, a slight degree of ghost image and gap can be observed in the viewpoint transition. For a PPHOE to achieve pupil steering, the multiplexed PPHOE needs to record different focal points with different incident angles. If each hologram has no angular crosstalk, then with an additional device to change the light incident angle, the viewpoint can be steered. Alternatively, Xiong et al. 173 demonstrated a pupil steering system with LCHOEs in a simpler configuration (Fig. 10g ). The polarization-sensitive nature of LCHOEs allows a polarization converter (PC) to select which LCHOE is active. When the PC is off, the incident RCP light is focused by the right-handed LCHOE. When the PC is turned on, the RCP light is first converted to LCP light and passes through the right-handed LCHOE. Then it is focused by the left-handed LCHOE into another viewpoint. Adding more viewpoints requires stacking more pairs of PC and LCHOE, which can be achieved in a compact manner with thin glass substrates. 
In addition, realizing pupil duplication only requires the stacking of multiple low-efficiency LCHOEs. For both PPHOEs and LCHOEs, because the hologram for each viewpoint is recorded independently, the aberrations can be eliminated.

Regarding the system performance, in theory the FoV is not limited and can reach a large value, such as 80° in the horizontal direction 143 . The definition of eyebox is different from that in traditional imaging systems. For a single viewpoint, it has the same size as the eye pupil diameter. But due to the viewpoint steering/duplication capability, the total system eyebox can be expanded accordingly. The combiner efficiency for pupil steering systems can reach 47,000 nit/lm for a FoV of 80° by 80° and pupil diameter of 4 mm (Eq. S2 ). At such a high brightness level, eye safety could be a concern 174 . For a pupil duplication system, the combiner efficiency is decreased by the number of viewpoints. With a 4-by-4 viewpoint array, it can still reach 3000 nit/lm. Despite the potential gain of pupil duplication/steering, when considering the rotation of the eyeball, the situation becomes much more complicated 175 . A perfect pupil steering system requires 5D steering, which poses a challenge for practical implementation.
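
The quoted efficiencies can be checked with a short worked estimate. Eq. S2 is not reproduced in this section, so the form below is an assumption: a lossless luminance per unit flux of 1/(AΩ), with A the pupil area and Ω the solid angle of a rectangular FoV with full angles θ_h and θ_v.

```latex
\eta_{\max} = \frac{L}{\Phi} = \frac{1}{A\,\Omega}, \qquad
\Omega = 4\arcsin\!\left(\sin\frac{\theta_h}{2}\,\sin\frac{\theta_v}{2}\right)

% 80 deg x 80 deg FoV, 4 mm pupil diameter
A = \pi\,(2\times10^{-3}\,\mathrm{m})^{2} \approx 1.26\times10^{-5}\,\mathrm{m^{2}}, \qquad
\Omega = 4\arcsin\!\left(\sin^{2}40^{\circ}\right) \approx 1.70\,\mathrm{sr}

\eta_{\max} \approx \frac{1}{1.26\times10^{-5}\times1.70} \approx 4.7\times10^{4}\,\mathrm{nit/lm}, \qquad
\frac{\eta_{\max}}{4\times4} \approx 2.9\times10^{3}\,\mathrm{nit/lm}
```

Under this assumed form, both results agree with the 47,000 nit/lm and roughly 3000 nit/lm figures quoted above to within rounding.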

Pin-light systems

Recently, another type of display in close relation with Maxwellian view called pin-light display 148 , 176 has been proposed. The general working principle of pin-light display is illustrated in Fig. 11a . Each pin-light source is a Maxwellian view with a large DoF. When the eye pupil is no longer placed near the source point as in Maxwellian view, each image source can only form an elemental view with a small FoV on retina. However, if the image source array is arranged in a proper form, the elemental views can be integrated together to form a large FoV. According to the specific optical architectures, pin-light display can take different forms of implementation. In the initial feasibility demonstration, Maimone et al. 176 used a side-lit waveguide plate as the point light source (Fig. 11b ). The light inside the waveguide plate is extracted by the etched divots, forming a pin-light source array. A transmissive SLM (LCD) is placed behind the waveguide plate to modulate the light intensity and form the image. The display has an impressive FoV of 110° thanks to the large scattering angle range. However, the direct placement of LCD before the eye brings issues of insufficient resolution density and diffraction of background light.

figure 11

a Schematic drawing of the working principle of pin-light display. b Pin-light display utilizing a pin-light source and a transmissive SLM. c An example of pin-mirror display with a birdbath optics. d SWD system with LBS image source and off-axis lens array. Reprinted from b ref. 176 under the Creative Commons Attribution 4.0 License and d ref. 180 with permission from OSA Publishing

To avoid these issues, architectures using pin-mirrors 177 , 178 , 179 are proposed. In these systems, the final combiner is an array of tiny mirrors 178 , 179 or gratings 177 , in contrast to their counterparts using large-area combiners. An exemplary system with birdbath design is depicted in Fig. 11c . In this case, the pin-mirrors replace the original beam-splitter in the birdbath and can thus shrink the system volume, while at the same time providing large DoF pin-light images. Nonetheless, such a system may still face the etendue conservation issue. Meanwhile, the size of pin-mirror cannot be too small in order to prevent degradation of resolution density due to diffraction. Therefore, its influence on the see-through background should also be considered in the system design.

To overcome the etendue conservation and improve see-through quality, Xiong et al. 180 proposed another type of pin-light system exploiting the etendue expansion property of waveguide, which is also referred as scanning waveguide display (SWD). As illustrated in Fig. 11d , the system uses an LBS as the image source. The collimated scanned laser rays are trapped in the waveguide and encounter an array of off-axis lenses. Upon each encounter, the lens out-couples the laser rays and forms a pin-light source. SWD has the merits of good see-through quality and large etendue. A large FoV of 100° was demonstrated with the help of an ultra-low f /# lens array based on LCHOE. However, some issues like insufficient image resolution density and image non-uniformity remain to be overcome. To further improve the system may require optimization of Gaussian beam profile and additional EPE module 180 .

Overall, pin-light systems inherit the large DoF from Maxwellian view. With an adequate number of pin-light sources, the FoV and eyebox can be expanded accordingly. Nonetheless, despite different forms of implementation, a common issue of pin-light systems is image uniformity. The overlapped region of elemental views has a higher light intensity than the non-overlapped region, which becomes even more complicated considering the dynamic change of pupil size. In theory, the displayed image can be pre-processed to compensate for the optical non-uniformity. But that would require knowledge of the precise pupil location (and possibly size) and therefore an accurate eye-tracking module 176 . Regarding the system performance, pin-mirror systems modified from other free-space systems generally share similar FoV and eyebox with the original systems. The combiner efficiency may be lower due to the small size of pin-mirrors. SWD, on the other hand, shares the large FoV and DoF with Maxwellian view, and the large eyebox with waveguide combiners. The combiner efficiency may also be lower due to the EPE process.

Waveguide combiner

Besides free-space combiners, another common architecture in AR displays is the waveguide combiner. The term ‘waveguide’ indicates that the light is trapped in a substrate by the TIR process. One distinctive feature of a waveguide combiner is the EPE process that effectively enlarges the system etendue. In the EPE process, a portion of the trapped light is repeatedly coupled out of the waveguide in each TIR. The effective eyebox is therefore enlarged. According to the features of couplers, we divide the waveguide combiners into two types: diffractive and achromatic, as described below.

Diffractive waveguides

As the name implies, diffractive-type waveguides use diffractive elements as couplers. The in-coupler is usually a diffractive grating and the out-coupler in most cases is also a grating with the same period as the in-coupler, but it can also be an off-axis lens with a small curvature to generate an image with finite depth. Three major diffractive couplers have been developed: SRGs, photopolymer gratings (PPGs), and liquid crystal gratings (grating-type LCHOE; also known as polarization volume gratings (PVGs)). Some general guidelines for coupler design are that the in-coupler should have a relatively high efficiency and the out-coupler should have a uniform light output. A uniform light output usually requires a low-efficiency coupler, with extra degrees of freedom for local modulation of coupling efficiency. Both in-coupler and out-coupler should have an adequate angular bandwidth to accommodate a reasonable FoV. In addition, the out-coupler should also be optimized to avoid undesired diffractions, including the outward diffraction of TIR light and diffraction of environment light into the user’s eyes, which are referred to as light leakage and rainbow, respectively. Suppression of these unwanted diffractions should also be considered in the optimization process of waveguide design, along with performance parameters like efficiency and uniformity.

The basic working principles of diffractive waveguide-based AR systems are illustrated in Fig. 12 . For the SRG-based waveguides 6 , 8 (Fig. 12a ), the in-coupler can be a transmissive-type or a reflective-type 181 , 182 . The grating geometry can be optimized for coupling efficiency with a large degree of freedom 183 . For the out-coupler, a reflective SRG with a large slant angle to suppress the transmission orders is preferred 184 . In addition, a uniform light output usually requires a gradient efficiency distribution in order to compensate for the decreased light intensity in the out-coupling process. This can be achieved by varying the local grating configurations like height and duty cycle 6 . For the PPG-based waveguides 185 (Fig. 12b ), the small angular bandwidth of a high-efficiency transmissive PPG prohibits its use as in-coupler. Therefore, both in-coupler and out-coupler are usually reflective types. The gradient efficiency can be achieved by space-variant exposure to control the local index modulation 186 or local Bragg slant angle variation through freeform exposure 19 . Due to the relatively small angular bandwidth of PPG, to achieve a decent FoV usually requires stacking two 187 or three 188 PPGs together for a single color. The PVG-based waveguides 189 (Fig. 12c ) also prefer reflective PVGs as in-couplers because the transmissive PVGs are much more difficult to fabricate due to the LC alignment issue. In addition, the angular bandwidth of transmissive PVGs in Bragg regime is also not large enough to support a decent FoV 29 . For the out-coupler, the angular bandwidth of a single reflective PVG can usually support a reasonable FoV. To obtain a uniform light output, a polarization management layer 190 consisting of a LC layer with spatially variant orientations can be utilized. It offers an additional degree of freedom to control the polarization state of the TIR light. 
The diffraction efficiency can therefore be locally controlled due to the strong polarization sensitivity of PVG.
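
The need for a gradient efficiency distribution can be seen in a minimal sketch. It assumes an idealized, lossless out-coupler that the guided beam meets N times; setting the local efficiency at the k-th encounter to 1/(N - k) then yields equal output at every encounter. The function names and the value N = 5 are illustrative, and a real design would cap the final efficiencies well below 100%.

```python
def uniform_outcoupling_efficiencies(n_encounters):
    """Local efficiencies 1/(N-k), k = 0..N-1, that equalize the out-coupled power."""
    return [1.0 / (n_encounters - k) for k in range(n_encounters)]

def outcoupled_powers(efficiencies, input_power=1.0):
    """Power extracted at each encounter as the guided beam is progressively depleted."""
    remaining = input_power
    outputs = []
    for eta in efficiencies:
        extracted = remaining * eta
        outputs.append(extracted)
        remaining -= extracted
    return outputs

n = 5
graded = outcoupled_powers(uniform_outcoupling_efficiencies(n))
constant = outcoupled_powers([0.2] * n)

print([round(p, 3) for p in graded])    # equal output at every encounter
print([round(p, 3) for p in constant])  # output decays along the waveguide
```

A constant efficiency gives a geometrically decaying output, while the graded profile keeps it flat, which is the role played by varying the grating height, duty cycle, index modulation, or polarization state described above.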

figure 12

Schematics of waveguide combiners based on a SRGs, b PPGs and c PVGs. Reprinted from a ref. 85 with permission from OSA Publishing, b ref. 185 with permission from John Wiley and Sons and c ref. 189 with permission from OSA Publishing

The above discussion describes the basic working principle of 1D EPE. Nonetheless, for the 1D EPE to produce a large eyebox, the exit pupil in the unexpanded direction of the original image should be large. This proposes design challenges in light engines. Therefore, a 2D EPE is favored for practical applications. To extend EPE in two dimensions, two consecutive 1D EPEs can be used 191 , as depicted in Fig. 13a . The first 1D EPE occurs in the turning grating, where the light is duplicated in y direction and then turned into x direction. Then the light rays encounter the out-coupler and are expanded in x direction. To better understand the 2D EPE process, the k -vector diagram (Fig. 13b ) can be used. For the light propagating in air with wavenumber k 0 , its possible k -values in x and y directions ( k x and k y ) fall within the circle with radius k 0 . When the light is trapped into TIR, k x and k y are outside the circle with radius k 0 and inside the circle with radius nk 0 , where n is the refractive index of the substrate. k x and k y stay unchanged in the TIR process and are only changed in each diffraction process. The central red box in Fig. 13b indicates the possible k values within the system FoV. After the in-coupler, the k values are added by the grating k -vector, shifting the k values into TIR region. The turning grating then applies another k -vector and shifts the k values to near x -axis. Finally, the k values are shifted by the out-coupler and return to the free propagation region in air. One observation is that the size of red box is mostly limited by the width of TIR band. To accommodate a larger FoV, the outer boundary of TIR band needs to be expanded, which amounts to increasing waveguide refractive index. Another important fact is that when k x and k y are near the outer boundary, the uniformity of output light becomes worse. This is because the light propagation angle is near 90° in the waveguide. 
The spatial distance between two consecutive TIRs becomes so large that the out-coupled beams are spatially separated to an unacceptable degree. The range of possible k values for practical applications is therefore further shrunk due to this fact.
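
The k-diagram argument can be sketched in one dimension, with k values expressed in units of k0. The sketch assumes a first-order in-coupler that adds λ/Λ to sin(θ), a TIR band between 1 and n, and a symmetric FoV whose k span of 2 sin(FoV/2) fills the whole band. The grating period, wavelength, and waveguide thickness are hypothetical values.

```python
import math

def max_symmetric_fov_deg(n):
    """Largest 1D FoV whose k span 2*sin(FoV/2) fits the TIR band of width n - 1."""
    return 2 * math.degrees(math.asin(min(1.0, (n - 1) / 2)))

def propagation_angle_deg(theta_in_deg, wavelength_nm, period_nm, n):
    """In-waveguide angle after first-order in-coupling, or None if outside the TIR band."""
    kx = math.sin(math.radians(theta_in_deg)) + wavelength_nm / period_nm
    if not (1.0 < kx < n):
        return None
    return math.degrees(math.asin(kx / n))

def tir_hop_distance_mm(theta_deg, thickness_mm):
    """Lateral distance between consecutive out-coupling encounters on one surface."""
    return 2 * thickness_mm * math.tan(math.radians(theta_deg))

# A higher index widens the TIR band and hence the theoretical FoV limit
for n in (1.5, 1.8, 2.0):
    print(n, round(max_symmetric_fov_deg(n), 1))

# Hypothetical 400 nm grating, 532 nm light, n = 1.5, normal incidence
print(propagation_angle_deg(0.0, 532.0, 400.0, 1.5))

# Hop distance grows rapidly as the propagation angle approaches 90 deg
for angle in (45.0, 60.0, 80.0):
    print(angle, round(tir_hop_distance_mm(angle, 0.5), 2))
```

The rapidly growing hop distance near the outer boundary of the TIR band is what separates the out-coupled beams and shrinks the usable range of k values.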

figure 13

a Schematic of 2D EPE based on two consecutive 1D EPEs. Gray/black arrows indicate light in air/TIR. Black dots denote TIRs. b k-diagram of the two-1D-EPE scheme. c Schematic of 2D EPE with a 2D hexagonal grating d k-diagram of the 2D-grating scheme

Aside from two consecutive 1D EPEs, the 2D EPE can also be directly implemented with a 2D grating 192 . An example using a hexagonal grating is depicted in Fig. 13c . The hexagonal grating can provide k -vectors in six directions. In the k -diagram (Fig. 13d ), after the in-coupling, the k values are distributed into six regions due to multiple diffractions. The out-coupling occurs simultaneously with pupil expansion. Besides a concise out-coupler configuration, the 2D EPE scheme offers more degrees of design freedom than two 1D EPEs because the local grating parameters can be adjusted in a 2D manner. The higher design freedom has the potential to reach a better output light uniformity, but at the cost of a higher computation demand for optimization. Furthermore, the unslanted grating geometry usually leads to a large light leakage and possibly low efficiency. Adding slant to the geometry helps alleviate the issue, but the associated fabrication may be more challenging.

Finally, we discuss the generation of full-color images. One important issue to clarify is that although diffractive gratings are used here, the final image generally has no color dispersion even if we use a broadband light source like LED. This can be easily understood in the 1D EPE scheme. The in-coupler and out-coupler have opposite k -vectors, which cancels the color dispersion for each other. In the 2D EPE schemes, the k -vectors always form a closed loop from in-coupled light to out-coupled light, thus, the color dispersion also vanishes likewise. The issue of using a single waveguide for full-color images actually exists in the consideration of FoV and light uniformity. The breakup of propagation angles for different colors results in varied out-coupling situations for each color. To be more specific, if the red and the blue channels use the same in-coupler, the propagating angle for the red light is larger than that of the blue light. The red light in peripheral FoV is therefore easier to face the mentioned large-angle non-uniformity issue. To acquire a decent FoV and light uniformity, usually two or three layers of waveguides with different grating pitches are adopted.
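
The cancellation of color dispersion, and the color-dependent propagation angle that remains inside the waveguide, can be sketched in the same 1D k-vector picture. The model assumes first-order gratings with equal period and opposite k-vectors; the 450 nm period, the index of 1.8, the 5° field angle, and the two wavelengths are hypothetical values.

```python
import math

def in_couple(theta_in_deg, wavelength_nm, period_nm, n):
    """Add the grating k-vector; return (kx in units of k0, in-waveguide angle in deg)."""
    kx = math.sin(math.radians(theta_in_deg)) + wavelength_nm / period_nm
    return kx, math.degrees(math.asin(kx / n))

def out_couple(kx, wavelength_nm, period_nm):
    """Subtract the same grating k-vector; return the exit angle in air (deg)."""
    return math.degrees(math.asin(kx - wavelength_nm / period_nm))

period, index, field_angle = 450.0, 1.8, 5.0  # assumed illustrative values

results = {}
for name, wavelength in (("blue", 460.0), ("red", 630.0)):
    kx, inside = in_couple(field_angle, wavelength, period, index)
    results[name] = (inside, out_couple(kx, wavelength, period))
    print(name, round(inside, 1), round(results[name][1], 3))
```

Both colors leave the waveguide at the input field angle, so the image shows no dispersion, yet red propagates at a noticeably steeper angle than blue inside the waveguide, which is the source of the color-dependent uniformity and FoV limits discussed above.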

Regarding the system performance, the eyebox is generally large enough (~10 mm) to accommodate different user’s IPD and alignment shift during operation. A parameter of significant concern for a waveguide combiner is its FoV. From the k -vector analysis, we can conclude the theoretical upper limit is determined by the waveguide refractive index. But the light/color uniformity also influences the effective FoV, over which the degradation of image quality becomes unacceptable. Current diffractive waveguide combiners generally achieve a FoV of about 50°. To further increase FoV, a straightforward method is to use a higher refractive index waveguide. Another is to tile FoV through direct stacking of multiple waveguides or using polarization-sensitive couplers 79 , 193 . As to the optical efficiency, a typical value for the diffractive waveguide combiner is around 50–200 nit/lm 6 , 189 . In addition, waveguide combiners adopting grating out-couplers generate an image with fixed depth at infinity. This leads to the VAC issue. To tackle VAC in waveguide architectures, the most practical way is to generate multiple depths and use the varifocal or multifocal driving scheme, similar to those mentioned in the VR systems. But to add more depths usually means to stack multiple layers of waveguides together 194 . Considering the additional waveguide layers for RGB colors, the final waveguide thickness would undoubtedly increase.

Other parameters specific to waveguides include light leakage, see-through ghost, and rainbow. Light leakage refers to out-coupled light that goes outwards to the environment, as depicted in Fig. 14a . Aside from decreased efficiency, the leakage also brings the drawbacks of an unnatural “bright-eye” appearance of the user and privacy issues. Optimization of the grating structure, such as the geometry of the SRG, may reduce the leakage. See-through ghost is formed by consecutive in-coupling and out-coupling caused by the out-coupler grating, as sketched in Fig. 14b . After the process, a real object with finite depth may produce a ghost image with a shift in both FoV and depth. Generally, an out-coupler with higher efficiency suffers more see-through ghost. Rainbow is caused by the diffraction of environment light into the user’s eye, as sketched in Fig. 14c . Color dispersion occurs in this case because there is no cancellation of the k -vector. Using the k -diagram, we can obtain a deeper insight into the formation of rainbow. Here, we take the EPE structure in Fig. 13a as an example. As depicted in Fig. 14d , after diffractions by the turning grating and the out-coupler grating, the k values are distributed in two circles that shift from the origin by the grating k -vectors. Some diffracted light can enter the see-through FoV and form rainbow. To reduce rainbow, a straightforward way is to use a higher index substrate. With a higher refractive index, the outer boundary of the k diagram is expanded, which can accommodate larger grating k -vectors. The enlarged k -vectors would therefore “push” these two circles outwards, leading to a decreased overlapping region with the see-through FoV. Alternatively, an optimized grating structure would also help reduce the rainbow effect by suppressing the unwanted diffraction.

Fig. 14: Sketches of the formation of a light leakage, b see-through ghost, and c rainbow. d Analysis of rainbow formation with the k -diagram

Achromatic waveguide

Achromatic waveguide combiners use achromatic elements as couplers, which has the advantage of realizing a full-color image with a single waveguide. A typical example of an achromatic element is a mirror. A waveguide with partial mirrors as the out-coupler is often referred to as a geometric waveguide 6 , 195 , as depicted in Fig. 15a . The in-coupler in this case is usually a prism, which avoids the color dispersion that a diffractive element would otherwise introduce. The mirrors couple out the TIR light consecutively to produce a large eyebox, similar to a diffractive waveguide. Thanks to the excellent optical properties of mirrors, the geometric waveguide usually exhibits superior image quality in terms of MTF and color uniformity compared with its diffractive counterparts. Still, the spatially discontinuous configuration of the mirrors also results in gaps in the eyebox, which may be alleviated by using a dual-layer structure 196 . Wang et al. designed a geometric waveguide display with five partial mirrors (Fig. 15b ). It exhibits a remarkable FoV of 50° by 30° (Fig. 15c ) and an exit pupil of 4 mm with a 1D EPE. To achieve 2D EPE, an architecture similar to that in Fig. 13a can be used by integrating a turning mirror array as the first 1D EPE module 197 . Unfortunately, the k -vector diagrams in Fig. 13b, d cannot be used here because the k values in the x-y plane are no longer conserved in the in-coupling and out-coupling processes. But some general conclusions remain valid, such as a higher refractive index leading to a larger FoV and a gradient out-coupling efficiency improving light uniformity.
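The remark on gradient out-coupling efficiency can be made concrete with a short sketch. It assumes an idealized, lossless 1D EPE in which the guided beam meets each partial mirror exactly once, and it ignores angle and polarization dependence. Under these assumptions, if every mirror should couple out the same fraction p of the input power, the i-th mirror (counting from 0) needs reflectivity p / (1 − i·p). The function names and the value p = 0.1 are hypothetical and do not describe the five-mirror design cited above.

```python
def gradient_reflectivities(n_mirrors, fraction_each):
    """Reflectivities that make every mirror couple out the same input fraction."""
    if n_mirrors * fraction_each > 1.0:
        raise ValueError("total out-coupled fraction cannot exceed 1")
    # Before mirror i, the remaining guided power is 1 - i * fraction_each.
    return [fraction_each / (1.0 - i * fraction_each) for i in range(n_mirrors)]

def outcoupled_powers(reflectivities):
    """Power coupled out at each mirror for unit input power (lossless model)."""
    remaining, powers = 1.0, []
    for r in reflectivities:
        powers.append(remaining * r)    # part reflected toward the eye
        remaining *= 1.0 - r            # part that keeps propagating
    return powers

graded = gradient_reflectivities(5, 0.1)
print("graded reflectivities:", [round(r, 4) for r in graded])
print("graded output:", [round(p, 4) for p in outcoupled_powers(graded)])
print("uniform-mirror output:", [round(p, 4) for p in outcoupled_powers([0.1] * 5)])
```

With identical mirrors the out-coupled power decays along the waveguide, whereas the graded reflectivities yield a flat profile across the eyebox in this model.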

Fig. 15: a Schematic of the system configuration. b Geometric waveguide with five partial mirrors. c Image photos demonstrating system FoV. b , c Adapted from ref. 195 with permission from OSA Publishing

The fabrication process of a geometric waveguide involves coating mirrors on cut-apart pieces and integrating them back together, which may result in a high cost, especially for the 2D EPE architecture. Another way to implement an achromatic coupler is to use a multiplexed PPHOE 198 , 199 to mimic the behavior of a tilted mirror (Fig. 16a ). To understand the working principle, we can use the diagram in Fig. 16b . The law of reflection states that the angle of reflection equals the angle of incidence. Translated into k -vector language, this means the mirror can apply a k -vector of any length along its surface normal direction, while the k -vector length of the reflected light is always equal to that of the incident light. This imposes the condition that the k -vector triangle is isosceles, and a simple geometric deduction shows that this leads to the law of reflection. The behavior of a general grating, however, is very different. For simplicity, we only consider the main diffraction order. The grating can only apply a k -vector with a fixed k x due to the basic diffraction law. For light with a different incident angle, it needs to apply a different k z to produce diffracted light with the same k -vector length as the incident light. For a grating with a broad angular bandwidth like an SRG, the range of k z is wide, forming a long vertical line in Fig. 16b . For a PPG with a narrow angular bandwidth, the line is short and resembles a dot. If many of these tiny dots are distributed along the oblique line corresponding to a mirror, then the final multiplexed PPGs can imitate the behavior of a tilted mirror. Such a PPHOE is sometimes referred to as a skew-mirror 198 . In theory, to better imitate the mirror, a large number of multiplexed PPGs is preferred, with each PPG having a small index modulation δn . But this poses a bigger challenge in device fabrication. Recently, Utsugi et al. demonstrated an impressive skew-mirror waveguide based on 54 multiplexed PPGs (Fig. 16c, d ). The display exhibits an effective FoV of 35° by 36°. In the peripheral FoV, there still exists some non-uniformity (Fig. 16e ) due to the out-coupling gap, which is an inherent feature of flat-type out-couplers.
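The k -vector argument above can be checked numerically with a small two-dimensional sketch. It works in the x-z plane with k -vectors normalized to unit length inside the medium, and it treats the mirror and the grating as ideal. The function names, the 25° tilt, and the incident angles are illustrative assumptions rather than parameters of the cited devices.

```python
import math

def mirror_delta_k(theta_in_deg, tilt_deg):
    """k-vector (dkx, dkz) that an ideal tilted mirror applies to a unit k-vector."""
    t, a = math.radians(theta_in_deg), math.radians(tilt_deg)
    k_in = (math.sin(t), math.cos(t))
    normal = (math.sin(a), math.cos(a))             # mirror surface normal
    proj = k_in[0] * normal[0] + k_in[1] * normal[1]
    # Reflection: k_out = k_in - 2 (k_in . n) n, so the applied vector is along n.
    return (-2.0 * proj * normal[0], -2.0 * proj * normal[1])

def grating_delta_kz(theta_in_deg, kx_grating):
    """dkz a grating with fixed kx must supply so the reflected |k| stays 1."""
    t = math.radians(theta_in_deg)
    kx_out = math.sin(t) + kx_grating
    if abs(kx_out) > 1.0:
        return None                                 # no propagating order
    kz_out = -math.sqrt(1.0 - kx_out ** 2)          # reflected wave travels toward -z
    return kz_out - math.cos(t)

tilt = 25.0
for theta in (-10.0, 0.0, 10.0):
    dkx, dkz = mirror_delta_k(theta, tilt)
    # One narrow-band PPG per angle: its grating vector is a "dot" on the mirror line.
    print(f"theta={theta:+.0f}: mirror dk=({dkx:.4f}, {dkz:.4f}), "
          f"grating dkz at same kx={grating_delta_kz(theta, dkx):.4f}")

# A single grating designed for 10 deg no longer matches the mirror at 0 deg.
dkx10, _ = mirror_delta_k(10.0, tilt)
print(grating_delta_kz(0.0, dkx10), mirror_delta_k(0.0, tilt)[1])
```

Each incident angle yields a different point on the oblique mirror line, so a single fixed- k x grating can match the mirror at only one angle in this model. Multiplexing many narrow-band gratings, one per point, is what lets the stack approximate the mirror over a range of angles.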

Fig. 16: a System configuration. b Diagram demonstrating how multiplexed PPGs resemble the behavior of a mirror. Photos showing c the system and d image. e Picture demonstrating effective system FoV. c–e Adapted from ref. 199 with permission from ITE

Finally, it is worth mentioning that metasurfaces are also promising for delivering achromatic gratings 200 , 201 for waveguide couplers, owing to their versatile wavefront shaping capability. The mechanism of the achromatic gratings is similar to that of the achromatic lenses discussed previously. However, the development of achromatic metagratings is still in its infancy. Much effort is needed to improve the optical efficiency for in-coupling, control the higher diffraction orders to eliminate ghost images, and enable a large-size design for EPE.

Generally, achromatic waveguide combiners exhibit a FoV and eyebox comparable to those of diffractive combiners, but with a higher efficiency. For a partial-mirror combiner, the combiner efficiency is around 650 nit/lm 197 (2D EPE). For a skew-mirror combiner, although the efficiency of the multiplexed PPHOE is relatively low (~1.5%) 199 , the final combiner efficiency of the 1D EPE system is still high (>3000 nit/lm) due to multiple out-couplings.

Table 2 summarizes the performance of different AR combiners. By combining the luminous efficacy in Table 1 with the combiner efficiency in Table 2 , we can obtain a comprehensive estimate of the total luminance efficiency (nit/W) for different types of systems. Generally, Maxwellian-type combiners with pupil steering have the highest luminance efficiency when partnered with laser-based light engines like laser-backlit LCoS/DMD or MEMS-LBS. Geometric optical combiners have well-balanced image performance, but further shrinking the system size remains a challenge. Diffractive waveguides have a relatively low combiner efficiency, which can be remedied by an efficient light engine like MEMS-LBS. Further development of coupler and EPE schemes would also improve the system efficiency and FoV. Achromatic waveguides have a decent combiner efficiency, and their single-layer design also enables a smaller form factor. With advances in the fabrication process, they may become strong contenders to the presently widely used diffractive waveguides.
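The estimate described above is a simple product of units: luminous efficacy (lm/W) times combiner efficiency (nit/lm) gives luminance efficiency (nit/W). The sketch below applies it to the combiner efficiencies quoted in the text. The light-engine efficacy of 10 lm/W is a placeholder assumed purely for illustration and is not a value from Table 1.

```python
def luminance_efficiency(efficacy_lm_per_w, combiner_nit_per_lm):
    """Total luminance efficiency in nit/W from light engine and combiner figures."""
    return efficacy_lm_per_w * combiner_nit_per_lm

ASSUMED_EFFICACY = 10.0  # lm/W, hypothetical light-engine value for illustration

# Combiner efficiencies (nit/lm) as quoted in the text above.
combiners = {
    "diffractive waveguide (low end)": 50.0,
    "diffractive waveguide (high end)": 200.0,
    "partial-mirror waveguide, 2D EPE": 650.0,
    "skew-mirror waveguide, 1D EPE (lower bound)": 3000.0,
}

for name, nit_per_lm in combiners.items():
    total = luminance_efficiency(ASSUMED_EFFICACY, nit_per_lm)
    print(f"{name}: {total:.0f} nit/W")
```

Because the relation is linear, the ranking of combiners is unchanged for any fixed light engine, while pairing a low-efficiency combiner with a more efficient engine scales its result proportionally.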

Conclusions and perspectives

VR and AR carry high expectations to revolutionize the way we interact with the digital world. Accompanying these expectations are the engineering challenges of squeezing a high-performance display system into a tightly packed module for daily wear. Although etendue conservation constitutes a great obstacle on this path, remarkable progress in innovative optics and photonics continues to take place. Ultra-thin optical elements like PPHOEs and LCHOEs provide alternative solutions to traditional optics. Their unique features of multiplexing capability and polarization dependency further expand the possibilities for novel wavefront modulation. At the same time, nanoscale-engineered metasurfaces/SRGs provide large design freedom to achieve novel functions beyond conventional geometric optical devices. Newly emerged micro-LEDs open an opportunity for compact microdisplays with high peak brightness and good stability. Further advances in device engineering and manufacturing processes are expected to boost the performance of metasurfaces/SRGs and micro-LEDs for AR and VR applications.

Data availability

All data needed to evaluate the conclusions in the paper are present in the paper. Additional data related to this paper may be requested from the authors.

Cakmakci, O. & Rolland, J. Head-worn displays: a review. J. Disp. Technol. 2 , 199–216 (2006).

Zhan, T. et al. Augmented reality and virtual reality displays: perspectives and challenges. iScience 23 , 101397 (2020).

Rendon, A. A. et al. The effect of virtual reality gaming on dynamic balance in older adults. Age Ageing 41 , 549–552 (2012).

Choi, S., Jung, K. & Noh, S. D. Virtual reality applications in manufacturing industries: past research, present findings, and future directions. Concurrent Eng. 23 , 40–63 (2015).

Li, X. et al. A critical review of virtual and augmented reality (VR/AR) applications in construction safety. Autom. Constr. 86 , 150–162 (2018).

Kress, B. C. Optical Architectures for Augmented-, Virtual-, and Mixed-Reality Headsets (Bellingham: SPIE Press, 2020).

Cholewiak, S. A. et al. A perceptual eyebox for near-eye displays. Opt. Express 28 , 38008–38028 (2020).

Lee, Y. H., Zhan, T. & Wu, S. T. Prospects and challenges in augmented reality displays. Virtual Real. Intell. Hardw. 1 , 10–20 (2019).

Kim, J. et al. Foveated AR: dynamically-foveated augmented reality display. ACM Trans. Graph. 38 , 99 (2019).

Tan, G. J. et al. Foveated imaging for near-eye displays. Opt. Express 26 , 25076–25085 (2018).

Lee, S. et al. Foveated near-eye display for mixed reality using liquid crystal photonics. Sci. Rep. 10 , 16127 (2020).

Yoo, C. et al. Foveated display system based on a doublet geometric phase lens. Opt. Express 28 , 23690–23702 (2020).

Akşit, K. et al. Manufacturing application-driven foveated near-eye displays. IEEE Trans. Vis. Computer Graph. 25 , 1928–1939 (2019).

Zhu, R. D. et al. High-ambient-contrast augmented reality with a tunable transmittance liquid crystal film and a functional reflective polarizer. J. Soc. Inf. Disp. 24 , 229–233 (2016).

Lincoln, P. et al. Scene-adaptive high dynamic range display for low latency augmented reality. In Proc. 21st ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games . (ACM, San Francisco, CA, 2017).

Duerr, F. & Thienpont, H. Freeform imaging systems: fermat’s principle unlocks “first time right” design. Light.: Sci. Appl. 10 , 95 (2021).

Bauer, A., Schiesser, E. M. & Rolland, J. P. Starting geometry creation and design method for freeform optics. Nat. Commun. 9 , 1756 (2018).

Rolland, J. P. et al. Freeform optics for imaging. Optica 8 , 161–176 (2021).

Jang, C. et al. Design and fabrication of freeform holographic optical elements. ACM Trans. Graph. 39 , 184 (2020).

Gabor, D. A new microscopic principle. Nature 161 , 777–778 (1948).

Kostuk, R. K. Holography: Principles and Applications (Boca Raton: CRC Press, 2019).

Lawrence, J. R., O'Neill, F. T. & Sheridan, J. T. Photopolymer holographic recording material. Optik 112 , 449–463 (2001).

Guo, J. X., Gleeson, M. R. & Sheridan, J. T. A review of the optimisation of photopolymer materials for holographic data storage. Phys. Res. Int. 2012 , 803439 (2012).

Jang, C. et al. Recent progress in see-through three-dimensional displays using holographic optical elements [Invited]. Appl. Opt. 55 , A71–A85 (2016).

Xiong, J. H. et al. Holographic optical elements for augmented reality: principles, present status, and future perspectives. Adv. Photonics Res. 2 , 2000049 (2021).

Tabiryan, N. V. et al. Advances in transparent planar optics: enabling large aperture, ultrathin lenses. Adv. Optical Mater. 9 , 2001692 (2021).

Zanutta, A. et al. Photopolymeric films with highly tunable refractive index modulation for high precision diffractive optics. Optical Mater. Express 6 , 252–263 (2016).

Moharam, M. G. & Gaylord, T. K. Rigorous coupled-wave analysis of planar-grating diffraction. J. Optical Soc. Am. 71 , 811–818 (1981).

Xiong, J. H. & Wu, S. T. Rigorous coupled-wave analysis of liquid crystal polarization gratings. Opt. Express 28 , 35960–35971 (2020).

Xie, S., Natansohn, A. & Rochon, P. Recent developments in aromatic azo polymers research. Chem. Mater. 5 , 403–411 (1993).

Shishido, A. Rewritable holograms based on azobenzene-containing liquid-crystalline polymers. Polym. J. 42 , 525–533 (2010).

Bunning, T. J. et al. Holographic polymer-dispersed liquid crystals (H-PDLCs). Annu. Rev. Mater. Sci. 30 , 83–115 (2000).

Liu, Y. J. & Sun, X. W. Holographic polymer-dispersed liquid crystals: materials, formation, and applications. Adv. Optoelectron. 2008 , 684349 (2008).

Xiong, J. H. & Wu, S. T. Planar liquid crystal polarization optics for augmented reality and virtual reality: from fundamentals to applications. eLight 1 , 3 (2021).

Yaroshchuk, O. & Reznikov, Y. Photoalignment of liquid crystals: basics and current trends. J. Mater. Chem. 22 , 286–300 (2012).

Sarkissian, H. et al. Periodically aligned liquid crystal: potential application for projection displays. Mol. Cryst. Liq. Cryst. 451 , 1–19 (2006).

Komanduri, R. K. & Escuti, M. J. Elastic continuum analysis of the liquid crystal polarization grating. Phys. Rev. E 76 , 021701 (2007).

Kobashi, J., Yoshida, H. & Ozaki, M. Planar optics with patterned chiral liquid crystals. Nat. Photonics 10 , 389–392 (2016).

Lee, Y. H., Yin, K. & Wu, S. T. Reflective polarization volume gratings for high efficiency waveguide-coupling augmented reality displays. Opt. Express 25 , 27008–27014 (2017).

Lee, Y. H., He, Z. Q. & Wu, S. T. Optical properties of reflective liquid crystal polarization volume gratings. J. Optical Soc. Am. B 36 , D9–D12 (2019).

Xiong, J. H., Chen, R. & Wu, S. T. Device simulation of liquid crystal polarization gratings. Opt. Express 27 , 18102–18112 (2019).

Czapla, A. et al. Long-period fiber gratings with low-birefringence liquid crystal. Mol. Cryst. Liq. Cryst. 502 , 65–76 (2009).

Dąbrowski, R., Kula, P. & Herman, J. High birefringence liquid crystals. Crystals 3 , 443–482 (2013).

Mack, C. Fundamental Principles of Optical Lithography: The Science of Microfabrication (Chichester: John Wiley & Sons, 2007).

Genevet, P. et al. Recent advances in planar optics: from plasmonic to dielectric metasurfaces. Optica 4 , 139–152 (2017).

Guo, L. J. Nanoimprint lithography: methods and material requirements. Adv. Mater. 19 , 495–513 (2007).

Park, J. et al. Electrically driven mid-submicrometre pixelation of InGaN micro-light-emitting diode displays for augmented-reality glasses. Nat. Photonics 15 , 449–455 (2021).

Khorasaninejad, M. et al. Metalenses at visible wavelengths: diffraction-limited focusing and subwavelength resolution imaging. Science 352 , 1190–1194 (2016).

Li, S. Q. et al. Phase-only transmissive spatial light modulator based on tunable dielectric metasurface. Science 364 , 1087–1090 (2019).

Liang, K. L. et al. Advances in color-converted micro-LED arrays. Jpn. J. Appl. Phys. 60 , SA0802 (2020).

Jin, S. X. et al. GaN microdisk light emitting diodes. Appl. Phys. Lett. 76 , 631–633 (2000).

Day, J. et al. Full-scale self-emissive blue and green microdisplays based on GaN micro-LED arrays. In Proc. SPIE 8268, Quantum Sensing and Nanophotonic Devices IX (SPIE, San Francisco, California, United States, 2012).

Huang, Y. G. et al. Mini-LED, micro-LED and OLED displays: present status and future perspectives. Light.: Sci. Appl. 9 , 105 (2020).

Parbrook, P. J. et al. Micro-light emitting diode: from chips to applications. Laser Photonics Rev. 15 , 2000133 (2021).

Day, J. et al. III-Nitride full-scale high-resolution microdisplays. Appl. Phys. Lett. 99 , 031116 (2011).

Liu, Z. J. et al. 360 PPI flip-chip mounted active matrix addressable light emitting diode on silicon (LEDoS) micro-displays. J. Disp. Technol. 9 , 678–682 (2013).

Zhang, L. et al. Wafer-scale monolithic hybrid integration of Si-based IC and III–V epi-layers—A mass manufacturable approach for active matrix micro-LED micro-displays. J. Soc. Inf. Disp. 26 , 137–145 (2018).

Tian, P. F. et al. Size-dependent efficiency and efficiency droop of blue InGaN micro-light emitting diodes. Appl. Phys. Lett. 101 , 231110 (2012).

Olivier, F. et al. Shockley-Read-Hall and Auger non-radiative recombination in GaN based LEDs: a size effect study. Appl. Phys. Lett. 111 , 022104 (2017).

Konoplev, S. S., Bulashevich, K. A. & Karpov, S. Y. From large-size to micro-LEDs: scaling trends revealed by modeling. Phys. Status Solidi (A) 215 , 1700508 (2018).

Li, L. Z. et al. Transfer-printed, tandem microscale light-emitting diodes for full-color displays. Proc. Natl Acad. Sci. USA 118 , e2023436118 (2021).

Oh, J. T. et al. Light output performance of red AlGaInP-based light emitting diodes with different chip geometries and structures. Opt. Express 26 , 11194–11200 (2018).

Shen, Y. C. et al. Auger recombination in InGaN measured by photoluminescence. Appl. Phys. Lett. 91 , 141101 (2007).

Wong, M. S. et al. High efficiency of III-nitride micro-light-emitting diodes by sidewall passivation using atomic layer deposition. Opt. Express 26 , 21324–21331 (2018).

Han, S. C. et al. AlGaInP-based Micro-LED array with enhanced optoelectrical properties. Optical Mater. 114 , 110860 (2021).

Wong, M. S. et al. Size-independent peak efficiency of III-nitride micro-light-emitting-diodes using chemical treatment and sidewall passivation. Appl. Phys. Express 12 , 097004 (2019).

Ley, R. T. et al. Revealing the importance of light extraction efficiency in InGaN/GaN microLEDs via chemical treatment and dielectric passivation. Appl. Phys. Lett. 116 , 251104 (2020).

Moon, S. W. et al. Recent progress on ultrathin metalenses for flat optics. iScience 23 , 101877 (2020).

Arbabi, A. et al. Efficient dielectric metasurface collimating lenses for mid-infrared quantum cascade lasers. Opt. Express 23 , 33310–33317 (2015).

Yu, N. F. et al. Light propagation with phase discontinuities: generalized laws of reflection and refraction. Science 334 , 333–337 (2011).

Liang, H. W. et al. High performance metalenses: numerical aperture, aberrations, chromaticity, and trade-offs. Optica 6 , 1461–1470 (2019).

Park, J. S. et al. All-glass, large metalens at visible wavelength using deep-ultraviolet projection lithography. Nano Lett. 19 , 8673–8682 (2019).

Yoon, G. et al. Single-step manufacturing of hierarchical dielectric metalens in the visible. Nat. Commun. 11 , 2268 (2020).

Lee, G. Y. et al. Metasurface eyepiece for augmented reality. Nat. Commun. 9 , 4562 (2018).

Chen, W. T. et al. A broadband achromatic metalens for focusing and imaging in the visible. Nat. Nanotechnol. 13 , 220–226 (2018).

Wang, S. M. et al. A broadband achromatic metalens in the visible. Nat. Nanotechnol. 13 , 227–232 (2018).

Lan, S. F. et al. Metasurfaces for near-eye augmented reality. ACS Photonics 6 , 864–870 (2019).

Fan, Z. B. et al. A broadband achromatic metalens array for integral imaging in the visible. Light.: Sci. Appl. 8 , 67 (2019).

Shi, Z. J., Chen, W. T. & Capasso, F. Wide field-of-view waveguide displays enabled by polarization-dependent metagratings. In Proc. SPIE 10676, Digital Optics for Immersive Displays (SPIE, Strasbourg, France, 2018).

Hong, C. C., Colburn, S. & Majumdar, A. Flat metaform near-eye visor. Appl. Opt. 56 , 8822–8827 (2017).

Bayati, E. et al. Design of achromatic augmented reality visors based on composite metasurfaces. Appl. Opt. 60 , 844–850 (2021).

Nikolov, D. K. et al. Metaform optics: bridging nanophotonics and freeform optics. Sci. Adv. 7 , eabe5112 (2021).

Tamir, T. & Peng, S. T. Analysis and design of grating couplers. Appl. Phys. 14 , 235–254 (1977).

Miller, J. M. et al. Design and fabrication of binary slanted surface-relief gratings for a planar optical interconnection. Appl. Opt. 36 , 5717–5727 (1997).

Levola, T. & Laakkonen, P. Replicated slanted gratings with a high refractive index material for in and outcoupling of light. Opt. Express 15 , 2067–2074 (2007).

Shrestha, S. et al. Broadband achromatic dielectric metalenses. Light.: Sci. Appl. 7 , 85 (2018).

Li, Z. Y. et al. Meta-optics achieves RGB-achromatic focusing for virtual reality. Sci. Adv. 7 , eabe4458 (2021).

Ratcliff, J. et al. ThinVR: heterogeneous microlens arrays for compact, 180 degree FOV VR near-eye displays. IEEE Trans. Vis. Computer Graph. 26 , 1981–1990 (2020).

Wong, T. L. et al. Folded optics with birefringent reflective polarizers. In Proc. SPIE 10335, Digital Optical Technologies 2017 (SPIE, Munich, Germany, 2017).

Li, Y. N. Q. et al. Broadband cholesteric liquid crystal lens for chromatic aberration correction in catadioptric virtual reality optics. Opt. Express 29 , 6011–6020 (2021).

Bang, K. et al. Lenslet VR: thin, flat and wide-FOV virtual reality display using fresnel lens and lenslet array. IEEE Trans. Vis. Computer Graph. 27 , 2545–2554 (2021).

Maimone, A. & Wang, J. R. Holographic optics for thin and lightweight virtual reality. ACM Trans. Graph. 39 , 67 (2020).

Kramida, G. Resolving the vergence-accommodation conflict in head-mounted displays. IEEE Trans. Vis. Computer Graph. 22 , 1912–1931 (2016).

Zhan, T. et al. Multifocal displays: review and prospect. PhotoniX 1 , 10 (2020).

Shimobaba, T., Kakue, T. & Ito, T. Review of fast algorithms and hardware implementations on computer holography. IEEE Trans. Ind. Inform. 12 , 1611–1622 (2016).

Xiao, X. et al. Advances in three-dimensional integral imaging: sensing, display, and applications [Invited]. Appl. Opt. 52 , 546–560 (2013).

Kuiper, S. & Hendriks, B. H. W. Variable-focus liquid lens for miniature cameras. Appl. Phys. Lett. 85 , 1128–1130 (2004).

Liu, S. & Hua, H. Time-multiplexed dual-focal plane head-mounted display with a liquid lens. Opt. Lett. 34 , 1642–1644 (2009).

Wilson, A. & Hua, H. Design and demonstration of a vari-focal optical see-through head-mounted display using freeform Alvarez lenses. Opt. Express 27 , 15627–15637 (2019).

Zhan, T. et al. Pancharatnam-Berry optical elements for head-up and near-eye displays [Invited]. J. Optical Soc. Am. B 36 , D52–D65 (2019).

Oh, C. & Escuti, M. J. Achromatic diffraction from polarization gratings with high efficiency. Opt. Lett. 33 , 2287–2289 (2008).

Zou, J. Y. et al. Broadband wide-view Pancharatnam-Berry phase deflector. Opt. Express 28 , 4921–4927 (2020).

Zhan, T., Lee, Y. H. & Wu, S. T. High-resolution additive light field near-eye display by switchable Pancharatnam–Berry phase lenses. Opt. Express 26 , 4863–4872 (2018).

Tan, G. J. et al. Polarization-multiplexed multiplane display. Opt. Lett. 43 , 5651–5654 (2018).

Lanman, D. R. Display systems research at facebook reality labs (conference presentation). In Proc. SPIE 11310, Optical Architectures for Displays and Sensing in Augmented, Virtual, and Mixed Reality (AR, VR, MR) (SPIE, San Francisco, California, United States, 2020).

Liu, Z. J. et al. A novel BLU-free full-color LED projector using LED on silicon micro-displays. IEEE Photonics Technol. Lett. 25 , 2267–2270 (2013).

Han, H. V. et al. Resonant-enhanced full-color emission of quantum-dot-based micro LED display technology. Opt. Express 23 , 32504–32515 (2015).

Lin, H. Y. et al. Optical cross-talk reduction in a quantum-dot-based full-color micro-light-emitting-diode display by a lithographic-fabricated photoresist mold. Photonics Res. 5 , 411–416 (2017).

Liu, Z. J. et al. Micro-light-emitting diodes with quantum dots in display technology. Light.: Sci. Appl. 9 , 83 (2020).

Kim, H. M. et al. Ten micrometer pixel, quantum dots color conversion layer for high resolution and full color active matrix micro-LED display. J. Soc. Inf. Disp. 27 , 347–353 (2019).

Xuan, T. T. et al. Inkjet-printed quantum dot color conversion films for high-resolution and full-color micro light-emitting diode displays. J. Phys. Chem. Lett. 11 , 5184–5191 (2020).

Chen, S. W. H. et al. Full-color monolithic hybrid quantum dot nanoring micro light-emitting diodes with improved efficiency using atomic layer deposition and nonradiative resonant energy transfer. Photonics Res. 7 , 416–422 (2019).

Krishnan, C. et al. Hybrid photonic crystal light-emitting diode renders 123% color conversion effective quantum yield. Optica 3 , 503–509 (2016).

Kang, J. H. et al. RGB arrays for micro-light-emitting diode applications using nanoporous GaN embedded with quantum dots. ACS Applied Mater. Interfaces 12 , 30890–30895 (2020).

Chen, G. S. et al. Monolithic red/green/blue micro-LEDs with HBR and DBR structures. IEEE Photonics Technol. Lett. 30 , 262–265 (2018).

Hsiang, E. L. et al. Enhancing the efficiency of color conversion micro-LED display with a patterned cholesteric liquid crystal polymer film. Nanomaterials 10 , 2430 (2020).

Kang, C. M. et al. Hybrid full-color inorganic light-emitting diodes integrated on a single wafer using selective area growth and adhesive bonding. ACS Photonics 5 , 4413–4422 (2018).

Geum, D. M. et al. Strategy toward the fabrication of ultrahigh-resolution micro-LED displays by bonding-interface-engineered vertical stacking and surface passivation. Nanoscale 11 , 23139–23148 (2019).

Ra, Y. H. et al. Full-color single nanowire pixels for projection displays. Nano Lett. 16 , 4608–4615 (2016).

Motoyama, Y. et al. High-efficiency OLED microdisplay with microlens array. J. Soc. Inf. Disp. 27 , 354–360 (2019).

Fujii, T. et al. 4032 ppi High-resolution OLED microdisplay. J. Soc. Inf. Disp. 26 , 178–186 (2018).

Hamer, J. et al. High-performance OLED microdisplays made with multi-stack OLED formulations on CMOS backplanes. In Proc. SPIE 11473, Organic and Hybrid Light Emitting Materials and Devices XXIV . Online Only (SPIE, 2020).

Joo, W. J. et al. Metasurface-driven OLED displays beyond 10,000 pixels per inch. Science 370 , 459–463 (2020).

Vettese, D. Liquid crystal on silicon. Nat. Photonics 4 , 752–754 (2010).

Zhang, Z. C., You, Z. & Chu, D. P. Fundamentals of phase-only liquid crystal on silicon (LCOS) devices. Light.: Sci. Appl. 3 , e213 (2014).

Hornbeck, L. J. The DMD™ projection display chip: a MEMS-based technology. MRS Bull. 26 , 325–327 (2001).

Zhang, Q. et al. Polarization recycling method for light-pipe-based optical engine. Appl. Opt. 52 , 8827–8833 (2013).

Hofmann, U., Janes, J. & Quenzer, H. J. High-Q MEMS resonators for laser beam scanning displays. Micromachines 3 , 509–528 (2012).

Holmström, S. T. S., Baran, U. & Urey, H. MEMS laser scanners: a review. J. Microelectromechanical Syst. 23 , 259–275 (2014).

Bao, X. Z. et al. Design and fabrication of AlGaInP-based micro-light-emitting-diode array devices. Opt. Laser Technol. 78 , 34–41 (2016).

Olivier, F. et al. Influence of size-reduction on the performances of GaN-based micro-LEDs for display application. J. Lumin. 191 , 112–116 (2017).

Liu, Y. B. et al. High-brightness InGaN/GaN Micro-LEDs with secondary peak effect for displays. IEEE Electron Device Lett. 41 , 1380–1383 (2020).

Qi, L. H. et al. 848 ppi high-brightness active-matrix micro-LED micro-display using GaN-on-Si epi-wafers towards mass production. Opt. Express 29 , 10580–10591 (2021).

Chen, E. G. & Yu, F. H. Design of an elliptic spot illumination system in LED-based color filter-liquid-crystal-on-silicon pico projectors for mobile embedded projection. Appl. Opt. 51 , 3162–3170 (2012).

Darmon, D., McNeil, J. R. & Handschy, M. A. 70.1: LED-illuminated pico projector architectures. Soc. Inf. Disp. Int. Symp. Dig. Tech. Pap. 39 , 1070–1073 (2008).

Essaian, S. & Khaydarov, J. State of the art of compact green lasers for mobile projectors. Optical Rev. 19 , 400–404 (2012).

Sun, W. S. et al. Compact LED projector design with high uniformity and efficiency. Appl. Opt. 53 , H227–H232 (2014).

Sun, W. S., Chiang, Y. C. & Tsuei, C. H. Optical design for the DLP pocket projector using LED light source. Phys. Procedia 19 , 301–307 (2011).

Chen, S. W. H. et al. High-bandwidth green semipolar (20–21) InGaN/GaN micro light-emitting diodes for visible light communication. ACS Photonics 7 , 2228–2235 (2020).

Yoshida, K. et al. 245 MHz bandwidth organic light-emitting diodes used in a gigabit optical wireless data link. Nat. Commun. 11 , 1171 (2020).

Park, D. W. et al. 53.5: High-speed AMOLED pixel circuit and driving scheme. Soc. Inf. Disp. Int. Symp. Dig. Tech. Pap. 41 , 806–809 (2010).

Tan, L., Huang, H. C. & Kwok, H. S. 78.1: Ultra compact polarization recycling system for white light LED based pico-projection system. Soc. Inf. Disp. Int. Symp. Dig. Tech. Pap. 41 , 1159–1161 (2010).

Maimone, A., Georgiou, A. & Kollin, J. S. Holographic near-eye displays for virtual and augmented reality. ACM Trans. Graph. 36 , 85 (2017).

Pan, J. W. et al. Portable digital micromirror device projector using a prism. Appl. Opt. 46 , 5097–5102 (2007).

Huang, Y. et al. Liquid-crystal-on-silicon for augmented reality displays. Appl. Sci. 8 , 2366 (2018).

Peng, F. L. et al. Analytical equation for the motion picture response time of display devices. J. Appl. Phys. 121 , 023108 (2017).

Pulli, K. 11-2: invited paper: meta 2: immersive optical-see-through augmented reality. Soc. Inf. Disp. Int. Symp. Dig. Tech. Pap. 48 , 132–133 (2017).

Lee, B. & Jo, Y. in Advanced Display Technology: Next Generation Self-Emitting Displays (eds Kang, B., Han, C. W. & Jeong, J. K.) 307–328 (Springer, 2021).

Cheng, D. W. et al. Design of an optical see-through head-mounted display with a low f -number and large field of view using a freeform prism. Appl. Opt. 48 , 2655–2668 (2009).

Zheng, Z. R. et al. Design and fabrication of an off-axis see-through head-mounted display with an x–y polynomial surface. Appl. Opt. 49 , 3661–3668 (2010).

Wei, L. D. et al. Design and fabrication of a compact off-axis see-through head-mounted display using a freeform surface. Opt. Express 26 , 8550–8565 (2018).

Liu, S., Hua, H. & Cheng, D. W. A novel prototype for an optical see-through head-mounted display with addressable focus cues. IEEE Trans. Vis. Computer Graph. 16 , 381–393 (2010).

Hua, H. & Javidi, B. A 3D integral imaging optical see-through head-mounted display. Opt. Express 22 , 13484–13491 (2014).

Song, W. T. et al. Design of a light-field near-eye display using random pinholes. Opt. Express 27 , 23763–23774 (2019).

Wang, X. & Hua, H. Depth-enhanced head-mounted light field displays based on integral imaging. Opt. Lett. 46 , 985–988 (2021).

Huang, H. K. & Hua, H. Generalized methods and strategies for modeling and optimizing the optics of 3D head-mounted light field displays. Opt. Express 27 , 25154–25171 (2019).

Huang, H. K. & Hua, H. High-performance integral-imaging-based light field augmented reality display using freeform optics. Opt. Express 26 , 17578–17590 (2018).

Cheng, D. W. et al. Design and manufacture AR head-mounted displays: a review and outlook. Light.: Adv. Manuf. 2 , 24 (2021).

Westheimer, G. The Maxwellian view. Vis. Res. 6 , 669–682 (1966).

Do, H., Kim, Y. M. & Min, S. W. Focus-free head-mounted display based on Maxwellian view using retroreflector film. Appl. Opt. 58 , 2882–2889 (2019).

Park, J. H. & Kim, S. B. Optical see-through holographic near-eye-display with eyebox steering and depth of field control. Opt. Express 26 , 27076–27088 (2018).

Chang, C. L. et al. Toward the next-generation VR/AR optics: a review of holographic near-eye displays from a human-centric perspective. Optica 7 , 1563–1578 (2020).

Hsueh, C. K. & Sawchuk, A. A. Computer-generated double-phase holograms. Appl. Opt. 17 , 3874–3883 (1978).

Chakravarthula, P. et al. Wirtinger holography for near-eye displays. ACM Trans. Graph. 38 , 213 (2019).

Peng, Y. F. et al. Neural holography with camera-in-the-loop training. ACM Trans. Graph. 39 , 185 (2020).

Shi, L. et al. Towards real-time photorealistic 3D holography with deep neural networks. Nature 591 , 234–239 (2021).

Jang, C. et al. Retinal 3D: augmented reality near-eye display via pupil-tracked light field projection on retina. ACM Trans. Graph. 36 , 190 (2017).

Jang, C. et al. Holographic near-eye display with expanded eye-box. ACM Trans. Graph. 37 , 195 (2018).

Kim, S. B. & Park, J. H. Optical see-through Maxwellian near-to-eye display with an enlarged eyebox. Opt. Lett. 43 , 767–770 (2018).

Shrestha, P. K. et al. Accommodation-free head mounted display with comfortable 3D perception and an enlarged eye-box. Research 2019 , 9273723 (2019).

Lin, T. G. et al. Maxwellian near-eye display with an expanded eyebox. Opt. Express 28 , 38616–38625 (2020).

Jo, Y. et al. Eye-box extended retinal projection type near-eye display with multiple independent viewpoints [Invited]. Appl. Opt. 60 , A268–A276 (2021).

Xiong, J. H. et al. Aberration-free pupil steerable Maxwellian display for augmented reality with cholesteric liquid crystal holographic lenses. Opt. Lett. 46 , 1760–1763 (2021).

Viirre, E. et al. Laser safety analysis of a retinal scanning display system. J. Laser Appl. 9 , 253–260 (1997).

Ratnam, K. et al. Retinal image quality in near-eye pupil-steered systems. Opt. Express 27 , 38289–38311 (2019).

Maimone, A. et al. Pinlight displays: wide field of view augmented reality eyeglasses using defocused point light sources. In Proc. ACM SIGGRAPH 2014 Emerging Technologies (ACM, Vancouver, Canada, 2014).

Jeong, J. et al. Holographically printed freeform mirror array for augmented reality near-eye display. IEEE Photonics Technol. Lett. 32 , 991–994 (2020).

Ha, J. & Kim, J. Augmented reality optics system with pin mirror. US Patent 10,989,922 (2021).

Park, S. G. Augmented and mixed reality optical see-through combiners based on plastic optics. Inf. Disp. 37 , 6–11 (2021).

Xiong, J. H. et al. Breaking the field-of-view limit in augmented reality with a scanning waveguide display. OSA Contin. 3 , 2730–2740 (2020).

Levola, T. 7.1: invited paper: novel diffractive optical components for near to eye displays. Soc. Inf. Disp. Int. Symp . Dig. Tech. Pap. 37 , 64–67 (2006).

Laakkonen, P. et al. High efficiency diffractive incouplers for light guides. In Proc. SPIE 6896, Integrated Optics: Devices, Materials, and Technologies XII . (SPIE, San Jose, California, United States, 2008).

Bai, B. F. et al. Optimization of nonbinary slanted surface-relief gratings as high-efficiency broadband couplers for light guides. Appl. Opt. 49 , 5454–5464 (2010).

Äyräs, P., Saarikko, P. & Levola, T. Exit pupil expander with a large field of view based on diffractive optics. J. Soc. Inf. Disp. 17 , 659–664 (2009).

Yoshida, T. et al. A plastic holographic waveguide combiner for light-weight and highly-transparent augmented reality glasses. J. Soc. Inf. Disp. 26 , 280–286 (2018).

Yu, C. et al. Highly efficient waveguide display with space-variant volume holographic gratings. Appl. Opt. 56 , 9390–9397 (2017).

Shi, X. L. et al. Design of a compact waveguide eyeglass with high efficiency by joining freeform surfaces and volume holographic gratings. J. Optical Soc. Am. A 38 , A19–A26 (2021).

Han, J. et al. Portable waveguide display system with a large field of view by integrating freeform elements and volume holograms. Opt. Express 23 , 3534–3549 (2015).

Weng, Y. S. et al. Liquid-crystal-based polarization volume grating applied for full-color waveguide displays. Opt. Lett. 43 , 5773–5776 (2018).

Lee, Y. H. et al. Compact see-through near-eye display with depth adaption. J. Soc. Inf. Disp. 26 , 64–70 (2018).

Tekolste, R. D. & Liu, V. K. Outcoupling grating for augmented reality system. US Patent 10,073,267 (2018).

Grey, D. & Talukdar, S. Exit pupil expanding diffractive optical waveguiding device. US Patent 10,073, 267 (2019).

Yoo, C. et al. Extended-viewing-angle waveguide near-eye display with a polarization-dependent steering combiner. Opt. Lett. 45 , 2870–2873 (2020).

Schowengerdt, B. T., Lin, D. & St. Hilaire, P. Multi-layer diffractive eyepiece with wavelength-selective reflector. US Patent 10,725,223 (2020).

Wang, Q. W. et al. Stray light and tolerance analysis of an ultrathin waveguide display. Appl. Opt. 54 , 8354–8362 (2015).

Wang, Q. W. et al. Design of an ultra-thin, wide-angle, stray-light-free near-eye display with a dual-layer geometrical waveguide. Opt. Express 28 , 35376–35394 (2020).

Frommer, A. Lumus: maximus: large FoV near to eye display for consumer AR glasses. In Proc. SPIE 11764, AVR21 Industry Talks II . Online Only (SPIE, 2021).

Ayres, M. R. et al. Skew mirrors, methods of use, and methods of manufacture. US Patent 10,180,520 (2019).

Utsugi, T. et al. Volume holographic waveguide using multiplex recording for head-mounted display. ITE Trans. Media Technol. Appl. 8 , 238–244 (2020).

Aieta, F. et al. Multiwavelength achromatic metasurfaces by dispersive phase compensation. Science 347 , 1342–1345 (2015).

Arbabi, E. et al. Controlling the sign of chromatic dispersion in diffractive optics with dielectric metasurfaces. Optica 4 , 625–632 (2017).

Download references

Acknowledgements

The authors are indebted to Goertek Electronics for the financial support and Guanjun Tan for helpful discussions.

Author information

Authors and affiliations.

College of Optics and Photonics, University of Central Florida, Orlando, FL, 32816, USA

Jianghao Xiong, En-Lin Hsiang, Ziqian He, Tao Zhan & Shin-Tson Wu

Contributions

J.X. conceived the idea and initiated the project. J.X. mainly wrote the manuscript and produced the figures. E.-L.H., Z.H., and T.Z. contributed to parts of the manuscript. S.W. supervised the project and edited the manuscript.

Corresponding author

Correspondence to Shin-Tson Wu .

Ethics declarations

Conflict of interest.

The authors declare no competing interests.

Supplementary information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Xiong, J., Hsiang, EL., He, Z. et al. Augmented reality and virtual reality displays: emerging technologies and future perspectives. Light Sci Appl 10 , 216 (2021). https://doi.org/10.1038/s41377-021-00658-8

Received : 06 June 2021

Revised : 26 September 2021

Accepted : 04 October 2021

Published : 25 October 2021

DOI : https://doi.org/10.1038/s41377-021-00658-8


Review Article

Social Interaction With Agents and Avatars in Immersive Virtual Environments: A Survey

www.frontiersin.org

  • GET Lab, Department of Multimedia and Graphic Arts, Cyprus University of Technology, Limassol, Cyprus

Immersive virtual reality technologies are used in a wide range of fields such as training, education, health, and research. Many of these applications include virtual humans that are classified into avatars and agents. An overview of the applications and the advantages of immersive virtual reality and virtual humans is presented in this survey, as well as the basic concepts and terminology. To be effective, many virtual reality applications require that the users perceive and react socially to the virtual humans in a realistic manner. Numerous studies show that people can react socially to virtual humans; however, this is not always the case. This survey provides an overview of the main findings regarding the factors affecting the social interaction with virtual humans within immersive virtual environments. Finally, this survey highlights the need for further research that can lead to a better understanding of human–virtual human interaction.

Introduction

Virtual reality (VR) technologies can simulate environments and situations in a realistic and believable manner, and they offer several further advantages that make them beneficial in various fields. As a result, over the past decade VR technologies have been used in a wide range of applications. For example, social VR applications ( McVeigh-Schultz et al., 2018 ) allow people to remotely meet, collaborate, and share ( Li et al., 2019 ). Many of the most widely used and promising VR applications are training simulations for pilots and drivers of various vehicles, for workers in dangerous jobs such as mining ( Bellanca et al., 2019 ), and for the military ( Koźlak et al., 2013 ). A key advantage of using VR in these applications is that it provides realistic training conditions in a controlled and, therefore, much safer environment while significantly reducing the cost and increasing the efficiency of the training. Conditions that cannot be controlled in the physical world, such as the time of day, or that are random, such as the weather, are fully controllable in a virtual world. Moreover, VR makes it possible to repeat scenarios and to better evaluate the learner’s performance. The introduction of VR in education can enhance learning outcomes ( Merchant, Goetz, Cifuentes, Keeney-Kennicutt and Davis, 2014 ). VR increases the learner’s motivation and involvement, and it allows students to experience, rather than just watch and listen, while promoting complex learning ( Villena Taranilla et al., 2019 ). It gives students an opportunity to explore objects or events that are otherwise inaccessible, such as the solar system, historical places, and events ( Villena Taranilla et al., 2019 ; Kyrlitsias et al., 2020 ) or the inside of the human body ( Parong and Mayer, 2018 ; Michael-Grigoriou, Yiannakou and Christofi, 2017 ). VR can also be beneficial for teacher training ( Stavroulia et al., 2019 ).
In the health field, immersive virtual reality technologies are used for education and training as well as in various kinds of therapy. The use of simulators in medical education protects patients while offering students a way to develop their skills, knowledge, and confidence and to have their performance evaluated ( Lateef, 2010 ; Pottle, 2019 ). Virtual reality therapies ( Wiederhold and Riva, 2019 ) are used to treat patients with various phobias such as fear of heights ( Rothbaum et al., 1995 ; Seinfeld et al., 2016 ), claustrophobia ( Christofi and Michael-Grigoriou, 2016 ; Rahani et al., 2018 ), and fear of public speaking ( Nazligul et al., 2017 ; Takac et al., 2019 ), as well as social anxiety ( Chesham et al., 2018 ), posttraumatic stress ( Botella, Serrano, Baños, and Garcia-Palacios, 2015 ), and depression ( Falconer et al., 2016 ).

The above are just a few examples of applications of VR technologies in various fields, through which we can distinguish the advantages of this technology. To summarize, VR technologies can provide affordable, realistic, controlled, safe, interactive, and accessible experiences to the user. Below, the basic concepts related to virtual humans (VHs) and VR are presented along with the relevant references. Then, the theory and the main factors that affect social interactions with VHs are presented. Finally, the authors summarize and discuss the topic, and suggest future research directions on social interactions with VHs. The references listed in this survey were selected by the authors to better illustrate the relevant literature. No systematic approach was followed for this survey.

Virtual Humans

Many of the applications described above require virtual representations of humans. The representations of humans in virtual environments are called VHs. We define a VH as a “perceivable digital representation” of a human ( Bailenson and Blascovich, 2004 ). VHs are classified into avatars and agents ( Bailenson and Blascovich, 2004 ; von der Pütten et al., 2010 ), depending on who directs their behavior. An avatar is a VH whose behaviors reflect those executed by a specific human being. An agent, on the other hand, is a VH whose behaviors are determined by a computer algorithm. However, since today’s technology is unable to reflect all human actions on avatars, the distinction between an agent and an avatar is not always clear ( Bailenson and Blascovich, 2004 ). Forms of communication (e.g., facial expressions, gaze behavior, tone of voice, or body language) that are not tracked by the system, and therefore cannot be attributed to the avatar, are either omitted or rendered onto the VH by other means. As a result, a VH usually constitutes a hybrid of an agent and an avatar. However, recent technological advances such as real-time body and facial expression tracking can provide affordable solutions so that the behavioral resemblance between the user and the avatar can be extremely accurate. In the future, we expect to have photorealistic avatars whose voices, movements, facial expressions, and gaze are determined completely by the user in real time. Even then, hybrid agent-avatars can be used to combine the advantages of both agent and avatar technologies ( Roth, Latoschik, Vogeley and Bente, 2015 ). Additionally, unlike the physical world where there are clear boundaries between humans and nonhumans, there are not necessarily any visible differences between human-controlled and computer-controlled VHs ( Nowak and Fox, 2018 ). 
It is up to the developer of the VR application to conceal or inform (or even mislead) the user whether a VH is an avatar or an agent. Therefore, in a shared virtual environment, the user may not know which of the VHs are agents and which are avatars.

In immersive virtual environments (IVEs), an avatar is the (usually visual) representation of the user in a virtual world. An avatar is perceivable by the user and/or, in the case of multiuser virtual environments ( Nowak and Fox, 2018 ) such as social VR applications ( Gunkel et al., 2018 ; McVeigh-Schultz et al., 2018 ), by the other users. For self-representation, users can observe their avatar from either a first-person or a third-person perspective ( Gorisse, Christmann, Amato and Richir, 2017 ), whereas in some cases the use of avatars is implied or omitted. In projection-based VR systems (e.g., Cruz-Neira et al., 1993 ; Roth, Waldow, Latoschik, Fuhrmann and Bente, 2017 ), no avatars are required for self-representation since the users can observe their physical body. In head-mounted display (HMD)-based VR settings, users are unable to see their physical body. In these cases, an avatar can be used to provide the users with a virtual body, usually with a first-person perspective. The degree to which the users can control their avatars varies, depending on the capabilities of the VR system. Under some conditions ( Kilteni et al., 2012 ), the user can develop a sense of ownership over the virtual body, which is called the sense of embodiment. Studies ( Slater and Sanchez-Vives, 2014 ) showed that people tend to alter their attitudes and behaviors to match the expectations that are implied by the attributes of their virtual body. This phenomenon is known as the Proteus effect ( Yee et al., 2009 ).

With the constant advancement of technology in the fields of computer graphics, machine learning, and artificial intelligence ( Petrović, 2018 ), virtual agents are becoming more and more realistic in both appearance and behavior. At the same time, the opportunities and the efficiency of their use increase.

In VR entertainment applications, such as videogames, VHs that are used as actors in the game environment are referred to as non-player characters (NPCs). They act in the game as characters that are hostile, friendly, or neutral toward the player. Their behavior is mostly scripted and limited to the level needed to support their role in the game. However, there are examples of NPCs that are able to interact in more complex ways with the player ( Takahashi et al., 2018 ), such as expressing emotions ( Li and Campbell, 2010 ), making decisions autonomously ( Xi and Smith, 2016 ), and acting independently. NPCs are a crucial part of a VR game and can drastically impact the user’s gaming experience ( Petrović, 2018 ).

In VR, agents can play the role of the audience in applications for practicing presentation skills and overcoming public-speaking anxiety. Individuals can practice their presentations or speeches in an immersive virtual environment that reproduces real-life conditions. Studies ( Nazligul et al., 2017 ; Takac et al., 2019 ) have shown these applications to be beneficial in treating social anxiety disorders. Also, the size and the behavior of an audience consisting of agents are highly flexible and customizable, allowing the challenge level to be graded using different scenarios ( Botella, Garcia-Palacios, Baños and Quero, 2009 ). In the same way, agents are used in the treatment of various types of phobias using VR. Virtual agents that, through the use of artificial intelligence, are capable of engaging in humanlike conversations are referred to as conversational agents ( Yildirim, 2021 ). In some examples, agents help, guide, encourage, and motivate the patient, replacing the human therapist ( Bălan et al., 2020 ); in others, they replace patients in training scenarios for doctors and therapists ( Lok et al., 2006 ; Rizzo and Talbot, 2016 ) or motivate other patients ( Najm et al., 2020 ). Agents are used as healthcare assistants ( Kim et al., 2019 ) to support registered healthcare professionals in conducting clinical tasks and providing care to the patients. Also, a study ( Lucas, Gratch, King and Morency, 2014 ) showed that VH-interviewers can increase willingness to disclose and elicit more honest responses in a clinical interview context. In educational VR applications, agents have a crucial role, either as teachers or students. 
Studies showed that using pedagogical agents ( Johnson and Lester, 2018 ; Makransky et al., 2019 ) can improve students’ learning experience in an educational VR environment, enhance their engagement, and improve their knowledge construction and performance ( Grivokostopoulou et al., 2020 ). Also, agents can play the role of students in teacher training scenarios ( Stavroulia et al., 2019 ).

These are just a few examples of how virtual agents can be beneficial across a very wide range of applications. They can be used in combination with other technologies to replace humans in social tasks efficiently. To summarize, virtual agents have the advantages of being always available, even as multiple instances at the same moment; affordable; fully customizable and flexible, in both appearance and behavior; and fully controllable.

Hybrid Agents-Avatars

While computer-controlled behavior is usually added to avatars to cover the inability of the technology to mirror the user’s behavior ( Bailenson and Blascovich, 2004 ), hybrid agent-avatars can also be used to modify or enhance avatar-mediated communication in shared VEs ( Roth et al., 2015 ). For example, a study ( Beall, Bailenson, Loomis, Blascovich and Rex, 2003 ) showed that an avatar can be made to maintain eye contact with more than one interactant at a time. A study by Oh, Bailenson, Krämer, and Li (2016) showed that enhancing the smile tracked from the participant led to more positive communication outcomes. In another study (Roth, Mal, Purps, Kullmann and Latoschik, 2018), mimicry behavior was injected into an avatar-mediated interaction to enhance the interpersonal understanding and rapport between the interactants. Roth, Kullmann, Bente, Gall, and Latoschik (2018) altered the avatar’s tracked gaze direction on selected occasions to induce a listening focus toward the other user.

The Use of Immersive Virtual Reality and Virtual Humans for Research

The preceding sections described the benefits and possibilities that immersive virtual reality (IVR) technologies offer and the solutions they provide in a wide range of fields. Beyond these, researchers realized early on that IVR can be very useful as a research tool ( Blascovich et al., 2002 ; Tarr and Warren, 2002 ; Foreman, 2009 ). Over the last 2 decades, IVR technologies have been used for the study of human behavior and cognition in the fields of psychology ( Wilson and Soranzo, 2015 ; Pan and Hamilton, 2018 ) and neuroscience ( Bohil et al., 2011 ; Parsons et al., 2017 ; Bell, Nicholas, Alvarez-Jimenez, Thompson and Valmaggia, 2020 ).

Additionally, studies have replicated classic social experiments using IVR and VHs, demonstrating social effects such as obedience ( Slater et al., 2006 ; Neyret et al., 2020 ), conformity ( Kyrlitsias and Michael-Grigoriou, 2018 ; Kyrlitsias et al., 2020 ), and social facilitation/inhibition ( Hoyt et al., 2003 ).

IVR technologies not only offer researchers solutions to several methodological problems but also open research possibilities that did not exist in the past.

With IVR technologies, researchers can create realistic and complex environments that accurately simulate the experimental scenario and therefore achieve high mundane realism (the degree to which the materials and procedures involved in an experiment are similar to events that occur in the real world; Kelly, 2007 ). At the same time, IVR can induce in the participant the illusion of presence and elicit realistic (similar to real-life) reactions ( Slater, 2009 ), achieving high experimental realism (the extent to which situations created in experiments are real and impactful to participants; Kosloff, 2007 ). This also applies to experiments that include social interactions: through social presence, subjective feelings and behavioral and physiological reactions during human–VH interactions can be very similar to those shown during human–human interactions ( Bombari, Schmid Mast, Canadas and Bachmann, 2015 ).

Consequently, VR offers the possibility to conduct experiments with high ecological validity (“the extent to which research findings would generalize to settings typical of everyday life”; Baumeister and Vohs, 2007 , p. 276), something that in the past was very difficult to achieve and required substantial resources. For example, in experiments studying social influence, actors trained to maintain the same verbal and nonverbal behavior across sessions were used as confederates ( Asch, 1956 ; Milgram, 1963 ). Such solutions not only make experimental scenarios more expensive but are also difficult to implement and can often reduce the level of experimental control. This tradeoff between ecological validity and experimental control is one of the main methodological problems for researchers ( Blascovich et al., 2002 ; Kothgassner and Felnhofer, 2020 ). VR technologies can provide a high level of ecological validity, as they can generate stimuli that approximate the complexity of a real-life situation, while allowing the investigator near-perfect experimental control ( Bombari et al., 2015 ; Parsons, 2015 ). The high level of experimental control and the flexibility offered to the experimenter by VR technologies “enables the researcher to selectively manipulate variables that in naturalistic situations cannot be independently investigated” ( Parsons, 2015 , p. 7).

In addition, using VR makes replication of studies easier. According to Blascovich et al. (2002) , in domains such as social neuroscience and psychology, one of the reasons for the lack of replications is the difficulty for a researcher to implement and use the exact methods and procedures of other investigators. VR technologies, however, enable researchers to conduct perfect (or at least near-perfect) replications ( Bombari et al., 2015 ).

Finally, using VR, researchers can conduct experiments with scenarios that are impossible (e.g., Friedman et al., 2014 ) or unethical (e.g., Gonzalez-Franco et al., 2018 ; Neyret et al., 2020 ) to test in real life. This is possible because participants react to virtual characters and events as if they were real while remaining aware that their actions carry no real danger or consequences ( Pan and Hamilton, 2018 ). For example, perception and behavior in dangerous or threatening situations can be studied without participants being exposed to real danger ( Kinateder et al., 2015 ; McCall, Hildebrandt, Bornemann and Singer, 2015 ). Even though the main effort in research and development focuses on the best possible simulation of the real world, VR can go beyond the limits of physical reality ( Slater and Sanchez-Vives, 2016 ). Rules that exist in the “real” world do not necessarily exist in a virtual world. Physical laws, the continuity of time ( Friedman et al., 2014 ), and the characteristics and limits of the human body ( Slater and Sanchez-Vives, 2014 ) can be manipulated by the researcher, creating new research opportunities. For example, in one study ( Friedman et al., 2014 ), the participants were given the illusion of traveling back in time, with the ability to prevent a tragic event at which they had been present.

Using VR, researchers are able to dramatically alter the participants’ self-representation by inducing in them a sense of embodiment toward a virtual body with different characteristics. This ability created a wide range of opportunities for investigating the impact of self-representation on the individual’s attitudes and behaviors ( Maister, Slater, Sanchez-Vives and Tsakiris, 2015 ). Even if in experiments with such manipulations the ecological validity is typically low, researchers can investigate the interaction with different variables and expand the theoretical understanding of human cognition and behavior ( Bombari et al., 2015 ). A study by Kilteni et al. (2013) showed that participants embodied in a dark-skinned, casually dressed, virtual body expressed significantly greater body movement in a task that required playing drums than participants embodied in a light-skinned, formally dressed, body. This result was attributed to the stereotype that a dark-skinned, casually dressed, body is expected to be more bodily expressive. Other studies ( Maister, Sebanz, Knoblich and Tsakiris, 2013 ; Peck, Seinfeld, Aglioti and Slater, 2013 ) showed that embodiment in a dark-skinned body resulted in a reduction of the implicit racial bias toward dark-skinned people. Also, a study found that the impact on implicit racial bias remained even a week after the participants’ embodiment experience ( Banakou et al., 2016 ).

To summarize, VR technologies have become a powerful tool for researchers studying human behavior. They provide a series of advantages, such as realistic and complex experimental scenarios with almost perfect experimental control of the environment and the VHs, allowing researchers to overcome methodological problems. Additionally, they create new research opportunities for testing scenarios that are difficult or even impossible to conduct in real-life settings.

Immersive Virtual Environments and Virtual Human Technologies

With VR we refer to the creation of simulated environments (i.e., IVEs) with the use of computer technology, software, and hardware. In contrast to traditional interfaces, VR not only displays the created environments to the users but also gives them the feeling that they are “inside” the environment. This is achieved by “careful integration of hardware and software systems, including multimedia development software, databases, computers, rendering engines, and user interfaces” ( Blascovich et al., 2002 , p. 107). Today, typical VR systems provide stereoscopic vision that is updated as a function of the user’s head-tracking and directional audio ( Slater and Sanchez-Vives, 2016 ). It is also common for the VR systems to provide additional tracking technology (apart from the head) for the user’s hands or even for the full body. An article by Slater and Sanchez-Vives (2016) presents an overview of the basic concepts and the technology of VR systems.
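As a rough illustration of how a stereoscopic view can follow head tracking, the sketch below derives left and right eye positions from a tracked head position and yaw angle. It is a minimal sketch under stated assumptions, not a description of any particular VR system: the function name `eye_positions`, the y-up coordinate frame, the 0.064 m default interpupillary distance, and the omission of head pitch and roll are all choices made here for illustration.

```python
import math

def eye_positions(head_pos, yaw_rad, ipd_m=0.064):
    """Return (left_eye, right_eye) world positions for a tracked head.

    Assumptions (not from the article): a right-handed, y-up frame in which
    yaw 0 faces -z, yaw is a rotation about +y, and pitch/roll are ignored.
    ipd_m is an assumed interpupillary distance in meters.
    """
    # Unit vector pointing to the user's right after applying the yaw rotation.
    right_axis = (math.cos(yaw_rad), 0.0, -math.sin(yaw_rad))
    half = ipd_m / 2.0
    left_eye = tuple(h - half * r for h, r in zip(head_pos, right_axis))
    right_eye = tuple(h + half * r for h, r in zip(head_pos, right_axis))
    return left_eye, right_eye

# Each frame, the renderer would draw the scene once from each eye position.
left, right = eye_positions((0.0, 1.7, 0.0), yaw_rad=0.0)
print(left, right)
```

Rendering the scene once per eye from these two viewpoints, and recomputing them every frame from the latest tracking sample, is what keeps the stereo image consistent with the user's head movement.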

The applications described above are feasible due to the huge technological advances that have taken place in the last 2 decades. Nevertheless, current technology has several limitations and disadvantages. The ideal virtual reality, in which the experience offered is comparable to that of the real world, is therefore still far beyond today’s capabilities. Beyond that, the availability, the cost, and the physical and technical drawbacks of the current technology impose additional limitations and tradeoffs on the quality of a virtual reality experience.

For example, the visual fidelity and rendering quality of virtual environments (and VHs) are limited by the computational capability of the computer. Techniques used to optimize performance usually sacrifice realism, such as precomputed illumination (i.e., lighting, shadows, and reflections) in place of real-time illumination that changes dynamically. Another limitation is that the display quality (resolution and refresh rate) of current VR systems is quite limited even in the most sophisticated VR devices (i.e., HMDs), with individual pixels still visible and distracting. Despite continuous advances in computing power, graphic representation, and display quality, the visual quality in VR is still far from perfect.

Even more challenging than displaying visually plausible virtual environments and humans is displaying environments and humans that behave and interact with the user in a realistic way. How users in VR can interact with virtual objects is an ongoing challenge. Designing VHs (i.e., agents) that behave and interact with the user in a realistic way is an even bigger challenge due to the complexity of human behavior. Using the current technology, as described in the previous section, agents can interact with the user, hold a verbal conversation, or show nonverbal responses such as facial expressions and gestures. However, in these examples, each aspect of the agents’ behavior and intelligence is limited to the functions implemented by the creators, usually to support the purpose of the application.

Regarding avatars, the accurate reproduction of human actions such as body movements, facial expressions, and eye movements on the virtual body is important for inducing the sense of body ownership, as well as for communication with other users in shared immersive environments. Many methods are available for transferring the users’ body movements to the avatar. These methods use different technologies and vary in accuracy, cost, and convenience of use. Advanced motion tracking systems used for full-body tracking reproduce the users’ body movement on the avatar very accurately and with low latency; however, these systems are very expensive, require the users to wear a suit of trackers, and need time for calibration. For finger tracking, additional gloves are required. Head position and orientation are typically tracked by the HMD. Commercial VR systems typically include tracked controllers that, in combination with an inverse kinematics technique, can be used to approximate the pose of the arms. Similarly, using 2, 3, or more additional trackers (typically for the feet and the waist), a full-body motion approximation can be achieved. This method is less accurate than the advanced motion tracking systems; however, it is significantly more affordable and easier to set up and use. Alternatively, instead of using additional trackers for the legs, prerecorded walking animations are used for the leg movement. Another method of tracking the users’ body is using depth camera devices. This method has the advantage of not requiring the user to wear or hold any equipment; however, the tracking quality is limited. Additionally, HMDs with a built-in eye tracker, as well as facial trackers, are commercially available; these can be used to track the user’s eye movements and facial expressions (including lip motion while talking), respectively, and render them on the avatar.
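To make the inverse kinematics step more concrete, the following sketch solves a planar two-bone arm (upper arm and forearm) analytically with the law of cosines, given a shoulder position and a tracked hand position. It is only an illustration under simplifying assumptions: real avatar systems work in 3D and must also choose an elbow direction, and the names `solve_arm_ik` and `forward_arm`, the segment lengths, and the 2D setup are hypothetical choices made here, not a method described in the survey.

```python
import math

def solve_arm_ik(shoulder, hand, upper_len, fore_len):
    """Planar two-bone IK. Returns (shoulder_angle, elbow_bend) in radians.

    shoulder_angle: direction of the upper arm measured from the +x axis.
    elbow_bend: rotation of the forearm relative to the upper arm (0 = straight).
    Targets outside the reachable range are clamped to the nearest reachable distance.
    """
    dx, dy = hand[0] - shoulder[0], hand[1] - shoulder[1]
    dist = math.hypot(dx, dy)
    # Clamp the target distance to what the two segments can reach.
    d = min(max(dist, abs(upper_len - fore_len), 1e-9), upper_len + fore_len)
    # Law of cosines: interior angle at the elbow.
    cos_interior = (upper_len**2 + fore_len**2 - d**2) / (2 * upper_len * fore_len)
    elbow_bend = math.pi - math.acos(max(-1.0, min(1.0, cos_interior)))
    # Law of cosines: angle between the shoulder-to-hand line and the upper arm.
    cos_alpha = (upper_len**2 + d**2 - fore_len**2) / (2 * upper_len * d)
    shoulder_angle = math.atan2(dy, dx) - math.acos(max(-1.0, min(1.0, cos_alpha)))
    return shoulder_angle, elbow_bend

def forward_arm(shoulder, shoulder_angle, elbow_bend, upper_len, fore_len):
    """Forward kinematics: return (elbow, hand) positions for the given angles."""
    elbow = (shoulder[0] + upper_len * math.cos(shoulder_angle),
             shoulder[1] + upper_len * math.sin(shoulder_angle))
    total = shoulder_angle + elbow_bend
    hand = (elbow[0] + fore_len * math.cos(total),
            elbow[1] + fore_len * math.sin(total))
    return elbow, hand

# Example: place the avatar's hand at the tracked controller position.
angles = solve_arm_ik((0.0, 0.0), (0.3, 0.2), upper_len=0.30, fore_len=0.25)
elbow, hand = forward_arm((0.0, 0.0), *angles, upper_len=0.30, fore_len=0.25)
print(elbow, hand)
```

Because only the hand is tracked, the elbow position is an estimate: the solver picks one of the two mirror-image elbow placements, which is one reason controller-based arm poses are approximations rather than exact reproductions.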

Besides visual information, additional senses such as touch, smell, temperature, and even taste (Rubio-Tamayo, Gertrudix Barrio and García García, 2017) can carry meaningful information in face-to-face interactions and are therefore relevant for creating realistic VR experiences. Additionally, a crucial aspect of inducing a sense of embodiment over a virtual body (i.e., avatar) is creating the illusion that the virtual body is the source of the experienced sensations (Kilteni et al., 2012), usually achieved using synchronous visuotactile or visuoproprioceptive stimulation. Successful embodiment can have an impact on social interactions with VHs (Ratan, Beyea, Li and Graciano, 2020). Researchers have used several tricks to simulate the sense of touch for participants, such as the experimenter touching the participant with a wand (Slater et al., 2009). Today a wide range of devices is commercially available (Perret and Vander Poorten, 2018), mainly haptic gloves, with different approaches and functions. However, providing realistic haptic feedback with easy-to-use equipment remains a challenge.

Other long-standing limitations and problems associated with VR technologies remain challenging and need to be addressed in the future. One of them is the physical discomfort or cybersickness (Davis et al., 2014) that may result from the use of HMDs and can have a negative impact on the user’s experience in the VE (Weech et al., 2019) and, therefore, on the social interactions taking place in it. One way of dealing with cybersickness is to improve the hardware by increasing the refresh rate, improving head-tracking quality, and reducing tracking and display delay (Chang et al., 2020). Cybersickness is also attributed to the content of the VR application. For that reason, it is very important that VR applications avoid content that promotes cybersickness and include techniques that are proven to reduce it.

Another inherent problem of VR is locomotion within the virtual environment (Cherni et al., 2020). Even with current HMDs that include positional tracking and allow the user to walk physically, the walking area is restricted to the physical space. Another method of navigating the virtual environment is using a joystick; however, this method is associated with cybersickness (Saredakis et al., 2020). For that reason, teleporting has become a popular way of navigating in VR. A newer means of locomotion in VR is the omnidirectional treadmill, which allows the user to navigate with seminatural movements while staying in place. However, these devices are still expensive and not easy to use (Christofi et al., 2020).

Virtual Reality Concepts

The ability of the system to provide the user with an illusion of reality is called immersion and is defined as “the extent to which the computer displays are capable of delivering an inclusive, extensive, surrounding and vivid illusion of reality to the senses of a human participant” ( Slater and Wilbur, 1997 , p. 3). Consequently, immersion can be objectively assessed, based on technical parameters used to describe a system.

As mentioned above, VR systems are designed not only to display the virtual environment to users but also to induce the feeling that they are “inside” the environment, and that is what makes VR special. However, the term VR is sometimes used to describe systems that do not have the technical capability to induce in the user the sense of being inside the displayed virtual environment. The terms non-immersive VR and desktop VR are also used to describe these systems. In this article, the term VR is used to describe immersive VR systems.

The use of VR technologies in a wide range of fields and the use of VHs in many of these applications were discussed in the previous section. A crucial factor for the effectiveness of many of these applications is that the user perceives and responds to the events and situations taking place in the virtual environment as if they were real. Empirical studies have explored factors that contribute to realistic behavior in immersive virtual environments, while various theories have attempted to explain this phenomenon. Most of these theories are based on the concept of presence, the sense of “being” in the virtual environment, also referred to as telepresence or place illusion (Ijsselsteijn and Riva, 2003; Sanchez-Vives and Slater, 2005; Slater, 2009). Slater (2009) defines presence as “the strong illusion of being in a place in spite of the sure knowledge that you are not there” (p. 3551).

Although it is strongly related to immersion ( Slater, 2003 ), presence is a subjective perception determined by how the person perceives and interprets stimuli, defined by characteristics of the VR system and the level of immersion ( Ijsselsteijn and Riva, 2003 ).

Presence has been the main focus of both applied and academic work on VR as it is associated with the effectiveness of a VR experience. The greater the sense of the user’s presence in the virtual environment, the more realistic (similar to the real world) their reactions and behaviors are and, in turn, the more successful the VR application is ( Cummings and Bailenson, 2016 ).

Social Presence

As described above, VR is capable of inducing in users a sense of presence, which is the feeling of being in the virtual environment. The greater the users’ sense of presence in the virtual world, the more realistic (similar to the real world) their reactions and behaviors are. However, the sense of “being there” is not enough for a realistic perception of and reaction toward VHs (Lee, Jung, Kim and Kim, 2006). In virtual environments where the user coexists with VHs, it is important that the user perceives the presence of the VH not only physically but also socially. Social presence (also referred to as co-presence) refers to the extent to which the user actively perceives a VH in a virtual environment and at the same time has the sense that the “other” perceives the presence of the user (Biocca, 1997; Oh et al., 2018). While presence describes the illusion of “being” in a virtual space that may include VHs, social presence refers to the experience of “being together” with a sentient social being, either an agent or an avatar (Biocca et al., 2003).

Social presence is important because of its impact on social influence (Blascovich, 2002), and it is associated with a variety of positive communication outcomes (Oh et al., 2018). For example, the results of one study (Thellman, Silvervarg, Gulz and Ziemke, 2016) demonstrated the effect of social presence on social influence by VHs. Specifically, participants who reported a stronger social presence were more inclined to accept the VH’s offer in an ultimatum game. The impact of social presence on social influence is also demonstrated by other studies (e.g., Hoyt et al., 2003; Strojny, Dużmańska-Misiarczyk, Lipp and Strojny, 2020). Consequently, the greater the users’ sense of social presence with a VH, the more realistic (similar to human–human, face-to-face interaction) their social reactions are. This makes social presence a vital component of the realism and effectiveness of social interactions between the user and VHs in VR environments. Also, studies (Schroeder et al., 2001; Heldal et al., 2005; Guimarães et al., 2020) showed that participants’ sense of social presence with VHs was higher for immersive VR than for a non-immersive platform. This finding indicates the advantage of VR over non-immersive technologies in simulating social interactions with VHs.

Social Interaction With Virtual Humans

Numerous studies show that people react socially to VHs. When an individual interacts with an avatar (or believes that it is an avatar), social responses are expected because such an interaction is perceived as a human–human interaction mediated by technology (Nowak and Fox, 2018). But why do individuals respond socially even if they know (or believe) that they are interacting with an agent directed by a computer? Several theories attempt to explain social effects in interactions with computers. Earlier theories suggested that individuals react socially to computers temporarily because of the novelty of the situation (Kiesler and Sproull, 1997) or because of human deficits such as ignorance (Barley, 1988). Another approach suggests that social reactions are oriented toward the programmer rather than the computer itself (Dennett, 1987). However, these theories have not been adopted and have become obsolete. The prevailing theory (Nass et al., 1994; Nass and Moon, 2000), known as the computers are social actors (CASA) paradigm, holds that social responses to computers result neither from the users’ belief that they are interacting with the programmer nor from ignorance. Instead, the CASA paradigm argues that people unconsciously react to computers in the same way as they do toward humans. This can be attributed to the fact that the human brain has developed to respond automatically to social cues in order to deal successfully with daily life (Reeves and Nass, 1996, p. 97).

Evaluating Social Interactions With Virtual Humans

Several methods are used in the literature to evaluate the quality of interactions with VHs. A common method is the use of subjective measures, specifically self-reported questionnaires with which participants evaluate their experience after exposure to an IVE. These questionnaires use scales such as social presence (Biocca et al., 2003; Bailenson, Blascovich, Beall and Loomis, 2003), self-reported copresence and perceived other’s copresence (Nowak and Biocca, 2003), Quality of Interaction and Social Meaning (Li et al., 2019), and other positive communication outcomes such as likability and credibility (Guadagno, Blascovich, Bailenson and McCall, 2007).

Objective behavioral measures are also used in evaluating social interactions with VHs. VR technologies make it very convenient to record several aspects of participants’ behavior, for example, through the built-in motion trackers and eye trackers of the HMD and by recording the participants’ actions and navigation within the virtual environment. Measures such as participants’ gaze behavior (Roth et al., 2018; Kyrlitsias et al., 2020), interpersonal distance (Bailenson, Blascovich, Beall and Loomis, 2003; Roth et al., 2018), verbal behavior (von der Pütten et al., 2010; Oh et al., 2016), social influence (Kyrlitsias et al., 2020; Neyret et al., 2020; Dzardanova, Kasapakis, Gavalas and Sylaiou, 2021), mimicry (Hasler et al., 2017), persuasion (Guadagno et al., 2007), and others are used as indicators of the effectiveness and quality of social interactions.
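
To make the behavioral measures above more concrete, the following Python sketch derives two of them, interpersonal distance and the proportion of time the user's head is oriented toward a VH, from logged HMD samples. This is only a sketch under assumptions: the sample format (a dictionary with `pos` and `forward` entries), the y-up coordinate convention, the function names, and the 10-degree gaze cone are all hypothetical choices made for illustration and do not reproduce the procedure of any cited study.

```python
import math


def interpersonal_distance(user_pos, vh_pos):
    """Floor-plane distance between user and VH; positions are (x, y, z), y up."""
    return math.hypot(user_pos[0] - vh_pos[0], user_pos[2] - vh_pos[2])


def gaze_angle_deg(user_pos, forward, target_pos):
    """Angle in degrees between the head's forward vector and the target direction."""
    to_target = [t - u for t, u in zip(target_pos, user_pos)]
    norm_t = math.sqrt(sum(c * c for c in to_target))
    norm_f = math.sqrt(sum(c * c for c in forward))
    if norm_t == 0.0 or norm_f == 0.0:
        return 0.0
    cos_a = sum(f * t for f, t in zip(forward, to_target)) / (norm_f * norm_t)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))


def summarize_session(samples, vh_head_pos, cone_deg=10.0):
    """Aggregate per-frame samples into session-level behavioral indicators.

    cone_deg is an arbitrary illustrative threshold, not an established value.
    """
    distances = [interpersonal_distance(s["pos"], vh_head_pos) for s in samples]
    on_target = [
        gaze_angle_deg(s["pos"], s["forward"], vh_head_pos) <= cone_deg
        for s in samples
    ]
    return {
        "min_distance": min(distances),
        "mean_distance": sum(distances) / len(distances),
        "gaze_proportion": sum(on_target) / len(on_target),
    }
```

With an eye tracker, the same `gaze_angle_deg` computation could be applied to the tracked gaze direction instead of the head's forward vector, which would presumably give a finer-grained estimate of where the participant is looking.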

Physiological measures are also used to evaluate social interactions with VHs: heart rate ( Garau, Slater, Pertaub and Razzaque, 2005 ; Lee, Kolkmeier, Heylen and Ijsselsteijn, 2021 ), electrodermal activity ( Garau et al., 2005 ; Neyret et al., 2020 ), and electroencephalography ( Neyret et al., 2020 ).

Also, to avoid possible biases from confounding variables such as personality traits and simulation sickness, they are measured and used as control variables (e.g., Roth et al., 2018).

Factors Affecting Social Interaction With Virtual Humans

The benefits of employing VHs in a wide range of applications were reviewed in a previous section. The effectiveness of these applications usually requires that the user perceive and interact with VHs as if they were real humans. For that reason, the factors that enhance social presence and increase social influence with VHs have attracted great interest among researchers. This section presents an overview of the main findings regarding the factors that affect social interaction with VHs.

Representation of the Virtual Humans

The way that VHs look and behave varies between different VR applications. These variations are not only due to the different capabilities of the VR systems regarding graphical quality and the interactivity, and the effort and the skill of the creators of the VR applications to provide convincing VHs but also due to the nature and purpose of the VR application. This results in VHs with different levels of realism. Several studies were conducted to investigate the impact of the VHs’ visual and behavioral realism on social interactions.

Visual Realism

While studies showed that the presence of a VH’s visual representation leads to a higher level of social presence compared to the absence of any visual representation (e.g., voice only), the effect of VHs’ visual (photographic and anthropomorphic) realism is not consistent ( Oh et al., 2018 ). For example, a recent study ( Zibrek et al., 2019 ) investigated the level of a VH’s visual realism using three render styles: realistic, simple, and sketch styles. The results showed that the level of a VH’s visual realism did not have an impact on the participants’ sense of the social presence of the VH. The impact of visual realism on the participants’ emotional response was attributed to the fact that realistic rendering of the VH’s facial expressions was more perceivable than the less realistic rendering, which is not directly associated with the level of realism.

Behavioral Realism

In contrast with visual realism, the VH’s behavioral realism constitutes an important factor for social interactions and a powerful predictor of social presence (Oh et al., 2018). Behavioral realism refers to the extent to which a VH behaves in the way an actual person would behave. Several studies showed that increasing the VH’s behavioral realism leads to a stronger sense of social presence, especially when the VH’s behavior indicates awareness of the user’s presence (e.g., mutual gaze) and provides interactivity. The interactivity of a VH’s behavior is an important factor for creating social presence (Oh et al., 2018), as it gives the impression that the VH is aware of the user’s presence and actions. For example, a study (von der Pütten et al., 2010) showed that participants felt higher levels of social presence and mutual awareness, and talked more, when the VH showed feedback behavior (head nodding) than when it did not. Another study (Guadagno et al., 2007) showed that VHs with more realistic gaze behavior led to a higher sense of social presence. Additionally, male participants reported more attitude change after interacting with male-like VHs with higher behavioral realism than with male-like VHs with lower behavioral realism. Another study (Pan, Gillies and Slater, 2008) focused on the effects of a VH’s blushing during an embarrassing situation on participants’ reactions. Specifically, the effects of no blushing, cheek blushing, and whole-face blushing were compared. The results showed that the VH’s whole-face blushing improved participants’ degree of social presence, while participants in the cheek blushing condition tended to withdraw earlier from the VH’s presentation. A study by Roth et al. (2016) showed no difference in effectiveness in a verbal negotiation task between participants embodied in abstract avatars without gaze behavior and facial expressions in VR and participants in a physical-world setting. This result suggests that the absence of behavioral cues can partly be compensated.

Enhancing the VH’s behavioral realism implies additional social channels (e.g., the inclusion of facial expressions or gaze behavior) that better simulate face-to-face interactions. A study by Roth et al. (2018) investigated the impact of nonrealistic (in the sense of not simulating face-to-face interactions) social cues (i.e., social augmentations), by visualizing eye contact with floating bubbles, joint attention with particles, and grouping by matching the color of the abstract box-shaped avatars. The results showed that the augmentations had a positive impact on participants’ sense of social presence as well as an influence on their behavior. This result suggests that increasing social cues is important for social interactions with VHs, regardless of whether these cues replicate face-to-face interactions. It also reveals the potential of VR to enhance social interactions with additional social channels.

The Uncanny Valley

Additionally, the uncanny valley theory (Mori et al., 2012), which initially referred to humanoid robots but also applies to VHs, suggests that the relation between a VH’s realism and the perceiver’s affinity for it is not linear. Instead, as VHs appear more human-like, they become more appealing up to a certain point. When a VH looks and moves in an almost, but not fully, human-like way, it is perceived as creepy and unsettling. Only when the realism of a VH is fully convincing will it elicit positive responses. Consequently, this effect can have a negative impact on social interactions with VHs (Nowak and Fox, 2018). The results of a study (Groom et al., 2009) support the uncanny valley theory, as the VH received lower evaluations from the participants when exhibiting more realistic behavior (i.e., lip sync and body movement). The persuasiveness of the VH, however, was not affected by the level of realism.

Self-Representation

Studies showed that the appearance of the user’s avatar (i.e., self-representation in the virtual environment) may have an impact on social interactions with VHs (Ratan, Beyea, Li and Graciano, 2020). This effect is related to the sense of embodiment inside the virtual body (Kilteni et al., 2012) and to the tendency of users to alter their attitudes and behaviors to match the expectations implied by the attributes of their virtual body, named the Proteus effect (Yee et al., 2009; Slater and Sanchez-Vives, 2014). For example, a study by Yee and Bailenson (2007) showed that participants embodied in taller avatars were more confident in a negotiation task (the ultimatum game; Forsythe, Horowitz, Savin and Sefton, 1994) with an agent confederate.

Agency

Agency is the extent to which the user believes that a VH is controlled by another user (avatar) rather than by a computer through an algorithm (agent). Blascovich (2002) defines agency as “the extent to which individuals perceive virtual others as representations of real persons” (p. 130). When the user has the impression that a VH is controlled by another user, the level of agency is high. Conversely, when the user believes that a VH is controlled by the computer, the level of agency is considered to be low. It is important to state that the level of agency describes the user’s perception of the VH as an agent or an avatar, rather than the VH’s actual state (Fox et al., 2015). Additionally, agency is a continuum, as individuals may perceive a VH to be controlled partly by a human and partly by the computer (Blascovich, 2002). Note that the term agency is also used to describe the feeling of controlling one’s own (virtual) body (Tsakiris et al., 2006), and the two definitions should not be confused.

The impact of agency on social interactions with VHs is not clear in the literature. According to the CASA theory, the responses to computers that exhibit human characteristics are mindless and automatic ( Reeves and Nass, 1996 ; Nass and Moon, 2000 ), and therefore, people will respond socially to VHs regardless of the level of agency. On the contrary, the Threshold Model of Social Influence ( Blascovich, 2002 ; Blascovich et al., 2002 ) argues that agency, along with behavioral realism, is a major factor that affects social presence.

According to the Threshold Model of Social Influence, an increase in agency and/or behavioral realism leads to an increase in social presence. When social presence reaches a threshold value, social influence begins to operate. Specifically, when the user believes that the VH is controlled by the computer (low agency), the VH must behave very realistically for the social influence threshold to be met and social influence to occur. If the individual believes that the VH represents a real person (high agency), then behavioral realism does not need to be high to cause a social reaction. According to the authors, the location of the social influence threshold varies as a function of two moderating factors: interpersonal self-relevance and the response system. Interpersonal self-relevance is the importance of the interaction to the individual’s sense of self. In a social interaction that requires a discussion of one’s beliefs and attitudes (e.g., participating in a job interview), the interpersonal self-relevance is expected to be high. In social interactions that do not involve central or core aspects of an individual (e.g., making a small withdrawal from a bank), the interpersonal self-relevance is expected to be low. According to the model, when self-relevance is low, the threshold’s slope is shallow, which means that lower behavioral realism is required for social influence to occur. In contrast, in high self-relevance interactions, the slope is steep, and therefore, higher behavioral realism is required for the threshold to be crossed and social influence to occur. The second factor that moderates the social influence threshold is the level of the behavioral response system of interest. For low-level response systems such as unconscious reflexes, the threshold is lower than for high-level response systems such as verbal communication. Therefore, a lower level of agency and behavioral realism is required for low-level, implicit, or automatic social responses than for high-level response systems involving purposeful and conscious actions.
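
The qualitative relations described by the Threshold Model of Social Influence can be sketched as a toy numerical model. The Python code below is an illustration only: the model itself is verbal, so the linear form, the [0, 1] scaling of every variable, the coefficients, and the function names (`required_realism`, `social_influence_expected`) are all assumptions introduced here and are not part of Blascovich's formulation. The sketch encodes just three stated properties: lower agency demands higher behavioral realism, higher self-relevance steepens that trade-off, and higher-level response systems raise the threshold.

```python
def required_realism(agency, self_relevance, response_level):
    """Behavioral realism needed for social influence to occur (toy model).

    All inputs are assumed to lie in [0, 1]:
      agency          0 = believed agent, 1 = believed avatar
      self_relevance  0 = low (e.g., small bank withdrawal), 1 = high (e.g., job interview)
      response_level  0 = low-level/automatic, 1 = high-level/deliberate
    The coefficients below are arbitrary illustrative values.
    """
    for value in (agency, self_relevance, response_level):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must lie in [0, 1]")
    slope = 0.2 + 0.6 * self_relevance   # steeper threshold for high self-relevance
    base = 0.1 + 0.2 * response_level    # higher threshold for deliberate responses
    return min(1.0, base + slope * (1.0 - agency))


def social_influence_expected(agency, realism, self_relevance, response_level):
    """True if the given behavioral realism meets or exceeds the toy threshold."""
    return realism >= required_realism(agency, self_relevance, response_level)
```

Under these assumed coefficients, a moderately realistic VH (realism 0.5) believed to be an agent (agency 0.1) falls short of the threshold in a highly self-relevant, deliberate interaction, whereas the same VH believed to be an avatar (agency 0.9) exceeds it. The particular numbers carry no empirical meaning; only the direction of the effects follows the model as described.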

Several studies explored the impact of agency on social interactions with VHs. Perceived agency was generally manipulated by introducing the VH as an agent or an avatar prior to the interaction. For example, a study by Guadagno, Swinth and Blascovich (2011) examined the social evaluations (i.e., empathy and positivity) of a virtual peer counselor who was introduced as either an agent or an avatar. The VH had two levels of behavior (i.e., smile and no smile). The results showed that the VH’s smile affected the social evaluations; however, the level of agency moderated this effect. Specifically, the social evaluations were enhanced by the smile behavior for participants in the low-agency condition but were degraded in the high-agency condition. Using two experiments, de Melo, Gratch and Carnevale (2014) examined the effect of the VH’s emotional expressions on participants’ behavior. The results of the first experiment showed that the participants collaborated more with the VH who exhibited collaborative rather than competitive expressions in a social dilemma, and this effect was more intense in the high-agency condition. In the second experiment, the participants who were led to believe that they were interacting with an avatar conceded more in a negotiation task when the VH showed angry expressions. In the low-agency condition, by contrast, the participants conceded the same amount regardless of whether the VH showed neutral or angry emotions. The results of a study (Felnhofer et al., 2018) that examined social avoidance tendencies and prosocial behaviors toward VHs were contradictory regarding the impact of agency. While presence, social presence, social interaction anxiety, and stress were not affected by agency, participants in the avatar condition showed more social avoidance and prosocial behavior. The results of a study by von der Pütten et al. (2010) showed no effect of agency on participants’ social behavior and evaluations.

As shown above, several studies in the literature compare the use of agents with that of avatars; many show that avatars affect the social behavior of participants to a greater extent than agents, whereas others found no significant difference between the two. A meta-analysis by Fox et al. (2015) showed that perceived avatars produced stronger responses than perceived agents. A systematic review (Oh et al., 2018) reported that approximately half of the studies surveyed showed an impact of agency on social presence, whereas in the remaining half the participants perceived similar levels of social presence regardless of the level of agency.

Level of Immersion

Regarding social presence, the level of immersion does not seem to be as crucial as it is for presence ( Oh et al., 2018 ), although some studies ( Schroeder et al., 2001 ; Heldal et al., 2005 ) showed that participants reported a stronger sense of social presence when using an immersive compared to a non-immersive platform. Also, a recent study ( Bailey et al., 2019 ) showed that children in an IVR condition demonstrated greater social influence (compliance) from a virtual character than children in a non-immersive condition, suggesting that IVR may elicit differential cognitive and social responses compared to less immersive technologies.

Discussion and Future Directions

In this article, we presented the applications and the potential of IVR and VHs in a wide range of fields such as training, education, and health. Additionally, we presented the benefits of using IVR as a tool for experimental research in fields such as cognitive and social neuroscience and psychology. This potential stems from the many advantages of VR over traditional media. However, to be effective, many of these applications require that the user react to the virtual stimuli in a realistic way. The ability of VR technologies to immerse the user in a virtual environment, and therefore to elicit realistic reactions to it (as if the user were physically there), is considered straightforward because VR can induce the illusion of “being” inside a virtual environment. This sense of being in the virtual environment is called presence and is associated with realistic reactions to the virtual stimuli.

In contrast, eliciting realistic reactions to social stimuli within virtual environments seems to be more complex, and a deeper understanding of the users’ cognitive processes is required to achieve them. While some studies demonstrated realistic social reactions toward VHs within virtual environments, others failed to replicate social effects using VHs. To react realistically to a social situation, the user has to perceive the VH not only as if it were physically present but also as mentally present, as if it were a sentient human being. The extent to which the user actively perceives a VH in a virtual environment and at the same time has the sense that the “other” perceives the presence of the user is called social presence.

Several design factors of VR applications and virtual representations seem to affect the effectiveness of human–VH social interactions in terms of realistic reactions by the user. In this article, we listed several of these factors. Concerning the VH’s representation, the literature suggests that visual realism (image fidelity) is not very important for creating social presence and eliciting realistic social responses from the user. On the other hand, the literature suggests that the behavioral realism of a VH (the extent to which a VH behaves like a real human) is an important factor for social influence. Behavioral realism comprises many parameters such as verbal and nonverbal behavior (body movements and gestures, facial expressions, and gaze behavior), responsiveness, and interactivity with the environment and the user. Therefore, more research is needed on designing VHs’ behavior to enhance their social potential.

As described in this article, the use of virtual agents offers many advantages over the use of avatars. The creation of agents that are perceived and treated by users in a similar way to avatars is therefore very important. The role of agency, the extent to which the user believes that a VH is controlled by another human rather than by the computer through an algorithm, is not clear in the literature. While some studies supported the theory that users will respond socially to a VH only (or to a greater extent) when it is perceived as an avatar (controlled by another user), other studies showed no impact of agency on social presence or social influence. According to the theory (Blascovich, 2002), the importance of agency depends on the type of interaction. Specifically, unconscious and automatic social reactions seem not to be affected by the level of agency (Nass and Moon, 2000), while more conscious social responses are more likely to occur when the VH is perceived to be an avatar, controlled by another human, or an agent that behaves very realistically (Blascovich, 2002). Therefore, more studies are needed to investigate the impact of agency on social interactions with VHs, taking into account the type of interaction. Additionally, according to Blascovich (2002), agents that behave realistically enough to exceed the threshold of social influence may overcome the limitation of agency and be perceived in the same way as avatars, despite the fact that the user knows that they are interacting with an agent. This demonstrates the need for future research on creating agents with plausible, intelligent, and interactive behavior, which might be “the biggest challenge in social VR research” (Pan and Hamilton, 2018, pp. 410–411).

Another direction for future research is the impact of self-representation on social interactions in VR environments. The sense of embodiment is the perception of the virtual body by the participant as their own biological body (Kilteni et al., 2012), which can be achieved by using real-time full-body motion tracking technology and by mapping the participants’ movements to those of their virtual avatars. Studies (Slater and Sanchez-Vives, 2014) showed that people tend to alter their attitudes and behaviors, including social behavior (Yee and Bailenson, 2007), to match the expectations implied by the attributes of their virtual body. We presume that there is great scope for further research (Mal, 2020) on the impact of several aspects of self-representation (e.g., visual realism, body characteristics, gender, and age) in many forms of social interactions in VR.

Also, there is evidence that the level of immersion has an impact on social interaction with VHs; however, the literature is very limited. Further investigation is needed on whether more immersive systems can enhance the realism of social interactions with VHs.

Finally, the commercialization of social VR to the general audience in the form of entertainment and socialization may involve risks and unpleasant psychological and social consequences. An article by Slater et al. (2020) summarizes the potential negative implications of VR. Studies showed that exposure to VR, and especially virtual embodiment, can lead to beneficial emotional, cognitive, and behavioral changes. However, the same techniques can be used in the opposite direction, leading to negative and undesired changes. Also, exposure to enjoyable environments and interactions, as well as the ability to create a desired self-representation, can create a preference for the virtual world over the real world, or even lead to prioritizing the virtual world. Studies also showed that VR and VHs influence the behavior and actions of an individual, with social effects such as persuasion (Guadagno et al., 2007), obedience (Neyret et al., 2020), and conformity (Kyrlitsias et al., 2020). However, in contrast with the real world, a virtual environment and its virtual occupants, agents and even avatars, are highly controllable by the administrator of the VR application. This gives the administrator of such applications great power over the users’ behavior. These are only some examples of the ethical concerns raised by the introduction of VR as a mass consumer product, and they demonstrate that ethics is a major challenge for VR.

To sum up, realistic social interactions with VHs are crucial to the effectiveness of many VR applications; however, it is not yet clear how to achieve them, and further research is required.

Author Contributions

CK and DM-G have made a substantial, direct, and intellectual contribution to this work.

Funding

This work has been partially funded by the ED-DESPINA MICHAIL-300155-310200-3319 budget of the Cyprus University of Technology.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Asch, S. E. (1956). Studies of Independence and Conformity: I. A Minority of One against a Unanimous Majority. Psychol. Monogr. Gen. Appl. 70 (9), 1–70. doi:10.1037/h0093718

Bailenson, J., and Blascovich, J. (2004). “Avatars,” in Encyclopedia of Human-Computer Interaction. Editor W. S. Bainbridge (Great Barrington, MA: Berkshire), 62–64.

Bailey, J. O., Bailenson, J. N., Obradović, J., and Aguiar, N. R. (2019). Virtual Reality's Effect on Children's Inhibitory Control, Social Compliance, and Sharing. J. Appl. Dev. Psychol. 64, 101052. doi:10.1016/j.appdev.2019.101052

Bălan, O., Cristea, Ș., Moise, G., Petrescu, L., Ivașcu, S., Moldoveanu, A., et al. (2020). “eTher - an Assistive Virtual Agent for Acrophobia Therapy in Virtual Reality,” in International Conference on Human-Computer Interaction (Cham: Springer), 12–25. doi:10.1007/978-3-030-59990-4_2

Banakou, D., Hanumanthu, P. D., and Slater, M. (2016). Virtual Embodiment of White People in a Black Virtual Body Leads to a Sustained Reduction in Their Implicit Racial Bias. Front. Hum. Neurosci. 10, 601. doi:10.3389/fnhum.2016.00601

Barley, S. R. (1988). “The Social Construction of a Machine: Ritual, Superstition, Magical Thinking and Other Pragmatic Responses to Running a CT Scanner,” in Knowledge and Practice in Medicine: Social, Cultural, and Historical Approaches. Editors M. Lock, and D. Gordon (Hingham, MA: Reidel), 497–539. doi:10.1007/978-94-009-2725-4_19

Baumeister, R. F., and Vohs, K. D. (2007). “Ecological Validity,” in Encyclopedia of Social Psychology (Thousand Oaks, CA: SAGE Publications, Inc), 1, 276. doi:10.4135/9781412956253.n167

Beall, A. C., Bailenson, J. N., Loomis, J., Blascovich, J., and Rex, C. S. (2003). “Non-zero-sum Gaze in Immersive Virtual Environments,” in Proceedings of HCI International.

Bell, I. H., Nicholas, J., Alvarez-Jimenez, M., Thompson, A., and Valmaggia, L. (2020). Virtual Reality as a Clinical Tool in Mental Health Research and Practice. Dialogues Clin. Neurosci. 22 (2), 169–177. doi:10.31887/DCNS.2020.22.2/lvalmaggia

Bellanca, J. L., Orr, T. J., Helfrich, W. J., Macdonald, B., Navoyski, J., and Demich, B. (2019). Developing a Virtual Reality Environment for Mining Research. Mining, Metall. Exploration 36 (4), 597–606. doi:10.1007/s42461-018-0046-2

Biocca, F., Harms, C., and Burgoon, J. K. (2003). Toward a More Robust Theory and Measure of Social Presence: Review and Suggested Criteria. Presence: Teleoperators & Virtual Environments 12 (5), 456–480. doi:10.1162/105474603322761270

Biocca, F. (1997). The Cyborg's Dilemma: Progressive Embodiment in Virtual Environments. J. Computer-Mediated Commun. 3 (2), JCMC324. doi:10.1111/j.1083-6101.1997.tb00070.x

Blascovich, J., Loomis, J., Beall, A. C., Swinth, K. R., Hoyt, C. L., and Bailenson, J. N. (2002). Immersive Virtual Environment Technology as a Methodological Tool for Social Psychology. Psychol. Inq. 13 (2), 103–124. doi:10.1207/S15327965PLI1302_01

Blascovich, J. (2002). “Social Influence within Immersive Virtual Environments,” in The Social Life of Avatars: Presence and Interaction in Shared Virtual Environments. Editor R. Schroeder (London: Springer), 127–145. doi:10.1007/978-1-4471-0277-9_8

Bohil, C. J., Alicea, B., and Biocca, F. A. (2011). Virtual Reality in Neuroscience Research and Therapy. Nat. Rev. Neurosci. 12 (12), 752–762. doi:10.1038/nrn3122

Bombari, D., Schmid Mast, M., Canadas, E., and Bachmann, M. (2015). Studying Social Interactions through Immersive Virtual Environment Technology: Virtues, Pitfalls, and Future Challenges. Front. Psychol. 6, 869. doi:10.3389/fpsyg.2015.00869

Botella, C., Garcia-Palacios, A., Baños, R. M., and Quero, S. (2009). Cybertherapy: Advantages, Limitations, and Ethical Issues. PsychNology J. 7 (1), 77–100.

Botella, C., Serrano, B., Baños, R., and García-Palacios, A. (2015). Virtual Reality Exposure-Based Therapy for the Treatment of Post-traumatic Stress Disorder: a Review of its Efficacy, the Adequacy of the Treatment Protocol, and its Acceptability. Ndt 11, 2533. doi:10.2147/NDT.S89542

Chang, E., Kim, H. T., and Yoo, B. (2020). Virtual Reality Sickness: a Review of Causes and Measurements. Int. J. Human-Computer Interaction 36 (17), 1658–1682. doi:10.1080/10447318.2020.1778351

Chesham, R. K., Malouff, J. M., and Schutte, N. S. (2018). Meta-analysis of the Efficacy of Virtual Reality Exposure Therapy for Social Anxiety. Behav. Change 35 (3), 152–166. doi:10.1017/bec.2018.15

Christofi, M., Michael-Grigoriou, D., and Kyrlitsias, C. (2020). A Virtual Reality Simulation of Drug Users' Everyday Life: The Effect of Supported Sensorimotor Contingencies on Empathy. Front. Psychol. 11, 1242. doi:10.3389/fpsyg.2020.01242

Christofi, M., and Michael-Grigoriou, D. (2016). “Virtual Environments Design Assessment for the Treatment of Claustrophobia,” in 2016 22nd International Conference on Virtual System & Multimedia (VSMM), IEEE, 1–8. doi:10.1109/VSMM.2016.7863215

Cruz-Neira, C., Sandin, D. J., and DeFanti, T. A. (1993). “Surround-screen Projection-Based Virtual Reality,” in Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, 135–142. doi:10.1145/166117.166134

Cummings, J. J., and Bailenson, J. N. (2016). How Immersive Is Enough? A Meta-Analysis of the Effect of Immersive Technology on User Presence. Media Psychol. 19 (2), 272–309. doi:10.1080/15213269.2015.1015740

Davis, S., Nesbitt, K., and Nalivaiko, E. (2014). “A Systematic Review of Cybersickness,” in Proceedings of the 2014 Conference on Interactive Entertainment, 1–9. doi:10.1145/2677758.2677780

de Melo, C. M., Gratch, J., and Carnevale, P. J. (2015). Humans versus Computers: Impact of Emotion Expressions on People's Decision Making. IEEE Trans. Affective Comput. 6 (2), 127–136. doi:10.1109/TAFFC.2014.2332471

Dennett, D. C. (1987). The Intentional Stance. Cambridge, MA: MIT Press.

Dzardanova, E., Kasapakis, V., Gavalas, D., and Sylaiou, S. (2021). Virtual Reality as a Communication Medium: a Comparative Study of Forced Compliance in Virtual Reality versus Physical World. Virtual Reality, 1–21. doi:10.1007/s10055-021-00564-9

Falconer, C. J., Rovira, A., King, J. A., Gilbert, P., Antley, A., Fearon, P., et al. (2016). Embodying Self-Compassion within Virtual Reality and its Effects on Patients with Depression. BJPsych open 2 (1), 74–80. doi:10.1192/bjpo.bp.115.002147

Felnhofer, A., Kafka, J. X., Hlavacs, H., Beutl, L., Kryspin-Exner, I., and Kothgassner, O. D. (2018). Meeting Others Virtually in a Day-To-Day Setting: Investigating Social Avoidance and Prosocial Behavior towards Avatars and Agents. Comput. Hum. Behav. 80, 399–406. doi:10.1016/j.chb.2017.11.031

Felnhofer, A., Kothgassner, O. D., Beutl, L., Hlavacs, H., and Kryspin-Exner, I. (2012). Is Virtual Reality Made for Men Only? Exploring Gender Differences in the Sense of Presence. Proc. Int. Soc. Presence Res., 103–112.

Foreman, N. (2009). Virtual Reality in Psychology. Themes Sci. Tech. Edu. 2 (1), 225–252.

Forsythe, R., Horowitz, J. L., Savin, N. E., and Sefton, M. (1994). Fairness in Simple Bargaining Experiments. Games Econ. Behav. 6 (3), 347–369. doi:10.1006/game.1994.1021

Fox, J., Ahn, S. J., Janssen, J. H., Yeykelis, L., Segovia, K. Y., and Bailenson, J. N. (2015). Avatars versus Agents: a Meta-Analysis Quantifying the Effect of agency on Social Influence. Human-Computer Interaction 30 (5), 401–432. doi:10.1080/07370024.2014.921494

Friedman, D., Pizarro, R., Or-Berkers, K., Neyret, S., Pan, X., and Slater, M. (2014). A Method for Generating an Illusion of Backwards Time Travel Using Immersive Virtual Reality - an Exploratory Study. Front. Psychol. 5, 943. doi:10.3389/fpsyg.2014.00943

Garau, M., Slater, M., Pertaub, D.-P., and Razzaque, S. (2005). The Responses of People to Virtual Humans in an Immersive Virtual Environment. Presence: Teleoperators & Virtual Environments 14 (1), 104–116. doi:10.1162/1054746053890242

Gonzalez-Franco, M., Slater, M., Birney, M. E., Swapp, D., Haslam, S. A., and Reicher, S. D. (2018). Participant Concerns for the Learner in a Virtual Reality Replication of the Milgram Obedience Study. PloS one 13 (12), e0209704. doi:10.1371/journal.pone.0209704

Gorisse, G., Christmann, O., Amato, E. A., and Richir, S. (2017). First- and Third-Person Perspectives in Immersive Virtual Environments: Presence and Performance Analysis of Embodied Users. Front. Robot. AI 4, 33. doi:10.3389/frobt.2017.00033

Grivokostopoulou, F., Kovas, K., and Perikos, I. (2020). The Effectiveness of Embodied Pedagogical Agents and Their Impact on Students Learning in Virtual Worlds. Appl. Sci. 10 (5), 1739. doi:10.3390/app10051739

Groom, V., Nass, C., Chen, T., Nielsen, A., Scarborough, J. K., and Robles, E. (2009). Evaluating the Effects of Behavioral Realism in Embodied Agents. Int. J. Human-Computer Stud. 67 (10), 842–849. doi:10.1016/j.ijhcs.2009.07.001

Guadagno, R. E., Blascovich, J., Bailenson, J. N., and McCall, C. (2007). Virtual Humans and Persuasion: The Effects of agency and Behavioral Realism. Media Psychol. 10 (1), 1–22. doi:10.1080/15213260701300865

Guadagno, R. E., Swinth, K. R., and Blascovich, J. (2011). Social Evaluations of Embodied Agents and Avatars. Comput. Hum. Behav. 27 (6), 2380–2385. doi:10.1016/j.chb.2011.07.017

Guimarães, M., Prada, R., Santos, P. A., Dias, J., Jhala, A., and Mascarenhas, S. (2020). “The Impact of Virtual Reality in the Social Presence of a Virtual Agent,” in Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents, 1–8. doi:10.1145/3383652.3423879

Gunkel, S., Stokking, H., Prins, M., Niamut, O., Siahaan, E., and Cesar, P. (2018). “Experiencing Virtual Reality Together,” in Proceedings of the 2018 ACM International Conference on Interactive Experiences for TV and Online Video, 233–238. doi:10.1145/3210825.3213566

Hasler, B. S., Spanlang, B., and Slater, M. (2017). Virtual Race Transformation Reverses Racial In-Group Bias. PloS one 12 (4), e0174965. doi:10.1371/journal.pone.0174965

Heldal, I., Schroeder, R., Steed, A., Axelsson, A., Spante, M., and Widestrom, J. (2005). “Immersiveness and Symmetry in Copresent Scenarios,” in IEEE Proceedings. VR 2005, IEEE, 171–178. doi:10.1109/VR.2005.1492771

Hoyt, C. L., Blascovich, J., and Swinth, K. R. (2003). Social Inhibition in Immersive Virtual Environments. Presence: Teleoperators & Virtual Environments 12 (2), 183–195. doi:10.1162/105474603321640932

Ijsselsteijn, W., and Riva, G. (2003). “Being There: The Experience of Presence in Mediated Environments,” in Being There: Concepts, Effects and Measurements of User Presence in Synthetic Environments. Editors G. Riva, F. Davide, and W. A. Ijsselsteijn (Amsterdam: IOS Press), 3–16.

Johnson, W. L., and Lester, J. C. (2018). Pedagogical Agents: Back to the Future. AIMag 39 (2), 33–44. doi:10.1609/aimag.v39i2.2793

Kelly, J. R. (2007). “Mundane Realism,” in Encyclopedia of Social Psychology. Editors R. F. Baumeister, and K. D. Vohs (Thousand Oaks, CA: SAGE Publications, Inc), 1, 599. doi:10.4135/9781412956253.n357

Kiesler, S., and Sproull, L. (1997). “"Social" Human-Computer Interaction,” in Human Values and the Design of Computer Technology. Editor B. Friedman (Stanford, CA: CSLI Publications), 191–199.

Kilteni, K., Bergstrom, I., and Slater, M. (2013). Drumming in Immersive Virtual Reality: the Body Shapes the Way We Play. IEEE Trans. Vis. Comput. Graphics 19 (4), 597–605. doi:10.1109/TVCG.2013.29

Kilteni, K., Groten, R., and Slater, M. (2012). The Sense of Embodiment in Virtual Reality. Presence: Teleoperators and Virtual Environments 21 (4), 373–387. doi:10.1162/PRES_a_00124

Kim, K., Norouzi, N., Losekamp, T., Bruder, G., Anderson, M., and Welch, G. (2019). “Effects of Patient Care Assistant Embodiment and Computer Mediation on User Experience,” in 2019 IEEE International Conference on Artificial Intelligence and Virtual Reality, IEEE Computer Society, 17–177. doi:10.1109/aivr46125.2019.00013

Kinateder, M., Gromer, D., Gast, P., Buld, S., Müller, M., Jost, M., et al. (2015). The Effect of Dangerous Goods Transporters on hazard Perception and Evacuation Behavior - A Virtual Reality experiment on Tunnel Emergencies. Fire Saf. J. 78, 24–30. doi:10.1016/j.firesaf.2015.07.002

Kosloff, S. (2007). “Experimental Realism,” in Encyclopedia of Social Psychology. Editors R. F. Baumeister, and K. D. Vohs (Thousand Oaks, CA: SAGE Publications, Inc), 1, 329–330. doi:10.4135/9781412956253.n202

Kothgassner, O. D., and Felnhofer, A. (2020). Does Virtual Reality Help to Cut the Gordian Knot between Ecological Validity and Experimental Control? Ann. Int. Commun. Assoc. 44 (3), 210–218. doi:10.1080/23808985.2020.1792790

Koźlak, M., Kurzeja, A., and Nawrat, A. (2013). “Virtual Reality Technology for Military and Industry Training Programs,” in Vision Based Systems for UAV Applications (Heidelberg: Springer), 327–334. doi:10.1007/978-3-319-00369-6_21

Kyrlitsias, C., Christofi, M., Michael-Grigoriou, D., Banakou, D., and Ioannou, A. (2020). A Virtual Tour of a Hardly Accessible Archaeological Site: The Effect of Immersive Virtual Reality on User Experience, Learning and Attitude Change. Front. Comput. Sci. 2, 23. doi:10.3389/fcomp.2020.00023

Kyrlitsias, C., and Michael-Grigoriou, D. (2018). Asch Conformity experiment Using Immersive Virtual Reality. Comput. Anim. Virtual Worlds 29 (5), e1804. doi:10.1002/cav.1804

Kyrlitsias, C., Michael-Grigoriou, D., Banakou, D., and Christofi, M. (2020). Social Conformity in Immersive Virtual Environments: The Impact of Agents' Gaze Behavior. Front. Psychol. 11, 2254. doi:10.3389/fpsyg.2020.02254

Lateef, F. (2010). Simulation-based Learning: Just like the Real Thing. J. Emerg. Trauma Shock 3 (4), 348. doi:10.4103/0974-2700.70743

Lee, K. M., Jung, Y., Kim, J., and Kim, S. R. (2006). Are Physically Embodied Social Agents Better Than Disembodied Social Agents?: The Effects of Physical Embodiment, Tactile Interaction, and People's Loneliness in Human-Robot Interaction. Int. J. human-computer Stud. 64 (10), 962–973. doi:10.1016/j.ijhcs.2006.05.002

Lee, M., Kolkmeier, J., Heylen, D., and Ijsselsteijn, W. (2021). Who Makes Your Heart Beat? What Makes You Sweat? Social Conflict in Virtual Reality for Educators. Front. Psychol. 12. doi:10.3389/fpsyg.2021.628246

Li, J., Kong, Y., Röggla, T., De Simone, F., Ananthanarayan, S., De Ridder, H., et al. (2019). “Measuring and Understanding Photo Sharing Experiences in Social Virtual Reality,” in Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 1–14. doi:10.1145/3290605.3300897

Li, L., and Campbell, J. (2010). Emotion Modeling and Interaction of NPCS in Virtual Simulation and Games. Ijvr 9 (4), 1–6. doi:10.20870/IJVR.2010.9.4.2784

Lok, B., Ferdig, R. E., Raij, A., Johnsen, K., Dickerson, R., Coutts, J., Stevens, A., and Lind, D. S. (2006). Applying Virtual Reality in Medical Communication Education: Current Findings and Potential Teaching and Learning Benefits of Immersive Virtual Patients. Virtual Reality 10 (3-4), 185–195. doi:10.1007/s10055-006-0037-3

Lucas, G. M., Gratch, J., King, A., and Morency, L.-P. (2014). It's Only a Computer: Virtual Humans Increase Willingness to Disclose. Comput. Hum. Behav. 37, 94–100. doi:10.1016/j.chb.2014.04.043

Maister, L., Sebanz, N., Knoblich, G., and Tsakiris, M. (2013). Experiencing Ownership over a Dark-Skinned Body Reduces Implicit Racial Bias. Cognition 128 (2), 170–178. doi:10.1016/j.cognition.2013.04.002

Maister, L., Slater, M., Sanchez-Vives, M. V., and Tsakiris, M. (2015). Changing Bodies Changes Minds: Owning Another Body Affects Social Cognition. Trends Cognitive Sciences 19 (1), 6–12. doi:10.1016/j.tics.2014.11.001

Makransky, G., Wismer, P., and Mayer, R. E. (2019). A Gender Matching Effect in Learning with Pedagogical Agents in an Immersive Virtual Reality Science Simulation. J. Comput. Assist. Learn. 35 (3), 349–358. doi:10.1111/jcal.12335

Mal, D. (2020). “[DC] the Impact of Social Interactions on an Embodied Individual's Self-Perception in Virtual Environments,” in 2020 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW), IEEE, 545–546. doi:10.1109/VRW50115.2020.00124

McCall, C., Hildebrandt, L. K., Bornemann, B., and Singer, T. (2015). Physiophenomenology in Retrospect: Memory Reliably Reflects Physiological Arousal during a Prior Threatening Experience. Conscious. Cogn. 38, 60–70. doi:10.1016/j.concog.2015.09.011

McVeigh-Schultz, J., Márquez Segura, E., Merrill, N., and Isbister, K. (2018). “What's it Mean to "Be Social" in VR?,” in Proceedings of the 2018 ACM Conference Companion Publication on Designing Interactive Systems, 289–294. doi:10.1145/3197391.3205451

Merchant, Z., Goetz, E. T., Cifuentes, L., Keeney-Kennicutt, W., and Davis, T. J. (2014). Effectiveness of Virtual Reality-Based Instruction on Students' Learning Outcomes in K-12 and Higher Education: A Meta-Analysis. Comput. Edu. 70, 29–40. doi:10.1016/j.compedu.2013.07.033

Milgram, S. (1963). Behavioral Study of Obedience. J. abnormal Soc. Psychol. 67 (4), 371–378. doi:10.1037/h0040525

Mori, M., MacDorman, K., and Kageki, N. (2012). The Uncanny valley [from the Field]. IEEE Robot. Automat. Mag. 19 (2), 98–100. doi:10.1109/MRA.2012.2192811

Najm, A., Michael-Grigoriou, D., Kyrlitsias, C., Christofi, M., Hadjipanayi, C., and Sokratous, D. (2020). “A Virtual Reality Adaptive Exergame for the Enhancement of Physical Rehabilitation Using Social Facilitation,” in ICAT-EGVE 2020 – International Conference on Artificial Reality and Telexistence and Eurographics Symposium on Virtual Environments – Posters and Demos (Geneva: The Eurographics Association). doi:10.2312/egve.20201269

Nass, C., and Moon, Y. (2000). Machines and Mindlessness: Social Responses to Computers. J. Soc. Issues 56 (1), 81–103. doi:10.1111/0022-4537.00153

Nass, C., Steuer, J., and Tauber, E. R. (1994). “Computers Are Social Actors,” in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 72–78. doi:10.1145/191666.191703

Nazligul, M. D., Yilmaz, M., Gulec, U., Gozcu, M. A., O’Connor, R. V., and Clarke, P. M. (2017). “Overcoming Public Speaking Anxiety of Software Engineers Using Virtual Reality Exposure Therapy,” in European Conference on Software Process Improvement (Cham: Springer), 191–202. doi:10.1007/978-3-319-64218-5_15

Neyret, S., Navarro, X., Beacco, A., Oliva, R., Bourdin, P., Valenzuela, J., et al. (2020). An Embodied Perspective as a Victim of Sexual Harassment in Virtual Reality Reduces Action Conformity in a Later Milgram Obedience Scenario. Sci. Rep. 10 (1), 1–18. doi:10.1038/s41598-020-62932-w

Nowak, K. L., and Biocca, F. (2003). The Effect of the Agency and Anthropomorphism on Users' Sense of Telepresence, Copresence, and Social Presence in Virtual Environments. Presence: Teleoperators & Virtual Environments 12 (5), 481–494. doi:10.1162/105474603322761289

Nowak, K. L., and Fox, J. (2018). Avatars and Computer-Mediated Communication: A Review of the Definitions, Uses, and Effects of Digital Representations on Communication. Rcr 6, 30–53. doi:10.12840/issn.2255-4165.2018.06.01.015

Oh, C. S., Bailenson, J. N., and Welch, G. F. (2018). A Systematic Review of Social Presence: Definition, Antecedents, and Implications. Front. Robot. AI 5, 114. doi:10.3389/frobt.2018.00114

Oh, S. Y., Bailenson, J., Krämer, N., and Li, B. (2016). Let the Avatar Brighten Your Smile: Effects of Enhancing Facial Expressions in Virtual Environments. PloS one 11 (9), e0161794. doi:10.1371/journal.pone.0161794

Pan, X., Gillies, M., and Slater, M. (2008). “The Impact of Avatar Blushing on the Duration of Interaction between a Real and Virtual Person,” in Presence 2008: The 11th Annual International Workshop on Presence, 100–106.

Pan, X., and Hamilton, A. F. d. C. (2018). Why and How to Use Virtual Reality to Study Human Social Interaction: The Challenges of Exploring a New Research Landscape. Br. J. Psychol. 109 (3), 395–417. doi:10.1111/bjop.12290

Parong, J., and Mayer, R. E. (2018). Learning Science in Immersive Virtual Reality. J. Educ. Psychol. 110 (6), 785–797. doi:10.1037/edu0000241

Parsons, T. D. (2015). Virtual Reality for Enhanced Ecological Validity and Experimental Control in the Clinical, Affective and Social Neurosciences. Front. Hum. Neurosci. 9, 660. doi:10.3389/fnhum.2015.00660

Parsons, T., Gaggioli, A., and Riva, G. (2017). Virtual Reality for Research in Social Neuroscience. Brain Sci. 7 (4), 42. doi:10.3390/brainsci7040042

Peck, T. C., Seinfeld, S., Aglioti, S. M., and Slater, M. (2013). Putting Yourself in the Skin of a Black Avatar Reduces Implicit Racial Bias. Conscious. Cogn. 22 (3), 779–787. doi:10.1016/j.concog.2013.04.016

Perret, J., and Vander Poorten, E. (2018). “Touching Virtual Reality: a Review of Haptic Gloves,” in ACTUATOR 2018; 16th International Conference on New Actuators (Bremen: VDE), 1–5.

Petrovic, V. M. (2018). Artificial Intelligence and Virtual Worlds - toward Human-Level AI Agents. IEEE Access 6, 39976–39988. doi:10.1109/ACCESS.2018.2855970

Pottle, J. (2019). Virtual Reality and the Transformation of Medical Education. Future Healthc. J. 6 (3), 181–185. doi:10.7861/fhj.2019-0036

Ratan, R., Beyea, D., Li, B. J., and Graciano, L. (2020). Avatar Characteristics Induce Users' Behavioral Conformity with Small-To-Medium Effect Sizes: a Meta-Analysis of the proteus Effect. Media Psychol. 23 (5), 651–675. doi:10.1080/15213269.2019.1623698

Reeves, B., and Nass, C. (1996). The Media Equation: How People Treat Computers, Television, and New Media like Real People and Places. New York: Cambridge University Press.

Rizzo, A., and Talbot, T. (2016). Virtual Reality Standardized Patients for Clinical Training. Educ. Pract., 255–272. doi:10.1002/9781118952788.ch18

Roth, D., Kleinbeck, C., Feigl, T., Mutschler, C., and Latoschik, M. E. (2018). “Beyond Replication: Augmenting Social Behaviors in Multi-User Virtual Realities,” in IEEE Conference on Virtual Reality and 3D User Interfaces (VR), IEEE, 215–222. doi:10.1109/VR.2018.8447550

Roth, D., Kullmann, P., Bente, G., Gall, D., and Latoschik, M. E. (2018). “Effects of Hybrid and Synthetic Social Gaze in Avatar-Mediated Interactions,” in 2018 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), IEEE, 103–108. doi:10.1109/ISMAR-Adjunct.2018.00044

Roth, D., Latoschik, M. E., Vogeley, K., and Bente, G. (2015). Hybrid Avatar-Agent Technology - A Conceptual Step towards Mediated "Social" Virtual Reality and its Respective Challenges. I-com 14 (2), 107–114. doi:10.1515/icom-2015-0030

Roth, D., Lugrin, J.-L., Galakhov, D., Hofmann, A., Bente, G., Latoschik, M. E., et al. (2016). Avatar Realism and Social Interaction Quality in Virtual Reality. IEEE Virtual Reality, 277–278. doi:10.1109/VR.2016.7504761

Roth, D., Mal, D., Purps, C. F., Kullmann, P., and Latoschik, M. E. (2018). “Injecting Nonverbal Mimicry with Hybrid Avatar-Agent Technologies,” in Proceedings of the Symposium on Spatial User Interaction, 69–73. doi:10.1145/3267782.3267791

Roth, D., Waldow, K., Latoschik, M. E., Fuhrmann, A., and Bente, G. (2017). Socially Immersive Avatar-Based Communication. IEEE Virtual Reality, 259–260.

Rothbaum, B. O., Hodges, L. F., Kooper, R., Opdyke, D., Williford, J. S., and North, M. (1995). Virtual Reality Graded Exposure in the Treatment of Acrophobia: A Case Report. Behav. Ther. 26 (3), 547–554. doi:10.1016/S0005-7894(05)80100-5

Rubio-Tamayo, J., Gertrudix Barrio, M., and García García, F. (2017). Immersive Environments and Virtual Reality: Systematic Review and Advances in Communication, Interaction and Simulation. Mti 1 (4), 21. doi:10.3390/mti1040021

Sanchez-Vives, M. V., and Slater, M. (2005). From Presence to Consciousness through Virtual Reality. Nat. Rev. Neurosci. 6 (4), 332–339. doi:10.1038/nrn1651

Saredakis, D., Szpak, A., Birckhead, B., Keage, H. A. D., Rizzo, A., and Loetscher, T. (2020). Factors Associated with Virtual Reality Sickness in Head-Mounted Displays: a Systematic Review and Meta-Analysis. Front. Hum. Neurosci. 14, 96. doi:10.3389/fnhum.2020.00096

Schroeder, R., Steed, A., Axelsson, A.-S., Heldal, I., Abelin, Å., Wideström, J., et al. (2001). Collaborating in Networked Immersive Spaces: as Good as Being There Together? Comput. Graphics 25 (5), 781–788. doi:10.1016/S0097-8493(01)00120-0

Seinfeld, S., Bergstrom, I., Pomes, A., Arroyo-Palacios, J., Vico, F., Slater, M., et al. (2016). Influence of Music on Anxiety Induced by Fear of Heights in Virtual Reality. Front. Psychol. 6, 1969. doi:10.3389/fpsyg.2015.01969

Slater, M. (2003). A Note on Presence Terminology. Presence-connect 3 (3), 1–5.

Slater, M., Antley, A., Davison, A., Swapp, D., Guger, C., Barker, C., et al. (2006). A Virtual Reprise of the Stanley Milgram Obedience Experiments. PloS one 1 (1), e39. doi:10.1371/journal.pone.0000039

Slater, M., Gonzalez-Liencres, C., Haggard, P., Vinkers, C., Gregory-Clarke, R., Jelley, S., et al. (2020). The Ethics of Realism in Virtual and Augmented Reality. Front. Virtual Real. 1, 1. doi:10.3389/frvir.2020.00001

Slater, M., Pérez Marcos, D., Ehrsson, H., and Sanchez-Vives, M. V. (2009). Inducing Illusory Ownership of a Virtual Body. Front. Neurosci. 3, 214–220. doi:10.3389/neuro.01.029.2009

Slater, M. (2009). Place Illusion and Plausibility Can lead to Realistic Behaviour in Immersive Virtual Environments. Phil. Trans. R. Soc. B 364 (1535), 3549–3557. doi:10.1098/rstb.2009.0138

Slater, M., and Sanchez-Vives, M. V. (2016). Enhancing Our Lives with Immersive Virtual Reality. Front. Robot. AI 3, 74. doi:10.3389/frobt.2016.00074

Slater, M., and Sanchez-Vives, M. V. (2014). Transcending the Self in Immersive Virtual Reality. Computer 47 (7), 24–30. doi:10.1109/MC.2014.198

Slater, M., and Wilbur, S. (1997). A Framework for Immersive Virtual Environments (FIVE): Speculations on the Role of Presence in Virtual Environments. Presence: Teleoperators & Virtual Environments 6 (6), 603–616. doi:10.1162/pres.1997.6.6.603

Stavroulia, K. E., Christofi, M., Baka, E., Michael-Grigoriou, D., Magnenat-Thalmann, N., and Lanitis, A. (2019). Assessing the Emotional Impact of Virtual Reality-Based Teacher Training. Ijilt 36, 192–217. doi:10.1108/IJILT-11-2018-0127

Strojny, P. M., Dużmańska-Misiarczyk, N., Lipp, N., and Strojny, A. (2020). Moderators of Social Facilitation Effect in Virtual Reality: Co-presence and Realism of Virtual Agents. Front. Psychol. 11, 1252. doi:10.3389/fpsyg.2020.01252

Takac, M., Collett, J., Blom, K. J., Conduit, R., Rehm, I., and De Foe, A. (2019). Public Speaking Anxiety Decreases within Repeated Virtual Reality Training Sessions. PloS one 14 (5), e0216288. doi:10.1371/journal.pone.0216288

Takahashi, T., Tanaka, K., and Oka, N. (2018). “Adaptive Mixed-Initiative Dialog Motivates a Game Player to Talk with an NPC,” in Proceedings of the 6th International Conference on Human-Agent Interaction, 153–160. doi:10.1145/3284432.3284436

Tarr, M. J., and Warren, W. H. (2002). Virtual Reality in Behavioral Neuroscience and beyond. Nat. Neurosci. 5 (11), 1089–1092. doi:10.1038/nn948

Tsakiris, M., Prabhu, G., and Haggard, P. (2006). Having a Body versus Moving Your Body: How agency Structures Body-Ownership. Conscious. Cogn. 15 (2), 423–432. doi:10.1016/j.concog.2005.09.004

Vard, A., Rahani, V., and Najafi, M. (2018). Claustrophobia Game: Design and Development of a New Virtual Reality Game for Treatment of Claustrophobia. J. Med. Signals Sens 8 (4), 231. doi:10.4103/jmss.JMSS_27_18

Villena Taranilla, R., Cózar-Gutiérrez, R., González-Calero, J. A., and López Cirugeda, I. (2019). Strolling through a City of the Roman Empire: an Analysis of the Potential of Virtual Reality to Teach History in Primary Education. Interactive Learn. Environments, 1–11. doi:10.1080/10494820.2019.1674886

von der Pütten, A. M., Krämer, N. C., Gratch, J., and Kang, S.-H. (2010). "It Doesn't Matter what You Are!" Explaining Social Effects of Agents and Avatars. Comput. Hum. Behav. 26, 1641–1650. doi:10.1016/j.chb.2010.06.012

Weech, S., Kenny, S., and Barnett-Cowan, M. (2019). Presence and Cybersickness in Virtual Reality Are Negatively Related: a Review. Front. Psychol. 10, 158. doi:10.3389/fpsyg.2019.00158

Wiederhold, B. K., and Riva, G. (2019). Virtual Reality Therapy: Emerging Topics and Future Challenges. Cyberpsychology, Behav. Soc. Networking 22 (1), 3–6. doi:10.1089/cyber.2018.29136.bkw

Wilson, C. J., and Soranzo, A. (2015). The Use of Virtual Reality in Psychology: a Case Study in Visual Perception. Comput. Math. Methods Med. 2015, 1–7. doi:10.1155/2015/151702

Xi, M., and Smith, S. P. (2016). “Supporting Path Switching for Non-player Characters in a Virtual Environment,” in 2016 IEEE Virtual Reality (VR), IEEE, 315–316. doi:10.1109/VR.2016.7504780

Yee, N., Bailenson, J. N., and Ducheneaut, N. (2009). The Proteus Effect. Commun. Res. 36 (2), 285–312. doi:10.1177/0093650208330254

Yee, N., and Bailenson, J. (2007). The Proteus Effect: The Effect of Transformed Self-Representation on Behavior. Hum. Comm Res 33 (3), 271–290. doi:10.1111/j.1468-2958.2007.00299.x

Yildirim, C. (2021). “An Immersive Model of User Trust in Conversational Agents in Virtual Reality,” in 2021 Third International Conference on Transdisciplinary AI (TransAI), IEEE, 17–18. doi:10.1109/TransAI51903.2021.00011

Zibrek, K., Martin, S., and McDonnell, R. (2019). Is Photorealism Important for Perception of Expressive Virtual Humans in Virtual Reality? ACM Trans. Appl. Percept. 16 (3), 1–19. doi:10.1145/3349609

Keywords: virtual reality, agents, avatars, social presence, social interaction

Citation: Kyrlitsias C and Michael-Grigoriou D (2022) Social Interaction With Agents and Avatars in Immersive Virtual Environments: A Survey. Front. Virtual Real. 2:786665. doi: 10.3389/frvir.2021.786665

Received: 05 October 2021; Accepted: 10 December 2021; Published: 11 January 2022.

Copyright © 2022 Kyrlitsias and Michael-Grigoriou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Despina Michael-Grigoriou, [email protected]

This article is part of the Research Topic

Do we really interact with artificial agents as if they are human?

Challenges and barriers in virtual teams: a literature review

  • Research Article
  • Published: 20 May 2020
  • Volume 2, article number 1096 (2020)

  • Sarah Morrison-Smith   ORCID: orcid.org/0000-0002-4959-807X 1 &
  • Jaime Ruiz 2  

355k Accesses

210 Citations

76 Altmetric

Virtual teams (i.e., geographically distributed collaborations that rely on technology to communicate and cooperate) are central to maintaining our increasingly globalized social and economic infrastructure. “Global Virtual Teams” that include members from around the world are the most extreme example and are growing in prevalence (Scott and Wildman in Culture, communication, and conflict: a review of the global virtual team literature, Springer, New York, 2015). There has been a multitude of studies examining the difficulties faced by collaborations and use of technology in various narrow contexts. However, there has been little work in examining the challenges faced by virtual teams and their use of technology to mitigate issues. To address this issue, a literature review was performed to highlight the collaboration challenges experienced by virtual teams and existing mitigation strategies. In this review, a well-planned search strategy was utilized to identify a total of 255 relevant studies, primarily focusing on technology use. The physical factors relating to distance are tightly coupled with the cognitive, social, and emotional challenges faced by virtual teams. However, based on research topics in the selected studies, we separate challenges as belonging to five categories: geographical distance, temporal distance, perceived distance, the configuration of dispersed teams, and diversity of workers. In addition, findings from this literature review expose opportunities for research, such as resolving discrepancies regarding the effect of tightly coupled work on collaboration and the effect of temporal dispersion on coordination costs. Finally, we use these results to discuss opportunities and implications for designing groupware that better support collaborative tasks in virtual teams.

1 Introduction

Virtual teams (i.e., geographically distributed collaborations that rely on technology to communicate and cooperate) have several potentially beneficial aspects that aid productivity. Much like collaboration in co-located teams, collaboration in virtual teams refers to synchronous and asynchronous interactions and tasks to achieve common goals. The use of virtual teams allows organizations to enroll key specialists, regardless of their physical location [ 106 , 151 ]. This allows organizations to optimize teams by using only the best talent available [ 63 , 136 ]. In theory, virtual teams also reduce the need for travelling between sites, which should reduce costs in terms of time, money, and stress [ 196 ]. It was estimated that by 2016, more than 85% of working professionals were in some form of virtual team [ 235 ]. Virtual teams have thus become vital to maintaining our increasingly globalized social and economic infrastructure.

Similar to co-located teams, virtual teams participate in a variety of collaborative activities such as formal and informal meetings using technology like video conferencing (e.g., Zoom [ 121 ] and Skype [ 175 ]) and text (e.g., Slack [ 232 ] and Microsoft Teams [ 176 ]), file transfer, and application sharing [ 191 ]. However, virtual teams experience collaboration difficulties that make it hard for them to be as successful as co-located teams [ 64 , 151 , 191 ]. As a result, virtual teams spend substantial time and money to relocate team members for specific projects to avoid the hindrances to teamwork associated with distance [ 231 , 257 ]. It is therefore important to develop technology that can better support virtual teams, reducing the need for costly relocations and mitigating the problems that arise when relocation is not a viable solution.

Despite previous research examining the difficulties faced by collaborations and use of technology in specific contexts, such as distributed software development, there has been little work in examining the challenges faced by all virtual teams and their use of technology to mitigate issues. This understanding is vital to the development and utilization of technology to support virtual teams. Thus, this paper has two goals: (1) to elucidate the factors and challenges that hinder collaboration in virtual teams and (2) provide recommendations for designing groupware to better support collaboration in virtual teams, while also identifying opportunities for the Human–Computer Interaction (HCI) community to design this technology.

To achieve our goals, a Literature Review (LR) was performed with a well-planned search strategy that identified a total of 255 relevant studies, primarily focusing on technology use. Based on the selected studies, we categorized challenges as being related to: geographical distance, temporal distance, perceived distance, the configuration of dispersed teams, and diversity of workers. In addition, results from this LR identify opportunities for research, such as resolving discrepancies regarding the effect of tightly coupled work on collaboration, the effect of temporal dispersion on coordination costs, and whether virtual teams encounter more work-culture related problems than co-located teams. From the synthesis of these papers, we present four design implications for designing groupware that better support collaborative tasks in virtual teams.

This literature review explores the factors and challenges associated with collaboration in virtual teams. This paper begins with a review of related LRs in the domain of collaboration in Sect.  2 and progresses to a description of the method used to conduct the LR in Sect.  3 . Sections  5 and 6 explore issues related to distance and other contributing factors, respectively. Next, in Sect.  7 , findings from Sects. 5 and 6 are summarized, leading to Sect.  8 which completes the LR by presenting a set of four design implications for the development of groupware that supports collaboration in virtual teams.

2 Related work

Prior work includes eight systematic literature reviews surveying various topics related to distance collaboration. These topics fall into two categories: investigations of virtual teams in the domain of distributed software development (DSD) and explorations of the factors that influence collaboration in broader contexts.

Research into the challenges faced in DSD has determined which factors associated with the relationship between distribution, coordination, and team performance are most commonly studied in software development, namely dimensions of dispersion (e.g., geographical, temporal, organizational, work process, and cultural dispersion) and coordination mechanisms (e.g., organic or social coordination and mechanistic or virtual coordination) [ 183 ]. Several challenges (e.g., geographical, temporal, cultural, and linguistic dispersion [ 146 , 185 ]) and best practices or practical solutions (e.g., agile methods, test-driven development [ 146 ], frequent site visits and face-to-face meetings [ 185 , 233 ]) have been identified for traditional DSD teams [ 185 ] and teams that use a ‘follow-the-sun’ approach (i.e., where teams hand off work at the end of the day in one time-zone to workers beginning their day in another) [ 146 ]. Additional work identified opportunities for future research, such as addressing challenges present in multi-organizational software projects and supporting the development of coordination needs and methods over the course of a project [ 184 ]. This category of research also includes a study that classified empirical studies in DSD [ 64 ], revealing that communication warrants further exploration to better support awareness in this context [ 239 ].

These studies are informative and discuss several of the challenges that appear later in this LR (e.g., geographical, temporal, cultural, and linguistic dispersion). However, it is not guaranteed that the findings from the DSD studies with regards to these dimensions directly translate to collaboration in another context. In contrast, this paper examines distance collaboration in all virtual teams.

Other studies have examined the factors affecting collaboration in general. Mattessich and Monsey identified 19 factors necessary for successful collaboration, including the ability to compromise, mutual respect and trust, and flexibility [ 167 ]. Similarly, Patel et al. [ 201 ] developed a framework based on the categorization of seven factors related to collaboration (namely context, support, tasks, interaction processes, teams, individuals, and overarching factors) for use in collaborative engineering projects in the automotive, aerospace, and construction sectors.

In contrast to the results of the DSD studies, these findings apply to a broad range of contexts. However, since these literature reviews primarily focus on co-located collaboration, it is difficult to discern how the factors identified by these studies influence virtual teams. This paper differs by focusing only on virtual teams.

3 Method

Relevant papers were extracted for this LR using the guidelines proposed by Kitchenham and Charters [ 138 ] for performing Systematic Literature Reviews in software engineering, with the adjustments recommended by Kitchenham and Brereton [ 137 ]. These guidelines divide the review process into three steps:

1. Planning the review: In this step, the research questions and review protocol are defined. This will be discussed in the remainder of Sect. 3.

2. Conducting the review: This step focuses on executing the review protocol created in the previous step. This will also be discussed in Sect. 3.

3. Reporting the review: This final step documents, validates, and reports the results of the review. This will be the subject of Sects. 5 and 6.

3.1 Planning the review

This subsection focuses on developing the list of research questions used to generate the keywords for extracting papers and on specifying the search methodology.

3.1.1 Specifying research questions

The first stage of this literature review began by defining research questions using the Goal-Question-Metric approach described by Van Solingen et al. [ 258 ], which systematically organizes measurement programs. This model specifies the purpose, object, issue, and viewpoint that comprise a goal, which is then distilled into research questions and used to create metrics for answering those questions. The goal of this LR is:

Purpose: Understand and characterize

Issue: The challenges

Object: Related to collaboration

Viewpoint: Faced by workers in virtual teams

Using this goal, these research questions were derived:

1. What are the factors and challenges that impact distance collaboration?

1a. What factors specific to distance cause issues?

1b. What other factors contribute to these issues?

2. How can we design technology for supporting virtual teams?

The purpose of asking question 1 is to outline previous research investigating collaboration challenges. The expected outcome will be a comprehensive view of challenges affecting collaborations and identification of gaps or areas warranting future exploration. Research Question 1a will be the topic of Sect.  5 while Research Question 1b will be explored in Sect.  6 . Research Question 2, however, focuses on the development of technology for supporting collaboration. The answers to this question will yield an overview of design implications for the creation of groupware, which will be discussed in Sect.  8 .

3.1.2 Developing and executing the search strategy

The research questions listed above were used to identify keywords to use as search terms. For example, for the sub-question ‘What factors can be attributed to distance?’ the following keywords were selected: collaboration, distance, challenge; in addition, synonyms and related words were also searched (e.g., geography, teamwork). This search can be described by the following Boolean search query:

(collaboration OR teamwork OR CSCW) AND (challenge OR problem) AND (distance OR geography)
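The way such a query is assembled from keyword groups can be illustrated with a minimal Python sketch. The function name `build_query` and the grouping of synonyms into lists are assumptions made for illustration; the source does not describe any tooling used to generate its queries.

```python
def build_query(keyword_groups):
    """Join synonyms with OR inside each group, then join the groups with AND.

    `keyword_groups` is a hypothetical structure: one list of synonyms per concept.
    """
    clauses = []
    for group in keyword_groups:
        clauses.append("(" + " OR ".join(group) + ")")
    return " AND ".join(clauses)


# Keyword groups taken from the example query in the text.
groups = [
    ["collaboration", "teamwork", "CSCW"],
    ["challenge", "problem"],
    ["distance", "geography"],
]

print(build_query(groups))
```

Keeping each concept in its own list makes it straightforward to add synonyms as new terms are identified from collected papers.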

Our search methodology used multiple searches as terms were either exhausted or identified by collected papers. The generated search terms were used to conduct searches using Google Scholar since this search engine conducts a meta-search that returns results from several paper repositories (such as Science Direct, ResearchGate, Academia.edu, and the ACM digital library). During the review, it became apparent that after the first 8–9 pages of results, we reached concept saturation. As a result, we limited our search to the first 10 pages for a total of 1200 potential sources.

In addition, collected papers were used to generate additional searches via a ‘snowballing’ effect [ 26 , 249 ]. Specifically, collected papers were used to generate additional keywords, identify additional papers through the bibliography, identify newer papers that cited them, and identify authors who had written important papers published in relevant conferences. These included papers published in the ACM conference on Computer-Supported Cooperative Work (CSCW) and the ACM International Conference on Supporting Group Work (GROUP). These authors were searched for using the identified search engines, and all their papers were evaluated for inclusion. In addition, other researchers proposed sources that were used to boost paper extraction. These additional methods were used because prior work by Greenhalgh and Peacock [ 91 ] found that less efficient methods like snowballing are likely to identify important sources that would otherwise be missed, since predefined protocol-driven search strategies cannot solely be relied on.
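The backward (bibliography) and forward (cited-by) parts of snowballing can be sketched as a breadth-first expansion over a citation graph. This is only an illustration under stated assumptions: the dictionaries `references` and `cited_by`, the predicate `is_relevant`, and the `max_rounds` limit are all hypothetical, and the review itself was carried out manually rather than with code like this.

```python
from collections import deque


def snowball(seeds, references, cited_by, is_relevant, max_rounds=2):
    """Expand a seed set of papers through their references and citing papers.

    references[p] -> papers that p cites (backward snowballing)
    cited_by[p]   -> papers that cite p (forward snowballing)
    is_relevant   -> stand-in for the inclusion/exclusion criteria
    """
    collected = set(seeds)
    frontier = deque((paper, 0) for paper in seeds)
    while frontier:
        paper, depth = frontier.popleft()
        if depth >= max_rounds:
            continue  # do not expand papers found in the final round
        neighbours = references.get(paper, []) + cited_by.get(paper, [])
        for candidate in neighbours:
            if candidate not in collected and is_relevant(candidate):
                collected.add(candidate)
                frontier.append((candidate, depth + 1))
    return collected


# Toy citation graph with made-up paper identifiers.
references = {"A": ["B", "C"], "B": ["D"]}
cited_by = {"A": ["E"], "C": ["X"]}
found = snowball(["A"], references, cited_by, lambda p: p != "X")
print(sorted(found))
```

The relevance predicate is applied before a paper is expanded, so excluded papers do not contribute further candidates.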

3.1.3 Inclusion and exclusion criteria

The first ten pages of results from Google Scholar were reviewed since keywords occasionally returned a large number of potential papers. All papers were reviewed from searches resulting in fewer than ten pages of results. As part of our search methodology, we utilized several inclusion and exclusion criteria to filter the collected papers from the potential papers found using the systematic search and snowballing. These inclusion and exclusion factors are listed in Table 1. Figure 1 shows the number of identified papers that met the inclusion criteria across 5-year periods.

Fig. 1 Distribution of cited papers across time
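Grouping publication years into 5-year periods, as in Fig. 1, can be sketched as follows. The bin boundaries (periods starting in years divisible by five) and the sample years are assumptions for illustration; the source does not state where its periods begin or end.

```python
from collections import Counter


def count_by_period(years, period=5):
    """Count papers per fixed-width period, keyed by (first_year, last_year)."""
    counts = Counter()
    for year in years:
        start = year - (year % period)  # e.g. 2003 -> 2000 when period is 5
        counts[(start, start + period - 1)] += 1
    return dict(sorted(counts.items()))


# Made-up publication years, not data from the review.
sample_years = [1998, 2001, 2003, 2004, 2010, 2014, 2015]
print(count_by_period(sample_years))
```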

3.1.4 Paper categorization

To facilitate analysis, the papers identified as part of the LR, shown in Fig. 1, were further categorized by study type and contribution. Tables 2, 3, 4, 5, 6, 7 and 8 in the “Appendix” contain each paper organized by these categories.

4 Factors affecting virtual teams

Virtual teams are affected by physical factors such as geographic distance, in addition to temporal and perceived distance, which are time-based and cognitive, respectively. These factors are tightly coupled with social and emotional factors, including trust, motivation, and conflicts. Based on the papers in this literature review, we separate these factors into the categories of distance factors (which include geographical (physical), temporal, and perceived distance) and contributing factors that are driven by distance (including the nature of the work, the presence or need for explicit management, and group composition). Each category correlates with a set of challenges that greatly affect virtual teams. Distance categories and their associated challenges are discussed in Sect. 5 to answer Research Question 1a: what factors specific to distance cause challenges that impact distance collaboration? Contributing factors are discussed later in Sect. 6.

5 Distance factors

Distance can be categorized as being primarily geographical, temporal, or perceived. Each category correlates with a set of challenges that greatly affect virtual teams. Distance categories and their associated challenges are discussed in the following sections to answer Research Question 1a: what factors specific to distance cause challenges that impact distance collaboration?

5.1 Geographical distance

Geographical distance has been defined as a measurement of the amount of work needed for a worker to visit a collaborator at that collaborator’s place of work, rather than the physical distance between the two collaborators [ 2 ]. Thus, two physically distant locations could be considered geographically close if they have regular direct flights. Even a distance as small as 30 meters has been shown to have a profound influence on communication between collaborators [ 4 ].

Furthermore, geographical distance is well known to pose challenges for virtual teams [ 191 ]. Olson and Olson explored these challenges at length in 2000 [ 191 ] and 2006 [ 193 ]. Their first work compared remote and co-located work through an analysis of more than ten years of laboratory and field research examining synchronous collaborations [ 191 ]. The 2006 paper presented a follow-up study that synthesized other prior work [ 78 , 190 ] to expand their 2000 contribution [ 193 ]. Findings from both studies identified the following ten challenges that hinder distance work:

1. Awareness of colleagues and their context

2. Motivational sense of presence of others

3. Trust is more difficult to establish

4. The level of technical competence of the team members

5. The level of technical infrastructure

6. Nature of work

7. Explicit management

8. Common ground

9. The competitive/cooperative culture

10. Alignment of incentives and goals

Challenges 1–5 will be discussed in this section while Challenges 6–10 will be topics of interest later in Sect.  6 .

5.1.1 Motivation and awareness in distributed collaborations

The motivational sense of the presence of others has well established ‘social facilitation’ effects, particularly the observation that people tend to work harder when they are not alone [ 193 ]. However, these effects are harder to find and cultivate in remote work, which poses an additional challenge to collaboration. In a similar vein, the difficulty of maintaining awareness of collaborators’ work progress at remote locations without the ability to casually ‘look over their shoulder’ is a significant challenge to collaboration [ 193 ]. These problems likely arise because co-located workers have more opportunities for casual encounters and unplanned conversations [ 144 ], which boost awareness. Similarly, distance prevents the informal visual observations necessary for maintaining awareness [ 8 ]. This is important since workers use the presence of specific teammates in a shared space to guide their work and prefer to be aware of who is sharing their work space [ 71 ]. Furthermore, the inability of virtual team members to observe each other’s actual effort tends to lead to a greater reliance on perceptions and assumptions that could be both biased and erroneously negative [ 206 ]. In addition, in situations where disengagement is not apparent, virtual teams’ reliance on technology to communicate allows team members to disengage from the team due to decreased social impact [ 16 ]. Isolation can have an effect as well: when members of a virtual team become more isolated, their contributions and participation with the team decrease [ 32 ].

The importance of awareness in collaboration is discussed at length by Dourish and Bellotti [ 62 ], who investigate awareness through a case study examining ShrEdit [ 171 ], a text editor that supports multiple users synchronously. In this paper, awareness is defined as ‘an understanding of the activities of others, which provides a context for your own activity’ [ 62 ]. Dourish and Bellotti further stipulate that this context is necessary for guaranteeing that each person’s contributions are compatible with the group’s collective activity and plays a critical role in assessing individual actions in accordance with the group’s goals and progress. This context further allows individuals to avoid duplication of work. Collaborative work is significantly delayed without such awareness [ 193 ]. Moreover, awareness is a mandatory requirement for coordinating group activities, independent of the domain [ 62 ].

Many computer-based technologies have been developed to assist distance workers in maintaining awareness of their collaborators. Research suggests that the adoption of tools that allow members of virtual teams to learn about the timing of each other’s contributions and activities may improve team coordination and learning [ 18 ]. Systems that provide real-time visual feedback about the behaviors of team members can be used as tools to mitigate various sources of “process-loss” in teams (e.g., team effort) [ 89 ]. Some early systems (e.g., [ 17 , 81 , 160 ]) were designed to feature computer-integrated audiovisual links between locations that were perpetually open, the idea being that providing unrestricted face-to-face communication and a ‘media space’ would facilitate collaboration as though the workers were in the same physical space. Since then, a number of modern systems (e.g., [ 153 , 197 ]) have been developed. For example, Glikson et al. [ 89 ] developed an effort visualization tool that calculated effort based on the number of keystrokes that team members made in a task collaboration space. They found that the visualization tool increased team effort and improved performance in teams that had a low proportion of highly conscientious members [ 89 ]. This effect did not hold true for teams with a high proportion of highly conscientious members. See [ 154 ] for a more comprehensive review of awareness-supporting technology.
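To make the idea of keystroke-based effort feedback concrete, the following Python sketch computes each member's share of the keystrokes recorded in a shared space. It is not the tool of Glikson et al. [ 89 ], whose actual calculation is not described here; the log format, the function name `effort_shares`, and the use of a simple proportion are all assumptions.

```python
from collections import Counter


def effort_shares(keystroke_log):
    """Return each member's fraction of all logged keystrokes.

    `keystroke_log` is a hypothetical list of (member, key) events.
    """
    counts = Counter(member for member, _key in keystroke_log)
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing logged yet, so no shares to display
    return {member: n / total for member, n in counts.items()}


# Made-up events for two hypothetical team members.
log = [("ana", "a"), ("ana", "b"), ("ben", "c"), ("ana", "d")]
print(effort_shares(log))
```

A visualization layer could then render these shares for the whole team, which is the kind of real-time feedback the paragraph above describes.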

The concept of awareness as a direction for research has been criticized. In 2002, Schmidt argued that the term awareness was ‘ambiguous and unsatisfactory (p. 2)’ due to its exceptionally wide range of diverse applications and tendency to be paired with an adjective (e.g., ‘passive awareness’ [ 62 ]) in an attempt to lend some specificity. Instead, Schmidt recommended that researchers pursue more explicit, ‘researchable questions (p. 10)’ rather than focus on the enigmatic concept of awareness. This is not merely a call to change terminology but a call for a fundamental shift in the way that research in this area is approached. Despite this recommendation, awareness is still commonly explored [ 7 , 134 ], indicating a disagreement within the community that has yet to be resolved and that presents a research opportunity.

5.1.2 Establishing trust

Throughout the relevant studies canvassed in this paper, trust has been defined in a multitude of ways. Cummings and Bromiley [ 53 ] define trust within a collaboration as the worker’s belief that their team (a) ‘makes a good-faith effort to behave in accordance with any commitments both explicit or implicit, (b) is honest in whatever negotiations preceded such commitments, and (c) does not take excessive advantage of another even when the opportunity is available’. Pinjani and Palvia [ 208 ], in contrast, have a simpler definition of trust as the ‘level of confidence exercised among team members,’ and Choi and Cho [ 42 ] describe interpersonal trustworthiness as characterized by ability, benevolence, integrity, and goal congruence. Trust in the business literature is described as a person’s psychological state which indicates the person’s expectation that their team member will not act in a self-interested manner at the expense of the person’s welfare, which increases readiness to accept vulnerability [ 44 ]. Cho redefines this as a person’s belief in the beneficial actions of another even when the other is given the opportunity to act in self-interest [ 41 ]. Along with this, De Jong et al. define trust as ‘a shared and aggregate perception of trust that team members have for each other’ [ 59 ]. Lastly, Meyerson et al. [ 174 ] describe a specific type of trust, known as ‘swift trust’, which occurs in temporary organizations. The commonalities among these definitions include a perception that trust involves the belief that a collaborator will act in a beneficent manner as opposed to self-interest and will act in good faith to honor commitments.

According to prior work [ 23 , 42 ], trust is the key variable that is crucial for all aspects of collaboration. This includes team effectiveness, since trust determines whether team members ask each other for help, share feedback, and discuss issues and conflicts [ 23 ]. Team trust has a significant effect on team performance [ 59 ] and can be considered the ‘glue’ that holds collaborations together [ 48 ]. In fact, building mutual trust and personal knowledge about collaborators is more important to a good collaboration than resolving technical issues [ 250 ]. Furthermore, trust is particularly important in virtual teams since interactions on computer-mediated communication (CMC) technologies tend to be superficial (i.e., lacking contextual cues such as facial expressions and tone of voice) [ 38 , 155 , 267 ], impersonal, and less certain [ 155 ].

Trust is linked to positive aspects of collaboration. For example, commitment to the team and project is greatly influenced by trust [ 28 ]. Trust can also improve collaboration infrastructure [ 10 ] and is also crucial for the occurrence of normative actions [ 48 ]. Maruping and Agarwal [ 165 ] found that building trust early on in a virtual collaboration plays a critical role in developing adequate group functioning and the ability to manage social activities. In addition, virtual teams that develop trust early may notice information confirming the competence of their team members and may not notice contradicting evidence [ 273 ]. As a result of their early development of trust, members of these teams also gain the confidence to engage in normative actions that sustain both trust and later performance [ 48 ]. While some research has found that the relationship between early trust and performance is stronger in highly virtual teams than in less virtual teams [ 163 ], whether performance actually improves is up for debate. Some prior work [ 128 ] reports positive effects of trust on performance while others report negligible or no effects [ 124 ]. That being said, trust has an effect on the perception of performance such that when trust is high in a collaboration, the team’s perception of its performance is higher [ 182 ].

Trust is more difficult to establish and maintain in geographically dispersed collaborations [ 170 , 193 , 220 ] for a variety of reasons, including the lack of strong relationships common to co-located teams [ 36 , 37 , 38 , 123 ], difficulties having in-depth personal interactions due to the absence of nonverbal cues, and difficulties inferring the intentions of others [ 67 ]. Trust is also dependent on frequency of interactions, which may be lower in virtual teams [ 273 ]. Swift trust in virtual teams is particularly fragile due to the unexpected disruptions and differences across time, distance, organization, and culture in virtual teams [ 266 ]. Teams that interact virtually are considerably less likely to develop trust [ 216 ]. Furthermore, trust develops in a sequential approach in co-located teams but follows an ad-hoc, unpredictable approach in virtual teams [ 147 ].

This difficulty in establishing trust has profound effects on collaboration, e.g., (1) corrosion of task coordination and cooperation [ 193 ], (2) decreased eagerness to communicate [ 101 ], (3) inability to systematically cope with unstructured tasks and uncertainty [ 123 ], (4) fewer members willing to take initiative [ 123 ], (5) lack of empathy for teammates [ 132 ], (6) lower amounts of feedback from collaborators [ 123 ], and (7) increased risk [ 218 ]. Additionally, several studies (e.g., [ 116 , 142 , 188 ]) showed that low trust caused by distance affected workers’ identification of themselves as belonging to a team spanning locations. These issues have detrimental effects on collaborations that can delay or even halt the progress of a project.

Lack of trust is most pronounced during the initial stage of the collaboration and tapers off throughout the course of the project [ 21 ], implying that there are mitigating factors for the effect of distance on trust. Taking social approaches, such as promoting social exchanges early on in the life of a project [ 123 ], or creating opportunities for casual, non-work-related interactions between collaborators [ 193 ], can improve trust. However, these types of informal interactions more commonly occur face-to-face [ 193 ]. Furthermore, [ 186 ] identified face-to-face communication as having an ‘irreplaceable’ role in building and repairing trust.

Face-to-face communication is not always possible in distance collaborations, which is why [ 20 ] investigated challenges associated with trust, particularly delayed trust (slowed rate of progress towards full cooperation) and fragile trust (susceptibility towards negative ‘opportunistic behavior (p. 1)’), via an evaluation of four communication methods commonly used in distance collaborations: face-to-face, audiovisual (e.g., Skype [ 175 ], Google Hangouts [ 90 ], FaceTime [ 6 ]), audio (telephone), and text-based (email, Slack [ 232 ]) tools. They found that the absence of body language, subtle voice inflections, facial expressions, etc. causes delays in workers’ decisions whether to trust a new collaborator and impedes expression of their own trustworthiness. This finding agrees with Olson and Olson’s assertion that the presence of video when communicating helps in situations where workers are not familiar with each other [ 193 ]. The effect of stripping body language, subtle voice inflections, facial expressions, etc. from communication was clearly shown by the performance of people participating in a social dilemma game who relied on distance technology for communication; these collaborations showed markedly more fragile trust than those that communicated face-to-face. Text-based communication was the worst at establishing and maintaining trust, although audiovisual and audio technologies also showed some delayed and fragile trust. It is unsurprising then that trust development is enhanced by facilitating an initial face-to-face meeting at the beginning of a team’s relationship [ 163 ]. Furthermore, the effectiveness, reliability, and usefulness of the CMC technology used by the virtual team affects trust [ 42 ]. The personal characteristics of team members (e.g., ability, integrity, competence, fairness, honesty, openness) and the level of autonomy in a team play an important part in establishing trust [ 42 ].

From these works, we see that not only does distance influence trust, but this effect can partially be attributed to the use of communication technology adopted by distance collaborations. This influence may be further affected by the manner in which communication technology is used, since irregular, unpredictable, and inequitable communication between collaborators hampers trust [ 123 ]. Thus, it is important for future research seeking to address trust in collaboration to consider communication methods, particularly since trust in collaboration is still a relevant issue [ 29 , 30 , 217 ].

5.1.3 Informal and face-to-face communication

Prior work has identified team communication as one of the fundamental challenges associated with virtuality [ 5 ]. Communication in virtual teams is a key predictor of various outcomes such as improved performance and increased commitment [ 76 ]. Often in co-located collaborations, informal communication (i.e., ‘coffee talk’ [ 57 ]) accounts for up to 75 minutes of a workday [ 102 ]. These crucial exchanges often occur after meetings or during unplanned encounters in the hallway [ 8 ] and have profound effects on collaboration. In contrast, communications in virtual teams are often more formal than in co-located settings and focus more on work-related issues [ 13 ]. This is a result of limited opportunities for the informal and unintentional information exchanges that often happen in shared spaces such as the hallway, water cooler, or parking lot [ 13 ]. This in turn diminishes a virtual team’s ability to share knowledge [ 92 ]. Informal contact plays an important role in facilitating trust and critical task awareness [ 2 ]. Spontaneous, informal communication has been shown to foster the feeling of being a part of a cohesive team [ 11 , 102 , 132 ] and assist the provision of corrective feedback [ 8 ]. These types of informal encounters are particularly important for unstable, dynamic groups [ 2 ].

Informal communication is associated with face-to-face encounters [ 73 , 191 ]; thus, face-to-face communication plays an important role in collaboration [ 64 ] and has been described as being ‘crucial’ [ 196 ] or ‘indispensable’ [ 11 ], particularly at the beginning of a project. Frequent face-to-face interactions enable collaboration in virtual teams [ 54 ] and are credited with the ability to dramatically boost the strength of work and social ties within the team [ 133 ], which promotes a worker’s sense of belonging to the team and awareness of group activities [ 2 ], as well as boosting mutual trust and understanding, which is critical for preventing conflicts [ 8 ]. In addition, face-to-face communication is associated with higher levels of consensus within groups, higher perceived quality, more communication, and greater efficiency in completing tasks [ 86 ]. For this reason, many authors recommend that members of virtual teams meet face-to-face when possible, particularly during the initial launch [ 136 , 151 , 265 ], when a face-to-face meeting can create a lasting bridge across geographical, temporal, and socio-cultural distance [ 265 ]. (Socio-cultural distance is discussed in further depth in Sect. 6.4.2.) It is unsurprising, then, that traveling to obtain face-to-face contact is imperative for project success [ 116 ].

Opportunities for informal interactions are greatly reduced by geographic distance between collaborators [ 93 , 132 ]. As a result, remote collaborators are often excluded from spontaneous decisions that are made outside formal meetings [ 8 ]. This exclusion results partly from the increased effort needed to reach out and contact a teammate [ 101 ], and likely partly from the correlation between distance and diminished face-to-face communication [ 52 , 133 , 141 , 144 ]. Geographic barriers to face-to-face communication include an increase in cost and logistics [ 2 ] and the burdens of travel in terms of money and time [ 11 ].

It is no surprise, then, that virtual teams show a marked increase in online activity [ 191 , 213 ] and have a higher reliance on CMC technology [ 215 ]. Computer-mediated communication (CMC) technology refers to the use of computers for communication between individuals []. This technology includes audiovisual, audio, and text-based tools. Use of this technology comes with significant challenges. Synchronous technology (i.e., audio and audiovisual tools) requires that all parties be available at a particular time. Some research has shown that it may be difficult to ascertain a remote collaborator’s availability for a synchronous meeting [ 101 ] and that dependence on electronic communication constrains informal, spontaneous interaction [ 61 ], while others argue that CMC is dynamic and can be used on an ad-hoc and as-needed basis with no need for scheduling, presenting fewer logistical challenges [ 234 ]. However, it is important to note that, as in the case of the telephone, initiating spontaneous communication could be perceived as intrusive [ 144 ]. In addition, audio technology ‘distorts’ verbal cues and removes visual cues [ 20 ]. Audiovisual technology is also known to mask both verbal and visual cues in addition to constraining the visual field [ 20 ]. CMC often lacks support for non-direct and nonverbal interactions (e.g., body language, facial expressions), which greatly hinders communication in geographically dispersed virtual teams [ 67 ] by making interactions more difficult [ 92 ]. Thus, the choice of CMC technology has a heavy influence on communication because each method offers a different capacity to convey verbal and nonverbal cues [ 178 ]. It is therefore recommended to use several types of CMC technologies either concurrently (e.g., face-to-face communication accompanied by documents; telephone conferencing with synchronous electronic conferencing) or consecutively (e.g., conveying information via e-mail first, followed by converging over the phone) [ 60 ].

Virtual teams that rely on CMC in lieu of face-to-face communication are more likely to experience less positive affect and have a diminished affective commitment to their teams [ 126 ]. Furthermore, compared to face-to-face feedback, computer-mediated feedback reduces perceptions of fairness [ 3 ]. This lack of face-to-face contact results in virtual teams having a lower sense of cohesion and personal rapport between team members [ 263 ]. Members of virtual teams may also divide their attention between various tasks while simultaneously participating in teamwork interactions due to the asynchronous nature of communication media, resulting in a lack of investment in the tasks [ 163 ]. As a result, communication timeliness has a higher influence on performance in virtual teams [ 163 ]. Furthermore, virtual teams that rely on CMC technology (e.g., instant messaging) to supplement communication in the absence of face-to-face interactions may have difficulties in their decision-making processes [ 173 ].

However, overall, communication technologies (including text-based tools) take more time and effort to effectively communicate information and are missing important social information and nonverbal cues that help establish ties between collaborators [ 64 ]. This has important implications for situations where a high volume of communication is necessary. Due to the extra effort required to communicate through computer-mediated modalities (e.g., email), virtual teams must put in extra effort to manage high volumes of messages, which can hinder performance [ 163 ]. Furthermore, when teams use email for communication, it becomes difficult to determine whether the information contained within the email was understood in the absence of vocal and nonverbal cues [ 163 ]. To combat this, Marlow et al. [ 163 ] suggest using closed-loop communication to prevent misunderstandings by providing opportunities for clarification that would otherwise not accompany virtual communication. They argue that the use of closed-loop communication will enhance performance in virtual teams [ 163 ].

Since remote collaborations must rely on technology in lieu of face-to-face communication, the level of technical competence of the team members can pose an additional challenge [ 193 ]. Teams that are unable to adopt and integrate basic technology into their everyday workflow are unlikely to use more complicated and sophisticated collaboration technology (e.g., multi-pane videoconferencing) [ 191 ] that may better support visual and verbal cues, enriching distance communication. Furthermore, the level of technical infrastructure can also create collaboration challenges [ 193 ]. Technology for remote work fails without adequate technical support or resources. Reliability is also an issue with communication technology—new technology must be stable enough to ‘compete with the well-established reliability of the telephone’ [ 15 ].

There are some advantages to using computer-mediated communication technology in virtual teams. For example, asynchronous technology (e.g., text-based tools) provides the ability to take one’s time when asking a question or crafting a response [ 144 , 261 ], which leads to efficient, focused conversations [ 77 , 144 ] that can be quicker than other forms of communication. CMC is also shown to increase participation among team members [ 212 ], facilitate unique ideas [ 86 , 212 ], and reduce the number of dominant members [ 212 ]. In a similar vein, Fjermestad [ 79 ] found that groups that relied on CMC experienced higher decision quality, depth of analysis, equality of participation, and satisfaction than groups that primarily met face to face. Finally, virtual teams that do not meet face to face may be better at adapting their conceptualization of a task in response to a team member completing a task in a novel manner [ 163 ].

Additional factors, such as experience with a task, interdependence, and the temporal stage of team development can impact team performance when relying on CMC technology. For example, when teams have experience with the task at hand, with each other, and with their communication method, there is less of a need for synchronous CMC technology (e.g., video conferencing) [ 60 ]. In contrast, when teams do not have this extensive experience, there is a greater need for synchronous CMC technology [ 60 ]. Organizational structure, levels of interdependence, and media richness (which ranges from face-to-face communication to simple documents) also influence the effectiveness of communication [ 140 ]. These factors vary depending on the communication method’s capacity for immediate feedback, ability to facilitate nonverbal cues, and level of personalization [ 140 ]. In addition to this, Maruping and Agarwal [ 165 ] found that matching the functionalities of the CMC technology to specific tasks will result in higher levels of effectiveness in virtual teams. Furthermore, the stage that a virtual team has reached in its development will also affect communication [ 165 ]. Teams in their early stages of development should use CMC technologies that facilitate expression in order to mitigate relationship conflict [ 165 ]. Video-conferencing technologies are particularly suited to this situation, being both synchronous and media rich [ 165 ].

From the identification of these challenges, we can clearly see that existing tools and infrastructures have limitations that are preventing communication technology from fully supporting informal interactions. Thus, we are left with a need for other methods that support informal communication in geographically dispersed collaborations.

5.1.4 Intra-team conflict

In Jehn et al.’s exploration of everyday conflict through qualitative investigation of six organizational work teams, intra-team conflict is categorized as being either affective (i.e., interpersonal), task-based, or process-based (i.e., relating to responsibilities and delegation of workers for tasks) [ 125 ]. All three types of conflict have been investigated within the context of geographically distributed versus co-located teams, with mixed results. Several researchers have concluded that geographically distributed teams experience higher levels of conflict [ 8 , 46 , 103 , 108 , 188 , 261 ]. In particular, geographically distributed teams are more susceptible to interpersonal [ 108 ] and task-based conflict [ 108 , 179 ]. There is some evidence that conflict has a more ‘extreme’ [ 107 , 159 ] or ‘detrimental’ [ 179 ] effect on distributed teams as opposed to co-located ones. This effect can likely be attributed to the evidence that conflict in distributed teams is known to escalate and often remains unidentified and unaddressed for long periods of time [ 8 ]. As a result of reliance on computer-mediated communication, virtual teams featuring high geographical dispersion have higher perceptions of unfairness, which also leads to internal conflict [ 244 ].

One pervasive issue is the development of geographically based subgroups within a collaboration that provoke us-versus-them attitudes [ 8 , 46 ]. Armstrong and Cole observed that the word ‘we’ was often used to refer to co-located workers, regardless of which group the workers were assigned to [ 8 ]. In another case, a team of international collaborators spread across four sites ‘fought among themselves as if they were enemies’. Interviews revealed that the team was actually composed of four groups under one manager and did not act or feel like one cohesive team [ 8 ]. These conflicts are similar to those associated with communicating at a distance. Conflicts frequently occur as a consequence of assumptions and incorrectly interpreted communications [ 103 ]. Furthermore, missing information and miscommunications between geographically distant sites result in teammates making harsh attributions about their collaborators at other locations [ 46 ]. These types of intra-group conflicts can have important ramifications for distant collaborations. Us-versus-them attitudes often lead to limited information flow, which in turn leads to reduced cohesion and faulty attributions [ 46 ]. Moreover, intra-team conflict causes problems that result in delays in work progress [ 8 ] and resolution of work issues [ 103 ].

Researchers have identified several factors that can mitigate conflict in virtual teams. Both shared context [ 108 ] and a shared sense of team identity have a moderating effect on conflict [ 108 , 179 ], particularly task and affective conflict [ 108 , 179 ]. Familiarity, in addition, has been shown to reduce conflict [ 107 ]. Spontaneous communication, which, as previously discussed, is primarily achieved face-to-face, has been demonstrated to mitigate conflict in virtual teams, particularly due to its role in facilitating the identification and handling of conflict [ 108 ]. There are also more instances of task conflict in teams that rely heavily on communication technology [ 179 ]. Specific types of conflict can be managed through different forms of computer-mediated communication technology. Task-related conflict, for example, is best managed through synchronous communication technologies such as video-conferencing [ 165 ]. Conflict related to processes can be effectively handled using asynchronous communication technologies that also document the team’s agreements regarding tasks and responsibilities [ 165 ]. In this case, immediate feedback is not as necessary [ 165 ].

Although the above work largely agrees that geographic distance has a negative effect on conflict, contradictions do exist in the literature. In particular, Mortensen and Hinds’ [ 179 ] examination of 24 product development teams found no significant difference in affective and task-based conflict between co-located and distributed teams, which is in direct conflict with their later work [ 108 ]. This discrepancy is particularly interesting given that the participants in both studies did research and product development, and are therefore comparable. Thus, it is uncertain which conclusion is accurate, presenting an open question.

5.2 Temporal distance

Temporal distance is distinctly different from geographical distance and should be treated as a separate dimension [ 49 ]. While geographical distance measures the amount of work needed for one collaborator to visit another at that collaborator’s place of work, temporal distance is considered to be a directional measurement of the temporal displacement experienced by two collaborators who want to interact with each other [ 2 ]. Temporal distance can be caused by both time shifts in work patterns and differences in time zones [ 219 ]. In fact, time zone differences and time shifts in work patterns can be manipulated to either decrease or increase temporal distance [ 2 ]. It can be argued that temporal distance is more influential than geographic distance [ 75 , 213 , 243 , 250 ] due to the challenges it poses for coordination [ 49 , 74 , 75 , 141 , 183 , 213 , 243 ].

One key disadvantage to high temporal distance is the reduced number of overlapping work hours between collaboration sites [ 11 , 33 , 132 ]. Although in an ideal situation, having team members dispersed across time zones can allow continual progress on a project as each team member works within their respective workdays [ 256 ], this is not always the case. In fact, temporal distance can lead to incompatible schedules that result in project delays and can only be overcome with careful planning [ 230 ]. Fewer overlapping work hours result in communication breakdowns, such as an increased need for rework and clarifications, and difficulties adjusting to new problems [ 73 , 74 ]. Additionally, reduced overlap in work hours results in coordination delays [ 49 ]. For example, a distant teammate may not be available when their expertise is needed [ 2 ]. In some cases, this unavailability causes the collaborator in need of help to make assumptions based on local culture and preferences in order to reach an immediate resolution of issues, which can cause rework when these assumptions are incorrect [ 250 ]. The lack of overlapping work hours also causes problems with synchronization; synchronous communication is often significantly limited in temporally dispersed collaborations, which can delay vital feedback [ 2 ] and increase response time [ 219 ]. In fact, scheduling global meetings can be virtually impossible for this reason [ 250 ]. Furthermore, as with geographic distance, temporal distance decreases the number of opportunities for informal communication [ 93 , 132 ] since the window in which all collaborators are available is small.
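To make the notion of overlapping work hours concrete, the following is a minimal sketch, not drawn from any of the cited studies. It assumes whole-hour UTC offsets, workdays that start and end on the hour, and no daylight-saving changes; the function name `overlap_hours` and its parameters are hypothetical and introduced only for illustration.

```python
def overlap_hours(offset_a, offset_b, day_a=(9, 17), day_b=(9, 17)):
    """Count the hours per day in which two sites are both at work.

    offset_a, offset_b: UTC offsets in whole hours (illustrative inputs).
    day_a, day_b: (start, end) of each site's local workday, end exclusive.
    """
    # Convert each local working hour to the corresponding hour of the UTC day.
    utc_a = {(h - offset_a) % 24 for h in range(*day_a)}
    utc_b = {(h - offset_b) % 24 for h in range(*day_b)}
    # The shared working window is the set of UTC hours covered by both sites.
    return len(utc_a & utc_b)


# Two sites six hours apart share only two working hours on a 9-to-5 schedule.
print(overlap_hours(-5, 1))
# Shifting one site's workday two hours earlier doubles the shared window.
print(overlap_hours(-5, 1, day_a=(7, 15)))
# Sites sixteen hours apart have no shared window at all.
print(overlap_hours(-8, 8))
```

Under these assumptions, the sketch shows both effects described in the text: a time zone difference shrinks the shared window, and a shift in one site's work pattern can enlarge it again.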

Communication can be disrupted by temporal distance in other ways. Bjørn and Ngwenyama found that in some virtual teams, communication would become limited to temporally co-located teammates because it was easier, bypassing teammates at other sites who should have been included [ 14 ]. This invisible communication would result in collaborators feeling left out of key decisions, which had toxic effects on the project. This effect is especially unfortunate given that temporal distance makes repairing the consequences of misunderstandings and reworking portions of the project more costly [ 73 ].

In addition to these issues, temporally dispersed collaborations are often plagued by delays, while co-located collaborations are considered more efficient [ 19 ]. Coordination delay increases with temporal distance—delay between collaborators located in the same city was smaller than that for collaborators in different cities, which was smaller than the delay found in collaborators located in different countries [ 49 ]. Delays in responses from collaborators can be especially frustrating and problematic [ 116 ] and can lengthen the amount of time required to resolve issues [ 19 ], sometimes dragging problems out across multiple days [ 120 , 132 ]. When work is organized such that a team member’s contribution is dependent upon a task completed by a team member in an earlier time zone, a failure to complete the earlier task can result in the loss of an entire workday [ 250 ]. Thus, timely completion of tasks in temporally dispersed collaborations is crucial [ 250 ]. Coordination delays are also shown to cause additional problems, particularly decreased performance in terms of meeting key requirements, staying within the budget, and completing work on time [ 49 ].

There are several social approaches to mitigating these issues. For example, collaborators can cultivate flexible work schedules [ 116 ], often by modifying a ‘typical’ workday by working either extremely early in the morning or very late at night so that there are overlapping work hours [ 250 ]. In contrast, Holmstrom et al. found that both Hewlett Packard (HP) and Fidelity employed a ‘follow-the-sun’ concept where work is handed off at the end of the day in one time-zone to workers beginning their day in another [ 116 ]. Follow-the-sun methodologies, if used effectively, can result in efficient, 24/7 productivity since work can be completed by one team member during another’s off hours [ 2 , 93 , 103 ]. However, this technique requires additional oversight time to facilitate the transfer of work from one team to the other, including time to discuss arising issues [ 250 ]. A competing technique is to limit the number of time zones in which sites are located [ 116 ]. Additionally, some coordination issues can be mitigated by careful division of work which takes into account being separated by several time zones [ 49 ].

Technology also plays a key role in mitigating the effects of temporal distance. Asynchronous communication tools (e.g., email, fax [ 19 , 57 ]) allow collaborators to coordinate shared efforts across time and distance with the additional benefits of leaving a written communication history [ 31 ] that supports accountability and traceability [ 2 ]. However, using asynchronous tools is known to increase the amount of time that a collaborator has to wait for a response [ 2 ] and make temporal boundaries more difficult to overcome than spatial boundaries in instances where sites do not have overlap in their workdays [ 49 ]. Furthermore, the process of writing ideas in emails increases the risk of misunderstandings between collaborators [ 57 ] compared with talking in person or via the telephone. Finally, developers starting their workday may become overwhelmed by the number of asynchronous messages left during the previous night [ 19 ]. Given these drawbacks to current technology and the unlikelihood that global collaboration is going to stop, it is worthwhile to ask how we can better support communication in temporally distant work.

There is also some question as to whether coordination costs are higher in teams that are temporally distributed. Both Ågerfalk et al. [ 2 ] and Battin et al. [ 11 ] assert that temporal distance greatly increases the cost and effort of coordination due to the added difficulties of dividing work across multiple time zones. Espinosa and Carmel [ 73 ], however, state that temporal distance reduces coordination costs when team members are not working concurrently because no direct coordination takes place when the two teammates are not working at the same time [ 2 ]. Clearly, this discrepancy needs to be resolved.

5.3 Perceived distance

As previously discussed in Sects. 5.1 and 5.2, distance is commonly conceptualized in terms of geography or time zones [ 4 ] (i.e., spatio-temporal distance). In contrast, perceived (a.k.a. subjective) distance is characterized by a person’s impression of how near or how far another person is [ 270 ]. These perceptions of proximity have both an affective and a cognitive component [ 189 ]. In this case, the cognitive component refers to a mental judgement of how near or distant a virtual teammate seems, while the affective component is concerned with the idea that a person’s sense of perceived proximity is neither purely conscious nor rational but is instead dependent on emotions [ 189 ]. Perceived distance is a distinctly different idea from spatio-temporal distance and one is not necessarily related to the other [ 215 ]. Instead, perceived distance reflects the “symbolic meaning” of proximity rather than physical proximity and is suggested to have a greater effect on relationship outcomes [ 189 ]. This symbolic meaning is defined by the team’s sense of shared identity and their use of communication media, which is primarily synchronous [ 189 ]. In fact, as people interact strongly and frequently with other team members, they can create a sense of closeness independent of physical proximity [ 214 ]. For example, free and open source software developers often perceive high levels of proximity due to their strong and intense communication and “hacker” identities [ 214 ]. The concept of perceived distance is why collaborators may be geographically distant and yet feel as though they are proximally near [ 162 ]. Perceived proximity can have a profound influence on team interaction [ 34 , 82 , 189 ]. For example, perceptions of proximity are known to influence decision making in virtual teams [ 198 ].

In 2014, Siebdrat et al. surveyed 678 product developers and team leaders in the software industry to investigate perceived distance and challenge the notion that geographic and temporal distance directly translates to perceived distance. They found that perceived distance was more strongly affected by a team’s national heterogeneity than by their spatio-temporal distance. Furthermore, Siebdrat et al. found that perceived distance had a significant effect on collaboration while spatio-temporal distance had no impact. As a result, they concluded that perceived distance is more indicative of collaboration challenges than spatio-temporal distance.

Findings from other work imply that distance can affect collaborators that are all in the same country at a single site [ 4 ], with low national heterogeneity and low spatio-temporal distance. Given the limited work available, it is uncertain whether this situation would still have high perceived distance. Therefore, there is a clear need for a better understanding of the relationship between perceived distance, spatio-temporal distance, and collaboration.

6 Contributing factors

In addition to the challenges associated with the three main types of distance discussed previously in this paper (i.e., geographic, temporal, and perceived distance), several contributing factors intersect with distance to cause additional challenges for virtual teams. To answer Question 1b (What other factors contribute to the factors and challenges that impact distance collaboration?), this paper will discuss these key factors, namely the nature of work, the need for explicit management, configuration, and diversity of workers in a collaboration.

6.1 Nature of work

Work can be categorized as either loosely or tightly coupled [ 191 ]. Tightly coupled work relies heavily on the skills of groups of workers with exceedingly interdependent components; this type of work necessitates frequent, rich communication and is usually non-routine. Loosely coupled work, in contrast, is typically either routine or has fewer dependencies than tightly coupled work. Interdependence between components, and thus tightly coupled work, is at the heart of collaboration [ 225 ]. In addition, complex tasks lead to higher trust and collaboration than simple tasks and task complexity is a critical factor that molds the interactions and relationships between team members [ 42 ]. Furthermore, interdependence is not merely an issue of sharing resources, but instead ‘being mutually dependent in work means that A relies positively on the quality and timeliness of B’s work and vice versa and should primarily be conceived of as a positive, though by no means necessarily harmonious interdependence’ [ 225 ]. Marlow et al. [ 163 ] found that as interdependence increases, communication becomes increasingly critical. They therefore suggest that communication becomes increasingly important to promoting high levels of performance. In 1988, Strauss described the additional work necessary for collaborators to negotiate, organize, and align their cooperative (yet individual) activities that occur as a result of interdependence. In doing so, Strauss discusses the concept of articulation work: by his definition, work concerned with assembling tasks and adjusting larger groups of tasks (e.g., sub-projects and lines of work) as a part of managing workflow. Articulation work is further described as the additional work needed to handle the interdependencies in work between multiple collaborators [ 72 ].

Virtual teams face greater challenges when managing these dependencies as a result of distance, both spatial and temporal, and culture [ 72 ]. Because interdependent (i.e., tightly coupled) work requires a high amount of interaction and negotiation, it is very difficult to do at a distance [ 191 ]. In contrast, loosely coupled work does not require as much communication as tightly coupled work, and so is easier to complete in geographically distant collaborations. Thus, tightly coupled work in virtual teams leads to less successful projects [ 193 ]. This observation is important since most projects have both varieties of work [ 191 ].

To combat the challenges associated with relying on tightly coupled work, many organizations take a social approach that arranges for co-located team members to work on tightly coupled aspects of the project while distance workers tackle loosely coupled parts [ 64 , 193 ], facilitated by deconstructing tasks into smaller pieces [ 93 ]. For tightly coupled work, some organizations choose to use extreme [ 161 ] or radical [ 246 ] collaboration setups where teams work in an enclosed environment in order to maximize communication and facilitate the flow of information. In contrast, for loosely coupled work, some organizations choose to minimize interaction [ 104 ]. Creating rules and norms for communication between team members early in the team’s life cycle can also increase effective communication and therefore improve performance during complex tasks [ 262 ]. This is essential for managing highly complex tasks and avoiding misunderstandings that can arise as a result of high task complexity combined with high virtuality [ 163 ].

However, the idea that tightly coupled work challenges collaboration is contested by Bjørn et al. [ 15 ]. Their case study is centered on a large research project investigating global software development with several geographically dispersed partners, and it provides evidence that tightly coupled work resulted in stronger collaborations. They observed that tightly coupled work required collaborators to frequently interact to do their work and, as a result, forced these collaborators to know more about each other, help each other, and cultivate strong engagement despite being at geographically distant sites. In contrast, loosely coupled work did not require the same level of engagement, resulting in collaborators feeling more detached from the project. Thus, Bjørn et al. proposed that tightly coupled work in geographically distributed teams involves processes that help collaboration [ 15 ].

Complex, tightly coupled tasks may be more difficult due to the reliance of virtual teams on virtual tools and their tendency to disband after a task has been completed [ 12 ]. Furthermore, the combination of high task complexity and high levels of virtuality lends itself to misunderstandings and mistakes [ 163 ]. As a result, effective communication is more critical for high performance in virtual teams for these tasks [ 163 ]. Despite this, Marlow et al. suggest that virtual teams can successfully complete these tasks if team members cultivate shared cognition. Given the characteristics of CMC technologies like video conferencing, which preserve many of the nuances present in face-to-face communication, we posit that shared cognition can be developed through the frequent, consistent use of this medium for communication.

Given the contrast between the work suggesting that tightly coupled work hinders distance collaboration [ 72 , 191 , 193 ] and work by Bjørn et al. [ 15 ] that suggests the opposite, there is clearly room for further research on the subject. This is especially true since Bjørn et al. focused only on global software development, and thus their findings might not generalize to other types of collaboration.

6.2 Explicit management and leadership

One of the largest challenges faced by virtual teams is the management of team effort [ 207 ]. Explicit management is needed for distributed, collaborative work, particularly by leaders trained in project management, in order to ensure the success of a project [ 150 , 193 ]. Collaborative projects are considered difficult to manage, especially as the number of workers associated with the project increases. Leadership is challenging in geographically dispersed teams because effective leadership is highly dependent on quality interactions that are more difficult across distance [ 157 ]. For example, Hoch and Kozlowski [ 111 ] found that hierarchical leadership is less effective in geographically dispersed teams than in co-located teams. It is also more challenging to ensure that the team’s work is given priority by the team members in geographically dispersed teams [ 131 ]. Furthermore, distributed projects face even more obstacles, such as increased coordination problems [ 188 ] including identifying and overcoming cultural differences, ensuring that all team members are heard [ 193 ], and regulating the inter-dependencies between resources, task components, and personnel [ 158 ].

Virtual teams face challenges related to leadership, such as nourishing an environment that fosters creativity [ 96 ] and emergent leadership [ 35 ]. Effective leadership benefits geographically dispersed virtual teams in a multitude of ways, helping them overcome many of the challenges caused by distance and facilitating satisfaction and motivation [ 88 , 169 ]. Virtual leadership can help collaboration within the team through providing training, guidance, resources, coaching, and facilitating relationship building [ 150 ]. Furthermore, leadership in virtual teams can facilitate knowledge sharing and the building of shared mental models [ 150 ]. Mental models are defined by Johnson-Laird [ 126 ] as internal representations of knowledge that match the situation they represent and consist of both abstract concepts and perceptible objects and images. These mental models may reflect detailed information about how the task is to be performed (i.e., task-related team mental models) or information about team members’ roles, tendencies, expertise, and patterns of interaction (i.e., teamwork-related mental models) [ 226 ]. These benefits, in turn, enhance virtual team effectiveness [ 150 ]. Task complexity can be a mitigating factor in the effectiveness of leadership. Leadership benefits the team more in an environment where tasks are highly interdependent and/or highly complex [ 150 ]. In addition to this, team members’ perceptions of their leaders’ use of communication tools and techniques can impact their perceptions of overall team performance [ 182 ]. In particular, positive perceptions of leadership communication result in positive perceptions of performance [ 182 ].

Leadership can have a strong influence on interpersonal team dynamics and trust as well. Prior work indicates that leaders play an important role in enhancing team performance by demonstrating empathy and understanding [ 131 ], monitoring and reducing tensions [ 260 ], and clearly articulating role and relationship expectations for team members [ 131 ]. Leaders in virtual teams have the capacity to prevent and resolve team relationship and task conflicts [ 150 ]. Furthermore, effective leadership can have a positive influence on affection, cognition, and motivation [ 150 ]. It is particularly important for leaders to bridge co-located and remote team members in order to promote team effectiveness [ 150 ]. Leaders can build trust within virtual teams by engaging in behaviors such as early face-to-face meetings, using rich communication channels, and facilitating synchronous information exchange [ 150 ]. High levels of consistent communication between leaders and team members are positively related to trust and engagement within virtual teams [ 80 ].

Individual leadership styles have their own impact on virtual team productivity. Prior work has focused on four key types of leadership: transformational, empowering, emergent, and shared. Transformational leadership is characterized by idealized influence, inspirational motivation, individual consideration, and intellectual stimulation [ 65 ]. This type of leadership enables followers to reach their potential and maximize performance [ 65 ]. However, transformational leadership, while effective in co-located or slightly dispersed teams, is less effective in improving the performance of highly geographically dispersed teams [ 69 ]. This may be due to the difficulties associated with facilitating communication across distance, which can cause the leader’s influence to have counterproductive effects [ 69 ]. In this case, the leader is likely to be “too far removed” to authentically want to make a difference [ 69 ]. In fact, a transformational leader’s influence on team communication decreases as the team becomes more and more dispersed [ 69 ].

Empowering leadership combines sharing power with individual team members while also providing a facilitative and supportive environment [ 236 ]. High empowering leadership has the effect of positively influencing team members’ situational judgement on their virtual collaboration behaviors and, ultimately, individual performance [ 105 ]. Moreover, empowering leadership has a positive effect on team performance at high levels of team geographic dispersion [ 105 ]. However, it is important to note that teams may miss out on the benefits provided by empowering leadership if they lack situational judgement [ 105 ].

Emergent leaders are people who exert significant influence over other members of a team, even though they may not be vested with formal authority [ 227 ]. Emergent leadership has a positive relationship with virtual team performance [ 110 ]. In particular, emergent leadership has positive relationships with team agreeableness, openness to experience at the individual team member level, and emotional stability [ 110 ]. In addition, emergent leadership has a positive relationship to individual conscientiousness, which is associated with being careful, responsible, and organized [ 110 ]. These all have positive influences on virtual team performance [ 110 ].

Shared leadership is a collective leadership process featuring multiple team members participating in team leadership functions [ 110 ]. This form of leadership can be described as a “mutual influence process” where members of a team lead each other towards the accomplishment of goals [ 109 ]. Shared leadership has a positive influence on the performance of virtual teams [ 110 , 150 ]. The structural support provided by shared leadership can supplement traditional leadership; in this situation, shared leaders assume the responsibility of building trust and relationships among team members [ 150 ]. Shared leadership provides many benefits to virtual teams, such as emotional stability, agreeableness, and mediating effects on the relationship between personality composition and team performance [ 110 ]. Shared and emergent leadership styles share some effects on virtual teams. Specifically, these types of leadership will affect the relationships between team conscientiousness, emotional stability, and team openness such that they will be stronger in teams with higher levels of virtuality than in teams with lower levels of virtuality [ 110 ]. However, shared leadership is facilitated by the socially-related exchange of information that creates commitment, trust, and cohesion among team members [ 110 ]. In co-located teams, this exchange of knowledge is enabled through social interactions like informal conversations, socializing outside of work, and meetings [ 110 ]. However, this type of informal and face-to-face communication is less common and feasible in virtual teams for reasons that will be discussed later. As a result, it is necessary for organizations to make efforts to facilitate shared leadership through training [ 110 ].

In addition to leadership style, the level of authority differentiation and skill level of the team members have an effect on team-level outcomes. Among teams with less skilled members, centralized authority (i.e., high authority differentiation) will have a positive influence on efficiency and performance in virtual teams [ 223 ]. In contrast, centralized authority has a negative influence on team innovation, learning, adaptation, and performance as well as member satisfaction and identification among teams with highly skilled members [ 223 ]. Decentralized authority (i.e., low authority differentiation), when combined with careful intervention of a formal or informal leader, can benefit coordination, learning, and adaptation in virtual teams with high skill differentiation and high temporal stability [ 150 ].

Other studies showed that virtual teams face challenges that could be mitigated with explicit management [ 83 , 188 , 243 , 261 ]. O’Leary and Mortensen investigated the effects of configuration (i.e., the distribution of team members across multiple sites) on team dynamics at the individual, subgroup, and team level [ 188 ]. They found that geographically defined subgroups led to significantly negative outcomes with regard to coordination problems (e.g., difficulties with coordination-related decisions about schedules, deadlines, and task assignments). The effects of configuration on distance work will be discussed further in this section. Similarly, problems of coordination (e.g., ‘reaching decisions’ and ‘division of labor’) were significantly increased by distance [ 261 ]. These results are complemented by findings that distance hampers the coordination of virtual teams via synchronous meetings [ 243 ]. Similarly, coordination in distance collaborations is hindered by difficulties in scheduling synchronous meetings due to limited windows of time where all parties are able to be present [ 83 ]. These findings complement those of Sect. 5.2 discussing the effect of temporal distance on collaboration.

Prior work has suggested various strategies for effective leadership and explicit management. For example, Hill and Bartol [ 105 ] suggest team training that focuses on strategies for overcoming challenges encountered in dispersed teamwork. Another, related, strategy is to focus more attention on setting norms for behavior that may aid appropriate situational judgment among team members when launching geographically dispersed teams [ 105 ]. A different approach is to consider personality dimensions such as agreeableness, conscientiousness, openness, emotional stability, and moderate extroversion, which all have positive influences on team performance, when selecting virtual team members [ 110 ].

However, some types of collaborations, particularly research collaborations consisting mainly of scientists, avoid the application of explicit management in their projects [ 193 ]. There is an opportunity for research to investigate how to support explicit management in distance collaborations that typically reject this type of administration.

6.3 Configuration

Following O’Leary and Mortensen [ 188 ], this paper subdivides configuration into three dimensions: site, imbalance, and isolation. Site dispersion is best characterized as the degree to which collaborators are at distinct geographic locations [ 187 ]. There is an inverse relationship between the number of sites and project success [ 50 , 51 , 133 ]. High site dispersion is associated with a greater number of faultlines (i.e., theoretical divisions within a group that create subgroups), which damage team collaboration [ 47 , 210 ]. Specifically, faultlines intensify polarization and subgrouping and cause collaborators in other locations to feel more distant [ 47 ]. Having a large number of sites, in particular, increases the odds that differences in demographics will create these divisions [ 47 ]. Additionally, greater numbers of sites predict fewer coordination activities and decreased outcomes [ 133 ]. Knowledge sharing decreases [ 40 , 83 ] and the cost of managing team goals increases [ 97 ] as the number of sites increases.

Imbalance refers to the proportion of collaborators dispersed across a set of sites and can have negative effects on collaboration, such as conflicts between large and small sites [ 8 ]. For example, imbalanced teams often have unequal amounts of contribution towards shared team tasks [ 188 ]. Furthermore, levels of conflict and trust differ between imbalanced and balanced teams [ 188 , 210 ]. In particular, larger subgroups in imbalanced teams feel stronger effects from faultlines on conflict and trust [ 210 ]. However, it is unclear what the ramifications are of these differences in trust and conflict [ 188 , 210 ], presenting an opportunity for research.

Imbalanced teams consisting of one isolated collaborator working with a co-located team function differently than highly dispersed, balanced teams [ 188 ]. For instance, communication in these imbalanced teams is different because the co-located team members communicate both face-to-face and electronically with each other, but, in the absence of travel, only communicate electronically with the isolated team member [ 231 ]. This disparity in communication methods impedes informal interaction and spontaneous communication [ 45 ]. This also has a unique effect on communication where the co-located team feels compelled to communicate with those isolated collaborators more frequently to make up for this difference [ 188 ]. Also, isolated members tend to contribute more frequently than their co-located counterparts because they feel as though they need to ‘speak up’ and be ‘heard’ over the co-located team [ 141 , 188 ].

Furthermore, isolation negatively affects a worker’s awareness of collaborators’ activities [ 187 ]. Isolated workers are also more likely to feel the effects of lacking the motivating sense that others are present [ 193 ]. These isolated workers identify less with the team and feel less like they are part of the group, leading to a feeling of distance from the rest of the team [ 45 ], which translates to feeling differently about group processes and outcomes [ 27 ]. Furthermore, isolation and feelings of alienation can have a negative effect on relationships among workers in geographically dispersed virtual teams, increasing the likelihood of feeling discomfort and reducing the likelihood of trusting team members that they do not know well [ 67 ].

Configurationally imbalanced teams (i.e., teams that have an uneven distribution of members across sites) tend to have lower identification with teammates and higher levels of conflict [ 188 ]. Conflict can be reduced by a shared sense of team identity [ 108 , 179 ], meaning that fostering this sense of identification with the team can mitigate both problems. Since team identification can be built via face-to-face communication [ 54 ], we posit that in the absence of face-to-face communication, imbalanced teams should make use of CMC technologies that facilitate nuanced expression, such as video conferencing tools.
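
To make the three configuration dimensions discussed in this section concrete, the following Python sketch summarizes a team roster by number of sites, a rough imbalance score, and the list of isolated members. It is only an illustration: the function name, the choice of the standard deviation of site shares as an imbalance score, and the example roster are assumptions made here, not the measures used in the cited studies.

```python
from collections import Counter
from statistics import pstdev


def describe_configuration(member_sites):
    """Summarize a team's configuration (hypothetical helper).

    member_sites maps each member's name to a site label.
    """
    if not member_sites:
        raise ValueError("member_sites must contain at least one member")

    # Number of members at each site.
    counts = Counter(member_sites.values())
    total = sum(counts.values())

    # Share of the team located at each site.
    shares = [n / total for n in counts.values()]

    return {
        # Site dimension: how many distinct locations the team spans.
        "sites": len(counts),
        # Imbalance dimension (assumed score): spread of the site shares;
        # 0.0 means members are evenly distributed across sites.
        "imbalance": pstdev(shares),
        # Isolation dimension: members who are alone at their site.
        "isolates": sorted(m for m, s in member_sites.items() if counts[s] == 1),
    }


# Example roster (invented): three co-located members and one isolate.
team = {"ana": "Lisbon", "ben": "Lisbon", "cy": "Lisbon", "dee": "Osaka"}
summary = describe_configuration(team)
print(summary)
```

For the example roster, the sketch reports two sites, a non-zero imbalance score, and a single isolated member, which corresponds to the "one isolated collaborator working with a co-located team" case described above.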

6.4 Group composition

The diversity of a team encompasses several factors that correlate with a set of challenges that greatly affect virtual teams. This section will focus on the issues of common ground, socio-cultural distance, and work culture. In the process, this section will discuss the remaining challenges identified by Olson and Olson [ 191 , 193 ] (continued from Sect. 5 ): common ground, the competitive/cooperative culture, and alignment of incentives and goals.

6.4.1 Common ground

Distance collaboration becomes easier if team members have common ground (i.e., have worked together before [ 54 ], have shared past experiences [ 54 ], vocabulary [ 191 ], or mental models [ 168 ], etc.) since it allows them to communicate via technology without requiring frequent clarification [ 193 ]. This challenge is also referred to as the ‘mutual knowledge problem’ [ 46 ]. The concept of mutual knowledge between teammates is based on the idea of ‘grounding’ in communication [ 43 ], which is done by both communicating and confirming understanding using words or body language [ 43 ]. Schmidtke and Cummings [ 226 ] found that as virtualness increases in a team, mental models become more complex, which negatively affects teamwork. They also found that as virtualness increases, the similarity and accuracy of mental models decrease [ 226 ]. Accuracy and similarity play vital roles in reducing the negative effect of complexity on teamwork behaviors [ 226 ]. Fortunately, specialized training can increase mental model accuracy [ 226 ].

As virtual teams rely more on computer-mediated communication, temporal stability (i.e., “the degree to which team members have a history of working together in the past and an expectation of working together in the future” [ 115 ]) more strongly influences teamwork [ 223 ]. High temporal stability is associated with positive team outcomes related to adaptation, learning, innovation, and performance, as well as satisfaction and identification with the team [ 223 ]. In addition to this, the extent to which virtual team members share common goals is critical in determining the success of the team [ 42 , 230 ]. For this reason, team leaders should ensure that team members commit to the task and common goals [ 10 ].

Research [ 168 ] has shown that it is more difficult for virtual teams that are geographically dispersed to develop a shared mental model. In particular, the process of grounding is made more difficult when there is a higher risk of misinterpretation, such as in the presence of multiple cultural practices and languages [ 191 ]. The significant amount of time required to establish common conceptual frameworks and personal relationships can pose a significant constraint on collaboration in virtual teams [ 54 ].

The consequences of a lack of common ground are primarily difficulty building trust [ 123 , 202 , 273 ] and difficulties associated with communication. Lack of common ground can limit the ability to communicate about and retain contextual information about teammates located at other sites, including those teammates’ situations and constraints, especially as the number of sites increases, in turn hindering their collaborative interactions and performance [ 46 , 230 ]. This contextual information includes, but is not limited to, local holidays and customs, site-specific processes and standards, competing responsibilities, and pressure from supervisors and teammates [ 46 ]. Common ground is also necessary to understand which messages or parts of messages are the most salient, which is particularly problematic because there may be restricted feedback [ 46 ]. The lack of common ground can also create problems interpreting the meaning of silence, which makes it difficult to know when a decision has been made [ 46 ]. Furthermore, lack of common ground can result in an uneven distribution of information and differences in speed of access to that information, which causes teammates at different sites to have different information and creates misunderstandings that are nontrivial to rectify [ 46 ].

Thus, the establishment of common ground is of utmost importance to virtual teams.

6.4.2 Socio-cultural distance

Socio-cultural distance has been defined as a measurement of a team member’s perception of their teammates’ values and usual practices [ 2 ]. This concept encompasses national culture and language, politics, and the motivations and work values of an individual [ 2 ]. It is known that geographically distributed collaborations are more socio-culturally diverse than co-located ones [ 179 ] because distance typically increases demographic heterogeneity (especially racial or ethnic heterogeneity) [ 107 ]. Members of a virtual team with different cultural backgrounds are likely to have different behaviors within the team, including how they interact with their teammates [ 123 ]. For this and other reasons, a virtual team’s cultural composition is the key predictor of the team’s performance [ 242 ].

Cultural differences go beyond national differences. There is a tendency for researchers studying cross-cultural organizational behavior to focus on national issues or use nation as a substitute for cultural values [ 245 ]. However, nation is not the only meaningful source of culture [ 84 , 149 ]. In addition to this, there may be multiple subcultures within a nation and the national culture may not be completely shared [ 135 ]. In fact, variation of cultural values within a country may be higher than variation between countries [ 114 ]. Therefore, a virtual team with high national diversity may not necessarily be culturally diverse [ 86 ].

Prior research has identified three levels of diversity: surface-level, deep-level, and functional-level [ 99 , 177 ]. Surface-level diversity is primarily observable differences such as race, age, and sex, while deep-level diversity is comprised of more subtle differences in personal characteristics such as attitudes, beliefs, and values, which are communicated through interaction between team members and information gathering [ 177 ]. Functional-level diversity, in contrast, refers to the degree to which team members vary in knowledge, information, expertise, and skills [ 10 ].

The individualism-collectivism dichotomy is a ‘major dimension of cultural variability’ [ 112 ] that contributes to high socio-cultural distance. Socio-cultural distance is associated with higher levels of conflict as well as lower levels of satisfaction and cohesion [ 238 ] and has a profound impact on team performance [ 70 ]. Hardin et al. [ 98 ] found that the individualistic-collectivist dichotomy results in some cultures being more open to working in geographically dispersed environments due to their levels of self-efficacy beliefs about virtual teamwork.

Collectivist cultures place the needs, beliefs, and goals of the team over those of an individual [ 94 , 112 ]. Virtual teams characterized by collectivist culture are less likely to use CMC technologies [ 143 ]. When they do choose to adopt CMC technologies, collectivist teams tend to choose synchronous methods that provide high relationship-related informational value [ 143 ]. Informational value in this context refers to the extent to which CMC technologies convey information that benefits team effectiveness [ 143 ]. Virtual teams that favor in-group members and accept perceptions of inequality are said to be characterized by “vertical collectivism” [ 254 ]. These teams are less likely to rely on CMC technologies, and are more likely to accept varying forms of informational value [ 143 ]. They are also more likely to employ asynchronous methods [ 143 ]. In contrast, teams that perceive equality amongst team members regardless of their role within the organization experience “horizontal collectivism” [ 253 ]. In this case, members of the team view themselves as being part of a collective and treat all team members as equal [ 253 ]. While these teams are also likely to limit reliance on CMC technologies, they tend to require higher informational values and prefer synchronous methods [ 143 ].

In contrast to collectivist cultures, individualist cultures place the needs, beliefs, and goals of the individual over those of a team [ 112 ]. Virtual teams with high levels of individualism are more likely to use CMC technologies, especially those that are high in task-related informational value, and tend to work asynchronously [ 143 ]. Furthermore, team members from individualist cultures tend to communicate more openly and precisely [ 112 , 113 ] and are more willing to respond to ‘ambiguous messages’ [ 94 ], which is considered to be an indicator of trust [ 203 ]. This observation indicates that team members from individualistic cultures may be more ready to trust other teammates when communicating via technology than team members from collectivist cultures [ 123 ]. Thus, the issues and recommendations regarding technology and trust are applicable.

Teams with members that prioritize their own intrinsic and extrinsic goals while also favoring status differences are said to be “vertically individualistic” [ 156 ]. These teams are characterized by competitive members that are motivated to “win” [ 156 ]. In addition, while these individuals tend to belong to more in-groups than collectivists, they are not very emotionally connected to these groups [ 181 ]. Virtual teams with high levels of vertical individualism are more likely to adopt CMC technologies, tolerate varying forms of informational value, and will use asynchronous methods when required by superiors than teams characterized by horizontal individualism or any type of collectivism [ 143 ]. Team members with horizontal individualistic orientation prioritize their own self-interest while also viewing their teammates as equals [ 143 ]. Virtual teams with high levels of horizontal individualism are more likely to adopt CMC technologies, tend to require higher informational value, and will use synchronous methods when required by superiors as opposed to teams characterized by vertical individualism or any type of collectivism [ 143 ].

Socio-cultural diversity can also be characterized by the temporal orientation of a team’s goals. Teams that focus upon the future and are willing to delay success or gratification for the purposes of future gain have a “long-term orientation” culture [ 143 ]. Cultures with long-term orientation tend to value perseverance, persistence, and focus on future-oriented goals [ 143 ]. In contrast, cultures characterized by “short-term orientation” are focused on the immediate needs of their teams with little consideration of the impact of their decisions on the future [ 143 ]. Virtual teams defined by long-term orientation are more likely to adopt asynchronous tools with high informational value and tend to be slower to rely on CMC technologies than short-term-oriented teams, which prefer synchronous tools with low informational value [ 143 ].

Cultures can also be characterized by the amount of contextualizing that is performed by an individual during communication [ 95 ]. For example, Japan, a high-context culture, relies more on the use of indirect communication via contextual cues (e.g., body language) to convey information [ 139 ]. Contextualization also affects choice of CMC technologies. High-context teams tend not to rely on CMC technologies and will prefer tools that have high informational value [ 143 ]. Low-context teams, in contrast, will rely on CMC technologies and will prefer those with low informational value [ 143 ].

Virtual teams are also affected by the levels of affectiveness/neutrality present in their culture. Affectiveness in this context refers to the amount of emotion that individuals usually express when they communicate [ 143 ]. For example, individuals from affective cultures such as Italy commonly exhibit their emotions publicly [ 143 ]. In addition, individuals from affective cultures often feel that more neutral cultures (e.g., Japan) are more intentionally deceitful because they tend to hold back on their emotions [ 240 ]. Affective teams will be less likely to rely on CMC technologies and will prefer ones with high informational value [ 143 ]. In contrast, teams with neutral cultures will highly rely on CMC technologies and will prefer tools with low informational value [ 143 ].

Other types of socio-cultural diversity influence the performance of virtual teams. For example, heterogeneity in the extent to which gender roles are traditional is positively related to team performance [ 70 ]. In a similar vein, heterogeneity in the extent to which there is discomfort with the unknown has a positive effect on issue-based conflict [ 70 ]. Uncertainty avoidance also affects tool use in virtual teams. Teams that have high amounts of uncertainty avoidance are more likely to use a synchronous CMC technology with high informational value. In contrast, teams with low uncertainty avoidance are unlikely to have a preference [ 143 ]. In addition to this, the degree of inequality that exists among members of virtual teams has an effect on the tools chosen for communication [ 143 ]. Teams with a high degree of inequality (i.e., high power distance) are more likely to use synchronous tools while teams with a low degree of inequality (i.e., low power distance) will prefer asynchronous tools [ 143 ]. Specificity also plays a role in virtual team performance. Someone from a specific culture (e.g., the United Kingdom) is more likely to view their coworkers as people with whom they have only a business relationship [ 87 ]. In contrast, more diffuse cultures (e.g., China) are more likely to view their teammates as friends and include them in their social lives [ 143 ]. This affects the choice of communication methods employed by the team, as teams characterized by high specificity are more likely to rely on CMC technologies than diffuse teams [ 143 ].

High socio-cultural distance is the cause of several types of collaboration problems. For example, high socio-cultural distance reduces communication and increases risk [ 2 ] caused by relationship breakdowns between distributed teams [ 250 ] and results in more process challenges and lower team performance [ 86 ]. Socio-cultural distance also tends to worsen the way leaders sense, interpret, and respond to problems [ 271 ]. Cultural heterogeneity also tends to result in divergent subgroup identification [ 68 ] that may subsequently have a negative effect on team interactions and performance [ 67 ]. Furthermore, in accordance with similarity/attraction theory, team members attribute positive traits to team members that they believe are similar to themselves and prefer to interact with them [ 216 , 255 ]. They thus associate negative traits with teammates that they believe are dissimilar from them and sometimes actively avoid interactions with those teammates [ 24 ]. As a result, the belief that others are different in terms of education, race, and attitudes (i.e., perceived diversity) is frequently associated with the negative consequences of team heterogeneity [ 100 ], such as unwillingness to cooperate and coordinate activities [ 56 , 117 , 148 ].

Furthermore, teams with high socio-cultural distance are more likely to have issues with integration and communication and have more conflict [ 269 ]. Both task and affective conflict are increased as a result of the differences in perspectives and approaches related to work, which further exacerbates differences in expectations, attitudes, and beliefs [ 195 , 204 ]. These differences in belief structures are particularly common in heterogeneous groups (i.e., groups with high socio-cultural distance) [ 268 ] which, in turn, increases conflict due to differences in interpretations and opinions of work processes [ 205 ]. Thus, there is a vicious cycle between differences in belief and intra-group conflict that is detrimental to collaboration.

The most commonly experienced problems correlating with socio-cultural distance are difficulties associated with diversity in language preferences, proficiency, and interpretation, which can create barriers for many projects [ 116 ], such as requiring increased effort [ 74 , 170 , 183 ]. This challenge is not just a matter of different languages; even native speakers of one language may have problems because of differences in dialects and local accents [ 33 ]. In many global collaborations, some (if not all) of the collaborators only speak English as a second language [ 132 , 219 ]. This situation causes problems when collaborators need to synchronously communicate via teleconferencing, because these team members can become overwhelmed with trying to keep up with the conversation [ 132 , 219 ]. Furthermore, this language-based disadvantage can cause non-native speakers of the dominant language to feel alienated and as though they have a disadvantage when speaking [ 219 ]. Prior work has also shown that virtual teams whose members have different first languages have more conflict and lower levels of satisfaction and cohesion [ 238 ].

Misunderstandings can occur even in cases where all collaborators are fluent in a language if there are other differences in culture: a seemingly harmless joke could have a massively detrimental impact on the success of a project if it is misunderstood as an insult [ 250 ]. Olson and Olson observed one such misunderstanding where team members in the United States ended a video conference without expressing a ‘proper farewell’ to a European teammate [ 191 ]. In this case, the curtness was due to pressure on the American team, who were unaware of the cultural expectations regarding farewells, to cut costs by conducting short video conferences [ 191 ]. The European team, however, was unaware of this pressure and perceived the lack of a proper farewell as an insult [ 191 ]. Also, conflicts can arise when teammates from a culture where saying ‘no’ is considered impolite (even when saying ‘yes’ is a problematic answer) interact with teammates who do not share this compunction [ 116 ]. Treinen and Miller-Frost encountered an instance where collaborators from one culture did not ask many questions of their teammates and instead affirmed that they had a clear understanding of requirements, but were in reality too polite to express concerns [ 250 ]. In this situation, the other collaborators were unaware of this cultural difference and did not realize that their questions should not have been formulated as ‘yes or no,’ but rather should have elicited responses that indicated understanding.

Other types of socio-cultural differences, such as those caused by religion, generation, and doing orientation, can also affect virtual team success. Religious differences, for example, can make it difficult for team members to understand each other’s norms and traditions, which has a negative influence on collaboration [ 221 ]. Generational differences can affect how a team member responds to collaborating via CMC technology because not everyone has the high levels of technical expertise that make them a “digital native” [ 129 ]. Finally, differences in the extent to which work is valued as a central life interest (i.e., “doing orientation”) are negatively linked to productivity [ 135 ]. However, differences in the extent to which team members have a sense of personal control over their work and life events are positively linked to team productivity, cooperation, and empowerment [ 135 ].

A review of literature reviews and meta-analyses suggests that the “main-effects” approach, where researchers focus on relationships between outcomes and diversity dimensions, ignoring moderating variables, cannot truly account for the effects of diversity [ 86 ]. The effect of socio-cultural diversity depends on other features of the team [ 272 ], such as how long members have interacted, the types of diversity investigated, and the types of outcomes under scrutiny [ 86 ]. High task complexity, high tenure, large team size, and low levels of geographic dispersion are found to moderate the effects of socio-cultural diversity on virtual teams [ 237 ]. Experience with CMC technology can also moderate socio-cultural diversity; high heterogeneity in technical experience heightens the negative effect that differences in nationality have on creativity [ 164 ]. Socio-economic variables (e.g., human development index (HDI)) have a significant impact on a country’s scientific production and collaboration patterns [ 118 , 152 , 199 ]. Kramer et al. found that socioeconomic similarities and economic agreements between countries have contributed to increased collaboration in the scientific field [ 143 ], which is likely to be virtual. The phase of the project life-cycle that a virtual team is in affects assessment of team performance in culturally diverse teams. Culturally heterogeneous virtual teams will outperform culturally homogeneous teams during the later part of the project life-cycle [ 264 ]. This is likely a result of teams becoming more homogeneous over time as shared team values, associated norms, and identity enable the team to overcome process challenges that occur when team members encounter cultural differences [ 86 , 264 ].

Computer-mediated communication technology (e.g., email, video-conferencing) can reduce the negative effects of socio-cultural diversity early on in the life of a diverse virtual team due to its reductive capabilities [ 32 ]. In fact, use of these tools may even be beneficial for diverse teams for this reason [ 32 ]. Many issues regarding language barriers are surmounted by the use of asynchronous technology that allows workers to reflect and carefully consider their position before answering a question posed by a collaborator who primarily speaks another language [ 2 , 116 ]. These benefits result in the heavier use of asynchronous tools, which introduces the disadvantages of asynchronous tools (e.g., increased time and effort to effectively communicate, absence of important social information and nonverbal cues) [ 2 ]. Furthermore, asynchronous communication is not feasible in every situation. And, as discussed above, language barriers can cause problems during synchronous communication. Thus, developing technology that better supports synchronous communication across a language barrier is a promising opportunity for research in supporting collaboration.

Contradictions exist in the literature with regard to the effect of socio-cultural diversity on team performance. Edwards and Shridhar [ 66 ], for example, found no relationship between a team’s socio-cultural diversity and the learning, satisfaction, or performance of its members. Other research has suggested that socio-cultural diversity is unrelated to conflict [ 108 ]. Finally, Weijen found that whether or not members of a virtual team spoke English (specifically) did not have an influence on international collaboration, likely due to the pervasiveness of English as the default language for many international journals and indexed databases [ 259 ].

It is also recommended that basic cultural awareness [ 250 ] and language training [ 120 ] be incorporated into the beginning of every project to mitigate these issues before they become major problems. One specific suggestion is to employ some of the guidelines from agile development methodology (i.e., Scrum), such as daily status meetings, to mitigate the effect of assumptions by providing an opportunity to address issues or questions during the hand-off and allocation of tasks [ 250 ]. Given the plethora of tools developed for supporting Scrum (e.g., [ 209 , 229 , 251 ]), it would be interesting to see how these tools could be adapted to smooth over collaboration issues arising from cultural differences.

6.4.3 Work culture

Socio-cultural distance can be highly influenced by the work culture dimension. For example, there may be conflicts from high socio-cultural distance between two teammates from the same country that come from very different company backgrounds [ 8 ], while the opposite may be true of teammates with different cultural and national backgrounds who share a common work culture [ 2 ]. The success of a virtual team can hinge on factors such as differences in understanding with regard to processes and knowledge, institutional bureaucracy, status differences between team members, unworkable expectations regarding shared goals and products, and conflicting or competing institutional priorities [ 54 ]. Power asymmetries in particular can create systemic barriers that need to be explicitly navigated (as opposed to expecting that perfect process design will resolve them) [ 54 ]. While differences in work culture have the potential for stimulating innovation, providing access to richer skill sets, and sharing best practices, they also have the potential to cause misunderstandings [ 2 ] and communication breakdowns [ 14 ] between teammates. This influence is partly due to the difficulties associated with communicating subtle aspects of the team culture over distance (e.g., ‘how we do things around here’ [ 8 ]). For example, differences in the competitive or cooperative culture of a workplace can pose challenges [ 191 ]. Workers are less likely to be motivated to share their skills or ‘cover for each other (p. 1)’ in organizations or cultures that promote individual competition rather than cooperation. In contrast, cooperative cultures facilitate sharing skills and effort. This issue is particularly difficult to overcome in virtual teams.

Other differences in organizational structure and leadership can have a profound impact on successful collaboration in distributed groups. The characteristics of authority and authoritative roles vary across cultures [ 8 , 145 ], which can cause conflicts and undermine morale [ 2 ]. For example, [ 33 ] observed that in a collaboration between teams located in Ireland and the United States, the Irish workers required that authority figures earn their respect while the American workers were more likely to unquestioningly give respect to superiors. Another study that focused on a collaboration between teams in the United States and Europe had contrasting results [ 8 ]. Instead of the unquestioned respect found by Casey and Richardson, [ 8 ] saw that American workers were more confrontational with their superiors and verbally expressed objections and questions while the European teams had a more formal, hierarchical management structure. These differences indicate that support for differing work cultures needs to focus on the needs and conventions of the individual organizations and refrain from imposing standards based solely on the country in which the organization resides. The degree to which an organization allows autonomous decision-making affects relationships and behaviors between teammates and can impact things like readiness to use technology in the collaboration or willingness to exchange knowledge [ 166 , 180 ].

Teams can also vary in their goals, norms, and incentives. A lack of alignment of incentives and goals as well as differences in expectations can pose very serious problems for a collaboration [ 191 ]. These misalignments are difficult to detect at a distance and require substantial negotiation to overcome [ 191 ], which is nontrivial using today’s technology. For example, collaborators may have different perceptions of time as a result of temporal discontinuities caused by differences in time zones, which may further reflect differences in the value systems of collaborators at each site [ 222 ]. Tensions may arise between workers at an American site that views time as a scarce commodity and perceives time as being something that can be spent, wasted, or lost, and collaborators at a Japanese site that view time as a cyclical, recurrent entity that is in unlimited supply [ 222 ]. Along with this finding come different expectations with regard to how many hours a day team members are expected to work, or differing definitions of what it means to work hard [ 14 ], which often vary between countries [ 22 ]. These differences in expectations are particularly problematic when one team expects that another work more hours than they previously had been working [ 14 ]. Building a sense of shared goals and expectations happens more slowly in distributed groups [ 8 ], a process that could likely be assisted by the development of new communication technology. In addition, competing incentives can undermine a team’s performance [ 54 ].

Competitive funding models may affect willingness to collaborate and disincentivize team members from sharing skills, knowledge, and unpublished data [ 247 ]. For example, for the Collaborative Adaptation Research Initiative in Africa and Asia project, the core partners each created an individual grant agreement with the International Development Research Centre [ 54 ]. However, while the expectation was that partners would collaborate with each other, the individual grant agreements disincentivized collaboration because the partners reported individually to the funding agency, rather than collectively [ 54 ]. Unfortunately, it is frequently unrealistic to expect these dynamics to resolve themselves in a short period of time and shift into an open and trusting relationship [ 54 ].

Expectations can be strongly influenced by the language used by different groups (e.g., ‘test procedure,’ ‘phase completion’) within a virtual team, sometimes creating animosity [ 8 ]. Language is further associated with methodology—for example, disparities in definitions of quality can be reflected in different assessment procedures [ 8 ]. Misunderstandings caused by differences in work practices and methodologies can affect coordination and cooperation [ 2 ], causing delays and conflicts [ 8 ]. In these situations, a common technical language must be developed to ensure understanding, which can be an extremely difficult task [ 15 , 122 , 172 , 252 ]. This need provides an opportunity for the development of technology to assist the creation and use of project-specific technical language.

In addition to differences in technical language, various groups within a virtual team may have different backgrounds that need to be reconciled, as different organizations within a group may have different expertise and experience that create incompatible views [ 55 ]. This issue is often unavoidable since one group may have specific knowledge necessary for the project to succeed [ 120 ]. Furthermore, differences in discipline and background have a stronger effect for distributed collaborations [ 211 ]. However, there are inconsistencies in the literature with regards to the effects of discipline on collaboration. Cummings and Kiesler, for example, found that field heterogeneity has a positive effect on distributed project success [ 50 ]. Specifically, they showed that projects including many disciplines had disclosed as many positive outcomes as did projects that involved fewer. However, in an earlier study, they found that projects incorporating many disciplines were less successful than projects that relied on fewer disciplines [ 133 ]. Thus, it is uncertain as to which conclusion is accurate, presenting open questions.

The way that administrative communication is managed [ 250 ] and tasks are allocated can play a big role [ 8 ] in the success of a virtual team. For example, a project manager could assign tasks differently and adjust the way that he or she communicates with management in accordance with the team’s culture and nationality [ 8 ]. Collaborations can further benefit from creating structured understandings about how to best work together by establishing expectations and definitions to undercut assumptions [ 8 ]. The challenge then becomes finding ways to develop technology that supports these structures while still facilitating innovation, ingenuity, and ‘rapid response to organizational threats or opportunities’ [ 64 ]. However, there are also inconsistencies between studies exploring the effects of work culture on collaboration. While Walsh and Maloney [ 261 ] stated that remote collaborations did not experience more work culture problems than co-located teams, McDonough et al. [ 170 ] found that differences in work culture and practices resulted in management problems in virtual teams. This disparity presents another open question.

7 Summary of findings and open questions

In this literature review, the major factors and challenges that impact collaboration in virtual teams were identified. Section  5 discussed distance factors (geographical, temporal, and perceived distance) and their associated challenges, including reduced motivation and awareness and difficulty establishing trust. In addition, barriers to informal and face-to-face communication, particularly the team’s technical competence and access to the appropriate technical infrastructure as well as prevalence of intra-team conflict were reviewed. Additional factors that particularly affect distance collaborations were outlined in Sect.  6 , namely the nature or coupling of the work, the need for explicit management, the configuration of dispersed sites and intra-team diversity along the dimensions of common ground, socio-cultural distance, and work culture. Several open questions and directions for future research were identified in the process of conducting the review; these are divided into questions of theory, questions of technology, and recommendations for future research. These findings are used to create design implications for the development of groupware targeted towards virtual teams later in Sect.  8 .

7.1 Questions of theory

7.1.1 Should future research pursue ‘awareness’?

There is currently disagreement within the community as to whether or not ‘awareness’ should be taken as a conceptual approach to investigating collaboration challenges. Critics of ‘awareness’ describe the term as ‘ambiguous and unsatisfactory’ [ 224 ] and point towards its tendency to be paired with an adjective (e.g., ‘passive awareness’ [ 62 ]) in an attempt to lend some specificity [ 224 ]. Despite this, the awareness approach is still a commonly explored method [ 7 , 134 ], which suggests that there is a research opportunity to address this controversy.

7.1.2 Are coordination costs higher in teams that are temporally distributed?

There is also a lack of consensus within the community as to whether coordination costs are higher in teams that are temporally distributed. For example, while Espinosa and Carmel [ 73 ] state that coordination costs are reduced when team members are not working concurrently because no direct coordination takes place when the two teammates are not working at the same time, Ågerfalk et al. [ 2 ] and Battin et al. [ 11 ] assert that temporal distance significantly increases the cost and effort of coordination due to the added difficulties of dividing work across multiple time zones.

7.1.3 How do the disparities in levels of conflict and trust between balanced and imbalanced teams affect collaboration?

As previously discussed, levels of conflict and trust differ between balanced and imbalanced teams [ 188 , 210 ]. Specifically, subgroups in balanced teams experience weaker effects from faultlines on conflict and trust than large subgroups in imbalanced teams [ 210 ]. However, the ramifications of these differences in trust and conflict are unknown, suggesting an opportunity for research.

7.1.4 Does tightly coupled work have a negative or a positive effect on collaboration?

Several studies [ 72 , 191 , 193 ] suggest that tightly coupled work hinders distance collaboration. However, [ 15 ] found that tightly coupled work required collaborators to frequently interact to do their work and, as a result, forced these collaborators to know more about each other, help each other, and cultivate strong engagement despite being at geographically distant sites, which actually helps distance collaboration. Given the contrast between these conclusions, there is an opportunity for further research to investigate the effects of tightly coupled work, particularly in domains other than global software development.

7.1.5 What effect does geographic dispersion have on task and affective conflict?

Contradictions exist in the current literature as to the effect of geographic distance on affective and task-based conflict. Specifically, [ 179 ] found no significant difference in affective and task-based conflict between co-located and distributed teams. This, however, is in direct conflict with their later work [ 108 ]. These contradictions are particularly interesting given that the participants in both studies did research and product development, and are therefore directly comparable. It is therefore uncertain as to which conclusion is accurate.

7.1.6 Does background heterogeneity have a positive or a negative effect on collaboration?

This question is also currently unresolved, given the contradictions in literature. In 2002, Kiesler and Cummings found that projects incorporating many disciplines were less successful than projects that relied on fewer disciplines [ 133 ]. However, later they found that field heterogeneity has a positive effect on distributed project success [ 50 ].

7.1.7 Do virtual teams encounter more work-culture related problems than co-located teams?

This is yet another example of the community’s lack of consensus on issues surrounding collaboration. For example, while McDonough et al. [ 170 ] found that differences in work culture and practices resulted in management problems in virtual teams, Walsh and Maloney [ 261 ] stated that remote collaborations did not experience more work culture problems than co-located teams.

7.2 Questions of technology

7.2.1 How can we better support communication in temporally distant work?

Due to the differences in work schedule caused by differences in time zones, particularly when sites do not have overlapping workdays, distance workers rely on asynchronous technology (e.g., email, fax) to communicate with their collaborators. However, this method has several drawbacks. Asynchronous tools tend to increase the amount of time that a collaborator has to wait for a response [ 2 ] and can leave the recipient feeling overwhelmed by the number of asynchronous messages left during the previous night [ 19 ]. Moreover, compared with talking in person or via the telephone, the process of writing ideas in emails increases the risk of misunderstandings between collaborators [ 57 ].

7.2.2 How can we better support informal communication?

There is an additional challenge associated with communication technology in that there is insufficient support for determining a collaborator’s availability for spur-of-the-moment, informal communication [ 101 ]. This drawback, in particular, hampers informal communication that would otherwise happen during chance encounters in a co-located environment.

7.2.3 How can we design technology to assist in the development of trust?

Research shows that body language, subtle voice inflections, facial expressions, etc., which are notably more difficult to convey via communication technology, are essential to the development of trust [ 20 , 193 ]. Furthermore, communication technology is frequently used in an irregular, unpredictable, and inequitable manner, which hampers trust [ 123 ]. As a result, it is clear that current technology needs to be updated to better assist the development of trust in distance collaborations.

7.2.4 How do we support explicit management in teams that reject formal administration?

Explicit management is necessary for successful distributed, collaborative work [ 193 ]. However, some particular types of collaboration, such as research collaborations consisting mainly of scientists, avoid the application of explicit management in their projects [ 193 ].

7.2.5 How can we support synchronous communication across language barriers?

Language barriers are of significant concern in collaborations where collaborators have different socio-cultural backgrounds (i.e., speak different languages) [ 116 ] or different work backgrounds (i.e., use different jargon) [ 8 ]. In these cases, asynchronous communication allows collaborators to reflect before responding to each other, giving them a chance to look up unfamiliar terminology or become familiar with new ideas. However, asynchronous communication has several drawbacks, as mentioned earlier, and is not feasible in every situation.

7.2.6 How do we develop technology that supports structures for negotiating terminologies and methodologies while still facilitating flexibility?

Along with the issue of surmounting technical language barriers in synchronous communication comes the need to create and use a common technical language to ensure understanding in meaning and methodology. The development of a project-specific technical language is not an easy task [ 17 , 55 , 172 , 252 ], but is important enough to collaboration to warrant assistance from technology. It is also important to ensure that this technology is flexible enough to withstand changes that may be made to the project.

7.2.7 How can we leverage existing tools developed for supporting Scrum to mitigate problems caused by cultural differences?

It has been suggested that distance collaborations employ guidelines from agile development methodology, such as daily status meetings, to mitigate the effect of incorrect assumptions caused by socio-cultural or work culture differences. The existence of a vast number of tools developed specifically to assist Scrum (e.g., [ 209 , 229 , 251 ]) presents an opportunity to investigate how these technologies can be adapted to mitigate collaboration issues arising from cultural differences.

7.2.8 How can we design communication technology to support building a sense of shared goals and expectations?

Variances between teams with regard to goals, norms, incentives, and expectations can pose very serious problems for a collaboration [ 191 ]. Overcoming these differences by building a sense of universal goals and standards is a slow, but vital, process for distributed groups [ 53 ]. Furthermore, these types of misalignments are hard to recognize in distance collaborations and require substantial negotiation to overcome [ 191 ], which is nontrivial given the limitations of today’s technology.

7.3 Recommendations for future research

Siebdrat et al. found that perceived distance was more strongly affected by a team’s national heterogeneity than by their spatio-temporal distance, and subsequently asserted that perceived distance is more indicative of collaboration challenges than spatio-temporal distance [ 231 ]. However, other work has demonstrated that distance can affect collaborators that are all in the same country at a single site [ 4 ], with low national heterogeneity and low spatio-temporal distance. Despite this, it is unclear whether perceived distance was high or low in this case due to the context of the study. Given the apparent influence of distance on collaboration, whether it is perceived, temporal, or spatial, it is therefore important to gain a better understanding of the relationship between these types of distance and their effects on collaboration.

8 Implications for design

This section uses the findings of this LR to address the final question, Research Question 2: How can we design technology for supporting virtual teams? To do so, the following four design implications for the development of groupware that supports collaboration in virtual teams are outlined.

8.1 Assist creation of common ground and work standards

Virtual teams consisting of workers with different expertise and organizational backgrounds require conversations about project-specific technical language, methodologies, and best practices. Technology should expedite and document these conversations and decisions to both create and facilitate the everyday use of technical language. Furthermore, since systems often incorrectly assume a shared knowledge of information [ 1 ], systems should, as recommended by [ 192 ], document in a manner that allows users to search for abstract representations of information. Moreover, since methodologies, best practices, and technical language tend to evolve over time, this technology also needs to support the resulting negotiation and discussion processes, as opposed to only facilitating the initial decision-making process.

8.2 Facilitate communication

Both rich discourse (i.e., containing social information and nonverbal cues as well as words, typically provided by face-to-face communication) and spontaneous, informal communication have been identified as key to preventing conflict and improving trust in virtual teams. Thus, it is imperative that technology is designed to provide the benefits of face-to-face conversations (e.g., video conferencing), such as ease in immediately detecting confusion. This is important not only for synchronous communication but also for asynchronous conversations, since those are the most likely to have misunderstandings that could be mitigated with additional non-verbal information. Mechanisms for supporting informal communication (e.g., chance encounters) are similarly necessary. In addition, given the difficulties experienced by virtual teams where workers are required to speak in a language that is not native to them, it is important to consider means for supporting synchronous communication across language barriers.

8.3 Provide mechanisms for work transparency

One of the key challenges faced by virtual teams is feeling a sense of connectedness to the rest of the team. This is both due to the motivational effects of not feeling isolated and the increased effort required to feel heard and acknowledged by the rest of the team located at another site. Thus, technology should be designed to provide transparency that allows workers to feel aware of their teammates. Furthermore, this technology should highlight and encourage the contributions of an individual and boost visibility within the team.

However, technology that promotes transparency, particularly technology that creates the sense of a shared workspace through open video connections, should be wary of infringing on the privacy of the team since the more information a person sends, the greater the impact on one’s privacy [ 119 ]. Furthermore, the more information a person receives, the greater the chance of disturbing work [ 119 ]. Thus, it is important to reach a good balance between providing awareness and preserving privacy and limiting distractions.

8.4 Design lightweight, familiar technology

Technical infrastructure varies across organizations—teams may not have the resources to support data-heavy communication tools, limiting their access to sophisticated collaboration technology (e.g., multiplane video conferencing). Furthermore, infrastructure may even vary within a virtual team, limiting tool use for the entire group since it is important that communication capabilities be evenly distributed [ 193 ]. Thus, care should be taken to engineer technology that is as lightweight as possible, maximizing the number of potential users. Virtual teams also face challenges related to the technical competence of their team members. It is therefore recommended that designers create technology with enough similarities to the technology currently employed by the team to facilitate adoption. New technology also needs to be compatible with existing tools, to promote adoption [ 194 ].

9 Conclusion

This literature review provided an overview of the collaboration challenges experienced by virtual teams as well as current mitigation strategies. This review utilized a well-planned search strategy to identify a total of 255 relevant studies, which chiefly concentrated on computer supported cooperative work (CSCW). Using the selected studies, we described challenges as belonging to five categories: geographical distance, temporal distance, perceived distance, the configuration of dispersed teams, and diversity of workers. Findings also revealed opportunities for research and open questions. Finally, opportunities and implications for designing groupware that better supports collaborative tasks in virtual teams were discussed through the description of four design implications: assist the creation of common ground and work standards; facilitate communication; provide mechanisms for work transparency; and design lightweight, familiar technology.

Ackerman MS (2000) The intellectual challenge of CSCW: the gap between social requirements and technical feasibility. Hum Comput Interact 15(2–3):179–203

Ågerfalk PJ, Fitzgerald B, Holmstrom Olsson H, Lings B, Lundell B, Ó Conchúir E (2005) A framework for considering opportunities and threats in distributed software development. In: Proceedings of DiSD’05. Austrian Computer Society, pp 47–61

Alder GS, Noel TW, Ambrose ML (2006) Clarifying the effects of internet monitoring on job attitudes: the mediating role of employee trust. Inf Manag 43(7):894–903

Allen TJ (1984) Managing the flow of technology: technology transfer and the dissemination of technological information within the R&D organization. MIT Press Books 1, London

Alsharo M, Gregg D, Ramirez R (2017) Virtual team effectiveness: the role of knowledge sharing and trust. Inf Manag 54(4):479–490

Apple Inc (2017) Use FaceTime with your iPhone, iPad, or iPod touch. https://support.apple.com/en-us/HT204380

Ardissono L, Bosio G (2012) Context-dependent awareness support in open collaboration environments. UMUAI 22(3):223–254

Armstrong DJ, Cole P (1995) Managing distances and differences in geographically distributed work groups. In: Jackson SE, Ruderman MN (eds) Diversity in work teams: research paradigms for a changing workplace. American Psychological Association, pp 187–215. https://doi.org/10.1037/10189-007

Barczak G, Lassk F, Mulki J (2010) Antecedents of team creativity: an examination of team emotional intelligence, team trust and collaborative culture. Creat Innov Manag 19(4):332–345

Batarseh FS, Usher JM, Daspit JJ (2017) Collaboration capability in virtual teams: examining the influence on diversity and innovation. Int J Innov Manag 21(04):1750034

Battin RD, Crocker R, Kreidler J, Subramanian K (2001) Leveraging resources in global software development. IEEE Softw 18(2):70–77

Bell BS, Kozlowski W (2002) Goal orientation and ability: interactive effects on self-efficacy, performance, and knowledge. J Appl Psychol 87(3):497

Berry GR (2011) Enhancing effectiveness on virtual teams: understanding why traditional team skills are insufficient. J Bus Commun (1973) 48(2):186–206

Bjørn P, Ngwenyama O (2009) Virtual team collaboration: building shared meaning, resolving breakdowns and creating translucence. Inf Syst J 19(3):227–253

Bjørn P, Esbensen M, Jensen RE, Matthiesen S (2014) Does distance still matter? Revisiting the CSCW fundamentals on distributed collaboration. TOCHI 21(5):27

Blaskovich JL (2008) Exploring the effect of distance: an experimental investigation of virtual collaboration, social loafing, and group decisions. J Inf Syst 22(1):27–46

Bly SA, Harrison SR, Irwin S (1993) Media spaces: bringing people together in a video, audio, and computing environment. Commun ACM 36(1):28–46

Bodemer D, Dehler J (2011) Group awareness in CSCL environments. Comput Hum Behav 27(3):1043–1045

Boland D, Fitzgerald B (2004) Transitioning from a co-located to a globally-distributed software development team: a case study at Analog Devices Inc. In: Proceedings of the international workshop on global software development at ICSE’04. IET, pp 4–7

Bos N, Olson J, Gergle D, Olson G, Wright Z (2002) Effects of four computer-mediated communications channels on trust development. In: Proceedings of CHI’02. ACM, New York, pp 135–140

Bradner E, Mark G (2002) Why distance matters: effects on cooperation, persuasion and deception. In: Proceedings of CSCW’02. ACM, New York, pp 226–235

Brannen MY, Salk JE (2000) Partnering across borders: negotiating organizational culture in a German–Japanese joint venture. Hum Relat 53(4):451–487

Breuer C, Hüffmeier J, Hertel G (2016) Does trust matter more in virtual teams? A meta-analysis of trust and team effectiveness considering virtuality and documentation as moderators. J Appl Psychol 101(8):1151

Brewer MB (1979) In-group bias in the minimal intergroup situation: a cognitive-motivational analysis. Psychol Bull 86(2):307

Buder J (2011) Group awareness tools for learning: current and future directions. Comput Hum Behav 27(3):1114–1117

Budgen D, Burn AJ, Brereton OP, Kitchenham BA, Pretorius R (2011) Empirical evidence about the UML: a systematic literature review. Softw Pract Exp 41(4):363–392

Burke K, Aytes K, Chidambaram L, Johnson JJ (1999) A study of partially distributed work groups: the impact of media, location, and time on perceptions and performance. Small Group Res 30(4):453–490

Buvik MP, Tvedt SD (2017) The influence of project commitment and team commitment on the relationship between trust and knowledge sharing in project teams. Proj Manag J 48(2):5–21

Calefato F, Lanubile F (2017) Establishing personal trust-based connections in distributed teams. Internet Technol Lett 1:e6

Calefato F, Lanubile F, Novielli N (2017) A preliminary analysis on the effects of propensity to trust in distributed software development. In: Proceedings of ICGSE’17. IEEE, New York, pp 56–60

Carmel E, Agarwal R (2001) Tactical approaches for alleviating distance in global software development. IEEE Softw 18(2):22–29

Carte T, Chidambaram L (2004) A capabilities-based theory of technology deployment in diverse teams: leapfrogging the pitfalls of diversity and leveraging its potential with collaborative technology. J Assoc Inf Syst 5(11):4

Casey V, Richardson I (2004) Practical experience of virtual team software development. https://ulir.ul.ie/bitstream/handle/10344/2149/2004_Casey.pdf?sequence=2

Chae SW (2016) Perceived proximity and trust network on creative performance in virtual collaboration environment. Procedia Comput Sci 91:807–812

Charlier SD, Stewart GL, Greco LM, Reeves CJ (2016) Emergent leadership in virtual teams: a multilevel investigation of individual communication and team dispersion antecedents. Leadersh Q 27(5):745–764

Cheng X, Fu S, Druckenmiller D (2016) Trust development in globally distributed collaboration: a case of US and Chinese mixed teams. J Manag Inf Syst 33(4):978–1007

Cheng X, Fu S, Sun J, Han Y, Shen J, Zarifis A (2016) Investigating individual trust in semi-virtual collaboration of multicultural and unicultural teams. Comput Hum Behav 62:267–276

Cheng X, Yin G, Azadegan A, Kolfschoten G (2016) Trust evolvement in hybrid team collaboration: a longitudinal case study. Group Decis Negot 25(2):267–288

Chidambaram L, Tung LL (2005) Is out of sight, out of mind? An empirical study of social loafing in technology-supported groups. Inf Syst Res 16(2):149–168

Chinowsky PS, Taylor JE (2011) Distance matters: a social network analysis of geographic dispersion in engineering organizations. In: Proceedings of EPOC’11

Cho J (2006) The mechanism of trust and distrust formation and their relational outcomes. J Retail 82(1):25–35

Choi OK, Cho E (2019) The mechanism of trust affecting collaboration in virtual teams and the moderating roles of the culture of autonomy and task complexity. Comput Hum Behav 91:305–315

Clark HH, Brennan SE (1991) Grounding in communication. In: Perspectives on socially shared cognition. American Psychological Association, Washington, DC, pp 127–149

Colquitt JA, Scott BA, LePine JA (2007) Trust, trustworthiness, and trust propensity: a meta-analytic test of their unique relationships with risk taking and job performance. J Appl Psychol 92(4):909

Cooper CD, Kurland NB (2002) Telecommuting, professional isolation, and employee development in public and private organizations. J Organ Behav 23(4):511–532

Cramton CD (2001) The mutual knowledge problem and its consequences for dispersed collaboration. Organ Sci 12(3):346–371

Cramton CD, Hinds PJ (2004) Subgroup dynamics in internationally distributed teams: ethnocentrism or cross-national learning? Res Organ Behav 26:231–263

Crisp CB, Jarvenpaa SL (2013) Swift trust in global virtual teams: trusting beliefs and normative actions. J Pers Psychol 12(1):45

Cummings JN (2011) Geography is alive and well in virtual teams. Commun ACM 54(8):24–26

Cummings JN, Kiesler S (2005) Collaborative research across disciplinary and organizational boundaries. Soc Stud Sci 35(5):703–722

Cummings JN, Kiesler S (2007) Coordination costs and project outcomes in multi-university collaborations. Res Policy 36(10):1620–1634

Cummings JN, Kiesler S (2008) Who collaborates successfully? Prior experience reduces collaboration barriers in distributed interdisciplinary research. In: Proceedings of CSCW’08. ACM, New York, pp 437–446

Cummings L, Bromiley P (1996) The organizational trust inventory (OTI): development and validation. In: Kramer RM, Tyler TR (eds) Trust in organizations: frontiers of theory and research. Sage, Thousand Oaks, pp 302–330

Cundill G, Harvey B, Tebboth M, Cochrane L, Currie-Alder B, Vincent K, Lawn J, Nicholls RJ, Scodanibbio L, Prakash A et al (2019) Large-scale transdisciplinary collaboration for adaptation research: challenges and insights. Glob Chall 3(4):1700132

Curtis B, Krasner H, Iscoe N (1988) A field study of the software design process for large systems. Commun ACM 31(11):1268–1287

Dahlin KB, Weingart LR, Hinds PJ (2005) Team diversity and information use. Acad Manag J 48(6):1107–1123

Damian DE, Zowghi D (2002) The impact of stakeholders’ geographical distribution on managing requirements in a multi-site organization. In: Proceedings of RE’02. IEEE, New York, pp 319–328

Darics E (2014) The blurring boundaries between synchronicity and asynchronicity: new communicative situations in work-related instant messaging. Int J Bus Commun 51(4):337–358

De Jong BA, Dirks KT, Gillespie N (2016) Trust and team performance: a meta-analysis of main effects, moderators, and covariates. J Appl Psychol 101(8):1134

Dennis AR, Fuller RM, Valacich JS (2008) Media, tasks, and communication processes: a theory of media synchronicity. MIS Q 32(3):575–600

Desanctis G, Monge P (1999) Introduction to the special issue: communication processes for virtual organizations. Organ Sci 10(6):693–703

Dourish P, Bellotti V (1992) Awareness and coordination in shared workspaces. In: Proceedings of CSCW’92. ACM, New York, pp 107–114

Duarte DL, Snyder NT (2006) Mastering virtual teams: strategies, tools, and techniques that succeed. Wiley, Berlin

Dubé L, Robey D (2009) Surviving the paradoxes of virtual teamwork. Inf Syst J 19(1):3–30

Dvir T, Eden D, Avolio BJ, Shamir B (2002) Impact of transformational leadership on follower development and performance: a field experiment. Acad Manag J 45(4):735–744

Edwards HK, Sridhar V (2005) Analysis of software requirements engineering exercises in a global virtual team setup. J Glob Inf Manag (JGIM) 13(2):21–41

Eisenberg J, Krishnan A (2018) Addressing virtual work challenges: learning from the field. Organ Manag J 15(2):78–94

Eisenberg J, Mattarelli E (2017) Building bridges in global virtual teams: the role of multicultural brokers in overcoming the negative effects of identity threats on knowledge sharing across subgroups. J Int Manag 23(4):399–411

Eisenberg J, Post C, DiTomaso N (2019) Team dispersion and performance: the role of team communication and transformational leadership. Small Group Res 50(3):348–380

Elron E (1997) Top management teams within multinational corporations: effects of cultural heterogeneity. Leadersh Q 8(4):393–412

Erickson T, Smith DN, Kellogg WA, Laff M, Richards JT, Bradner E (1999) Socially translucent systems: social proxies, persistent conversation, and the design of “babble”. In: Proceedings of CHI’99. ACM, New York, pp 72–79

Esbensen M, Bjørn P (2014) Routine and standardization in global software development. In: Proceedings of GROUP’14. ACM, New York, pp 12–23

Espinosa JA, Carmel E (2004) The effect of time separation on coordination costs in global software teams: a dyad model. In: Proceedings of HICSS’04. IEEE, New York, p 10

Espinosa JA, Pickering C (2006) The effect of time separation on coordination processes and outcomes: a case study. In: Proceedings of HICSS’06, vol 1. IEEE, New York, pp 25b–25b

Espinosa JA, Cummings JN, Pickering C (2011) Time separation, coordination, and performance in technical teams. IEEE Trans Eng Manag 59(1):91–103

Ferrell JZ, Herb KC (2012) Improving communication in virtual teams, pp 1–7. https://www.siop.org/Research-Publications/SIOP-White-Papers

Finholt T, Sproull L, Kiesler S (1990) Communication and performance in ad hoc task groups. In: Galegher J, Kraut RE (eds) Intellectual teamwork: social and technological foundations of cooperative work. Psychology Press, New York, pp 291–325

Finholt TA, Olson GM (1997) From laboratories to collaboratories: a new organizational form for scientific collaboration. Psychol Sci 8(1):28–36

Fjermestad J (2004) An analysis of communication mode in group support systems research. Decis Support Syst 37(2):239–263

Gajendran RS, Harrison DA, Delaney-Klinger K (2015) Are telecommuters remotely good citizens? Unpacking telecommuting’s effects on performance via i-deals and job resources. Pers Psychol 68(2):353–393

Gaver WW, Sellen A, Heath C, Luff P (1993) One is not enough: multiple views in a media space. In: Proceedings of INTERACT’93 and CHI’93. ACM, New York, pp 335–341

Gibbs JL, Kim H, Boyraz M (2017) Virtual teams. In: The international encyclopedia of organizational communication, pp 1–14. https://www.researchgate.net/profile/Jennifer_Gibbs/publication/314712225_Virtual_Teams/links/5a3d942a0f7e9ba8688e91f6/Virtual-Teams.pdf

Gibson CB, Gibbs JL (2006) Unpacking the concept of virtuality: the effects of geographic dispersion, electronic dependence, dynamic structure, and national diversity on team innovation. Adm Sci Q 51(3):451–495

Gibson CB, McDaniel DM (2010) Moving beyond conventional wisdom: advancements in cross-cultural theories of leadership, conflict, and teams. Perspect Psychol Sci 5(4):450–462

Gibson CB, Gibbs JL, Stanko TL, Tesluk P, Cohen SG (2011) Including the “i” in virtuality and modern job design: extending the job characteristics model to include the moderating effect of individual experiences of electronic dependence and copresence. Organ Sci 22(6):1481–1499

Gibson CB, Huang L, Kirkman BL, Shapiro DL (2014) Where global and virtual meet: the value of examining the intersection of these elements in twenty-first-century teams. Annu Rev Organ Psychol Organ Behav 1(1):217–244

Gilbert D, Tsao J (2000) Exploring Chinese cultural influences and hospitality marketing relationships. Int J Contemp Hosp Manag 12:45–54

Gilson LL, Maynard MT, Jones Young NC, Vartiainen M, Hakonen M (2015) Virtual teams research: 10 years, 10 themes, and 10 opportunities. J Manag 41(5):1313–1337

Glikson E, Woolley AW, Gupta P, Kim YJ (2019) Visualized automatic feedback in virtual teams. Front Psychol 10:814

Google Inc (2017) Google Hangouts. https://hangouts.google.com/

Greenhalgh T, Peacock R (2005) Effectiveness and efficiency of search methods in systematic reviews of complex evidence: audit of primary sources. BMJ 331(7524):1064–1065

Gressgård LJ (2011) Virtual team collaboration and innovation in organizations. Team Perform Manag Int J. https://doi.org/10.1108/dlo.2011.08125daa.007

Grinter RE (2003) Recomposition: coordinating a web of software dependencies. J CSCW 12(3):297–327

Gudykunst WB (1997) Cultural variability in communication: an introduction. Commun Res 24(4):327–348

Hall ET (1976) Beyond culture. Anchor, Garden City

Han SJ, Chae C, Macko P, Park W, Beyerlein M (2017) How virtual team leaders cope with creativity challenges. Eur J Train Dev. https://doi.org/10.1108/EJTD-10-2016-0073

Handley SM, Benton W (2013) The influence of task- and location-specific complexity on the control and coordination costs in global outsourcing relationships. J Oper Manag 31(3):109–128

Hardin AM, Fuller MA, Davison RM (2007) I know I can, but can we? Culture and efficacy beliefs in global virtual teams. Small Group Res 38(1):130–155

Harrison DA, Price KH, Gavin JH, Florey AT (2002) Time, teams, and task performance: changing effects of surface- and deep-level diversity on group functioning. Acad Manag J 45(5):1029–1045

Herbsleb JD, Grinter RE (1999) Splitting the organization and integrating the code: Conway’s law revisited. In: Proceedings of ICSE’99. IEEE, New York, pp 85–95

Herbsleb JD, Mockus A (2003) An empirical study of speed and communication in globally distributed software development. IEEE Trans Softw Eng 29(6):481–494

Herbsleb JD, Mockus A, Finholt TA, Grinter RE (2000) Distance, dependencies, and delay in a global collaboration. In: Proceedings of CSCW’00. ACM, New York, pp 319–328

Hertzum M, Pries-Heje J (2011) Is minimizing interaction a solution to cultural and maturity inequality in offshore outsourcing? In: Balancing sourcing and innovation in information systems development, pp 77–97

Hill NS, Bartol KM (2016) Empowering leadership and effective collaboration in geographically dispersed teams. Pers Psychol 69(1):159–198

Hinds P, Kiesler S (2002) Distributed work. MIT Press, Cambridge

Hinds PJ, Bailey DE (2003) Out of sight, out of sync: understanding conflict in distributed teams. Organ Sci 14(6):615–632

Hinds PJ, Mortensen M (2005) Understanding conflict in geographically distributed teams: the moderating effects of shared identity, shared context, and spontaneous communication. Organ Sci 16(3):290–307

Hoch JE (2013) Shared leadership and innovation: the role of vertical leadership and employee integrity. J Bus Psychol 28(2):159–174

Hoch JE, Dulebohn JH (2017) Team personality composition, emergent leadership and shared leadership in virtual teams: a theoretical framework. Hum Resour Manag Rev 27(4):678–693

Hoch JE, Kozlowski SW (2014) Leading virtual teams: hierarchical leadership, structural supports, and shared team leadership. J Appl Psychol 99(3):390

Hofstede G (1980) Culture’s consequences: international differences in work-related values. Sage, Thousand Oaks

Hofstede G (1991) Cultures and organizations: software of the mind. McGraw-Hill, New York

Hofstede G (2001) Culture’s consequences: comparing values, behaviors, institutions and organizations across nations. Sage, Thousand Oaks

Hollenbeck JR, Beersma B, Schouten ME (2012) Beyond team types and taxonomies: a dimensional scaling conceptualization for team description. Acad Manag Rev 37(1):82–106

Holmstrom H, Conchúir EÓ, Agerfalk J, Fitzgerald B (2006) Global software development challenges: a case study on temporal, geographical and socio-cultural distance. In: Proceedings of ICGSE’06. IEEE, New York, pp 3–11

Homan AC, Van Knippenberg D, Van Kleef GA, De Dreu CK (2007) Bridging faultlines by valuing diversity: diversity beliefs, information elaboration, and performance in diverse work groups. J Appl Psychol 92(5):1189

Huang D (2015) Temporal evolution of multi-author papers in basic sciences from 1960 to 2010. Scientometrics 105(3):2137–2147

Hudson SE, Smith I (1996) Techniques for addressing fundamental privacy and disruption trade-offs in awareness support systems. In: Proceedings of CSCW’96. ACM, New York, pp 248–257. https://doi.org/10.1145/240080.240295

Imsland V, Sahay S, Wartiainen Y (2003) Key issues in managing a global software outsourcing relationship between a Norwegian and Russian firm: some practical implications. In: Proceedings of IRIS26

Inc ZC (2020) Zoom for video, conferencing, and phones. https://zoom.us/

Jakobsen CH, McLaughlin WJ (2004) Communication in ecosystem management: a case study of cross-disciplinary integration in the assessment phase of the Interior Columbia Basin Ecosystem Management Project. Environ Manag 33(5):591–605

Jarvenpaa SL, Leidner DE (1998) Communication and trust in global virtual teams. J Comput Mediat Commun 3(4):791–815

Jarvenpaa SL, Shaw TR, Staples DS (2004) Toward contextualized theories of trust: the role of trust in global virtual teams. Inf Syst Res 15(3):250–267

Jehn KA (1997) A qualitative analysis of conflict types and dimensions in organizational groups. Adm Sci Q 42:530–557

Johnson SK, Bettenhausen K, Gibbons E (2009) Realities of working in virtual teams: affective and attitudinal outcomes of using computer-mediated communication. Small Group Res 40(6):623–649

Johnson-Laird PN (1989) Mental models. The MIT Press, London

Kanawattanachai P, Yoo Y (2002) Dynamic nature of trust in virtual teams. J Strateg Inf Syst 11(3–4):187–213

Kaplan AM, Haenlein M (2010) Users of the world, unite! the challenges and opportunities of social media. Bus Horiz 53(1):59–68

Kayworth T, Leidner D (2000) The global virtual manager: a prescription for success. Eur Manag J 18(2):183–194

Kayworth TR, Leidner DE (2002) Leadership effectiveness in global virtual teams. J Manag Inf Syst 18(3):7–40

Kiel L (2003) Experiences in distributed development: a case study. In: Proceedings of international workshop on global software development at ICSE’03

Kiesler S, Cummings JN (2002) What do we know about proximity and distance in work groups? A legacy of research. In: Distributed work, vol 1. MIT Press, Cambridge, pp 57–80

Kimmerle J, Cress U (2007) Group awareness and self-presentation in the information-exchange dilemma: an interactional approach. In: Proceedings of CSCL’07. International Society of the Learning Sciences, New York, pp 370–378

Kirkman BL, Shapiro DL (2005) The impact of cultural value diversity on multicultural team performance. Adv Int Manag 18:33–67

Kirkman BL, Rosen B, Tesluk PE, Gibson CB (2004) The impact of team empowerment on virtual team performance: the moderating role of face-to-face interaction. Acad Manag J 47(2):175–192

Kitchenham B, Brereton P (2013) A systematic review of systematic review process research in software engineering. Inf Softw Technol 55(12):2049–2075. https://doi.org/10.1016/j.infsof.2013.07.010

Kitchenham B, Charters S (2007) Guidelines for performing systematic literature reviews in software engineering version 2.3. Engineering 45(4ve):1051

Kittler MG, Rygl D, Mackinnon A (2011) Special review article: beyond culture or beyond control? Reviewing the use of Hall’s high-/low-context concept. Int J Cross Cult Manag 11(1):63–82

Klitmøller A, Lauring J (2013) When global virtual teams share knowledge: media richness, cultural difference and language commonality. J World Bus 48(3):398–406

Koehne B, Shih PC, Olson JS (2012) Remote and alone: coping with being the remote member on the team. In: Proceedings of CSCW’12. ACM, New York, pp 1257–1266

Kotlarsky J, Oshri I (2005) Social ties, knowledge sharing and successful collaboration in globally distributed system development projects. Eur J Inf Syst 14(1):37–48

Kramer WS, Shuffler ML, Feitosa J (2017) The world is not flat: examining the interactive multidimensionality of culture and virtuality in teams. Hum Resour Manag Rev 27(4):604–620

Kraut RE, Fussell SR, Brennan SE, Siegel J (2002) Understanding effects of proximity on collaboration: implications for technologies to support remote collaborative work. In: Hinds P, Kiesler S (eds) Distributed work. MIT Press, Cambridge, pp 137–162

Krishna S, Sahay S, Walsham G (2004) Managing cross-cultural issues in global software outsourcing. Commun ACM 47(4):62–66

Kroll J, Hashmi SI, Richardson I, Audy JL (2013) A systematic literature review of best practices and challenges in follow-the-sun software development. In: Proceedings of international workshop on global software development at ICSE’13. IEEE, New York, pp 18–23

Kuo FY, Yu CP (2009) An exploratory study of trust dynamics in work-oriented virtual teams. J Comput Mediat Commun 14(4):823–854

Lau DC, Murnighan JK (2005) Interactions within groups and subgroups: the effects of demographic faultlines. Acad Manag J 48(4):645–659

Leung K, Bhagat R, Buchan N, Erez M, Gibson C (2011) Beyond national culture and culture-centricism: an integrating perspective on the role of culture in international business. J Int Bus Stud 42:177–181

Liao C (2017) Leadership in virtual teams: a multilevel perspective. Hum Resour Manag Rev 27(4):648–659

Lipnack J, Stamps J (1997) Virtual teams: reaching across space, time, and organizations with technology. Wiley, New York

Livingston G, Waring B, Pacheco LF, Buchori D, Jiang Y, Gilbert L, Jha S (2016) Perspectives on the global disparity in ecological science. Bioscience 66(2):147–155

López G, Guerrero LA (2014) Notifications for collaborative documents editing. In: Proceedings of UCAmI’14. Springer, Berlin, pp 80–87

López G, Guerrero LA (2017) Awareness supporting technologies used in collaborative systems: a systematic literature review. In: Proceedings of CSCW’17. ACM, New York, pp 808–820

Lowry PB, Zhang D, Zhou L, Fu X (2010) Effects of culture, social presence, and group composition on trust in technology-supported decision-making groups. Inf Syst J 20(3):297–315

Lu LC, Chang HH, Yu ST (2011) The role of individualism and collectivism in consumer perceptions toward e-retailers’ ethics. In: 2011 international conference on information management, innovation management and industrial engineering, vol 2. IEEE, New York, pp 194–197

Malhotra A, Majchrzak A, Rosen B (2007) Leading virtual teams. Acad Manag Perspect 21(1):60–70

Malone TW, Crowston K (1994) The interdisciplinary study of coordination. ACM Comput Surv 26(1):87–119

Mannix EA, Griffith T, Neale MA (2002) The phenomenology of conflict in distributed work teams. In: Hinds P, Kiesler S (eds) Distributed work. The MIT Press, Cambridge, pp 213–233

Mantei MM, Baecker RM, Sellen AJ, Buxton WA, Milligan T, Wellman B (1991) Experiences in the use of a media space. In: Proceedings of CHI’91. ACM, New York, pp 203–208

Mark G (2002) Extreme collaboration. Commun ACM 45(6):89–93

Marlow J, Dabbish L (2012) Designing interventions to reduce psychological distance in globally distributed teams. In: Proceedings of CSCW’12 companion. ACM, New York, pp 163–166

Marlow SL, Lacerenza CN, Salas E (2017) Communication in virtual teams: a conceptual framework and research agenda. Hum Resour Manag Rev 27(4):575–589

Martins LL, Shalley CE (2011) Creativity in virtual work: effects of demographic differences. Small Group Res 42(5):536–561

Maruping LM, Agarwal R (2004) Managing team interpersonal processes through technology: a task-technology fit perspective. J Appl Psychol 89(6):975

Maruping LM, Magni M (2015) Motivating employees to explore collaboration technology in team contexts. MIS Q 39(1):1–16

Mattessich PW, Monsey BR (1992) Collaboration: what makes it work. A review of research literature on factors influencing successful collaboration. ERIC, St. Paul

Maynard MT, Gilson LL (2014) The role of shared mental model development in understanding virtual team effectiveness. Group Organ Manag 39(1):3–32

Maynard MT, Mathieu JE, Rapp TL, Gilson LL (2012) Something(s) old and something(s) new: modeling drivers of global virtual team effectiveness. J Organ Behav 33(3):342–365

McDonough EF, Kahn KB, Barczak G (2001) An investigation of the use of global, virtual, and colocated new product development teams. J Prod Innov Manag 18(2):110–120

McGuffin LJ, Olson GM (1992) ShrEdit: a shared electronic work space. University of Michigan, Cognitive Science and Machine Intelligence Laboratory, Ann Arbor

McIntyre NE, Knowles-Yánez K, Hope D (2000) Urban ecology as an interdisciplinary field: differences in the use of “urban” between the social and natural sciences. Urban Ecosys 4(1):5–24

McNamara K, Dennis AR, Carte TA (2008) It’s the thought that counts: the mediating effects of information processing in virtual team decision making. Inf Syst Manag 25(1):20–32

Meyerson D, Weick KE, Kramer RM (1996) Swift trust and temporary groups. In: Kramer RM, Tyler TR (eds) Trust in organizations: frontiers of theory and research. Sage, Thousand Oaks, pp 166–195

Microsoft (2017) Skype. http://www.skype.com/en/

Microsoft (2020) Microsoft teams. https://products.office.com/en-us/microsoft-teams/group-chat-software

Milliken FJ, Martins LL (1996) Searching for common threads: understanding the multiple effects of diversity in organizational groups. Acad Manag Rev 21(2):402–433

Montoya MM, Massey AP, Hung YTC, Crisp CB (2009) Can you hear me now? Communication in virtual product development teams. J Prod Innov Manag 26(2):139–155

Mortensen M, Hinds PJ (2001) Conflict and shared identity in geographically distributed teams. Int J Confl Manag 12(3):212–238

Navimipour NJ, Charband Y (2016) Knowledge sharing mechanisms and techniques in project teams: literature review, classification, and current trends. Comput Hum Behav 62:730–742

Neuliep JW (2020) Intercultural communication: a contextual approach. Sage, Thousand Oaks

Newman SA, Ford RC, Marshall GW (2019) Virtual team leader communication: employee perception and organizational reality. Int J Bus Commun. https://doi.org/10.1177/2329488419829895

Nguyen-Duc A, Cruzes D, Conradi R (2012) Dispersion, coordination and performance in global software teams: a systematic review. In: Proceedings of ESEM’12. ACM, New York, pp 129–138

Nguyen-Duc A, Cruzes DS, Conradi R (2015) The impact of global dispersion on coordination, team performance and software quality—a systematic literature review. Inf Softw Technol 57:277–294

Noll J, Beecham S, Richardson I (2010) Global software development and collaboration: barriers and solutions. ACM Inroads 1(3):66–78

O’Hara-Devereaux M, Johansen R (1994) Globalwork: bridging distance, culture, and time. Jossey-Bass Pub, San Francisco

O’Leary MB, Cummings JN (2007) The spatial, temporal, and configurational characteristics of geographic dispersion in teams. MIS Q 31(3):433–452

O’Leary MB, Mortensen M (2010) Go (con) figure: subgroups, imbalance, and isolates in geographically dispersed teams. Organ Sci 21(1):115–131

O’Leary MB, Wilson JM, Metiu A (2012) Beyond being there: the symbolic role of communication and identification in the emergence of perceived proximity in geographically dispersed work. ESSEC working paper 1112

Olson G, Ackerman M, Atkins D, Bos N, Derrick C, Cohen M, Finholt T, Furnas G, Hedstrom M, Herbsleb J, Myers J, Olson J, Prakash A, Radev D, Teasley S, Trimble J, Weymouth T, Yakel E, Zimmerman A, Cooney D, Hardin J, Hofer E, Knoop P, Peters G, Verhey-Henke A, Bietz M, Birnholtz J, Luo A, Potter A, Puetz M, Yew J (2006) Science of collaboratories. http://soc.ics.uci.edu/

Olson GM, Olson JS (2000) Distance matters. Hum Comput Interact 15(2):139–178

Olson GM, Zimmerman A, Bos N (2008) Scientific collaboration on the Internet. The MIT Press, Cambridge

Olson JS, Olson GM (2006) Bridging distance: empirical studies of distributed teams. In: Proceedings of human factors in MIS’06, vol 2, pp 27–30

Olson JS, Olson GM (2013) Working together apart: collaboration over the internet. Synth Lect Hum Center Inform 6(5):1–151

O’Reilly CA, Williams KY, Barsade S (1997) Demography and group performance: does diversity help? Graduate School of Business, Stanford University, Stanford

Orlikowski WJ (2002) Knowing in practice: enacting a collective capability in distributed organizing. Organ Sci 13(3):249–273

Otjacques B, McCall R, Feltz F (2006) An ambient workplace for raising awareness of internet-based cooperation. In: Proceedings of CDVE’06. LNCS, London, pp 275–286

O’Neill TA, Hancock SE, Zivkov K, Larson NL, Law SJ (2016) Team decision making in virtual and face-to-face environments. Group Decis Negot 25(5):995–1020

Pan RK, Kaski K, Fortunato S (2012) World citation and collaboration networks: uncovering the role of geography in science. Sci Rep 2:902

Parreira MR, Machado KB, Logares R, Diniz-Filho JAF, Nabout JC (2017) The roles of geographic distance and socioeconomic factors on international collaboration among ecologists. Scientometrics 113(3):1539–1550

Patel H, Pettitt M, Wilson JR (2012) Factors of collaborative working: a framework for a collaboration model. Appl Ergon 43(1):1–26

Paul DL, McDaniel RR Jr (2004) A field study of the effect of interpersonal trust on virtual collaborative relationship performance. MIS Q 28:183–227

Pearce WB (1974) Trust in interpersonal communication. CM 41(3):236–244

Pelled LH (1996) Demographic diversity, conflict, and work group outcomes: an intervening process theory. Organ Sci 7(6):615–631

Pelled LH, Eisenhardt KM, Xin KR (1999) Exploring the black box: an analysis of work group diversity, conflict and performance. Adm Sci Q 44(1):1–28

Peñarroja V, Orengo V, Zornoza A, Hernández A (2013) The effects of virtuality level on task-related collaborative behaviors: the mediating role of team trust. Comput Hum Behav 29(3):967–974

Peñarroja V, Orengo V, Zornoza A (2017) Reducing perceived social loafing in virtual teams: the effect of team feedback with guided reflexivity. J Appl Soc Psychol 47(8):424–435

Pinjani P, Palvia P (2013) Trust and knowledge sharing in diverse global virtual teams. Inf Manag 50(4):144–153

Pivotal Software (2017) Agile project management. https://www.pivotaltracker.com/

Polzer JT, Crisp CB, Jarvenpaa SL, Kim JW (2006) Extending the faultline model to geographically dispersed teams: how colocated subgroups can impair group functioning. Acad Manag J 49(4):679–692

Ponds R, Van Oort F, Frenken K (2007) The geographical and institutional proximity of research collaboration. Pap Reg Sci 86(3):423–443

Rains SA (2005) Leveling the organizational playing field-virtually: a meta-analysis of experimental research assessing the impact of group support system use on member influence behaviors. Commun Res 32(2):193–234

Ramasubbu N, Cataldo M, Balan RK, Herbsleb JD (2011) Configuring global software teams: a multi-company analysis of project productivity, quality, and profits. In: Proceedings of ICSE’11. ACM, New York, pp 261–270

Raymond E (1999) Homesteading the Noosphere, the Cathedral, and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary. O’Reilly & Associates, Sebastopol

Robert LP (2016) Far but near or near but far? The effects of perceived distance on the relationship between geographic dispersion and perceived diversity. In: Proceedings of CHI’16. ACM, New York, pp 2461–2473. https://doi.org/10.1145/2858036.2858534

Robert LP, Dennis AR, Hung YTC (2009) Individual swift trust and knowledge-based trust in face-to-face and virtual team members. J Manag Inf Syst 26(2):241–279

Robert LP Jr, You S (2018) Are you satisfied yet? Shared leadership, individual trust, autonomy, and satisfaction in virtual teams. J Assoc Inf Sci Technol 69(4):503–513

Rusman E, Van Bruggen J, Sloep P, Koper R (2010) Fostering trust in virtual project teams: towards a design framework grounded in a trustworthiness antecedents (TWAN) schema. Int J Hum Comput Stud 68(11):834–850

Sarker S, Sahay S (2004) Implications of space and time for distributed work: an interpretive study of US–Norwegian systems development teams. Eur J Inf Syst 13(1):3–20

Sarker S, Ahuja M, Sarker S, Kirkeby S (2011) The role of communication and trust in global virtual teams: a social network perspective. J Manag Inf Syst 28(1):273–310

Saunders C, Van Slyke C, Vogel DR (2004) My time or yours? Managing time visions in global virtual teams. Acad Manag Perspect 18(1):19–37

Schaubroeck JM, Yu A (2017) When does virtuality help or hinder teams? Core team characteristics as contingency factors. Hum Resour Manag Rev 27(4):635–647

Schmidt K (2002) The problem with “awareness”: introductory remarks on “awareness in CSCW”. Comput Supported Coop Work 11(3):285–298. https://doi.org/10.1023/A:1021272909573

Schmidt K, Bannon L (1992) Taking CSCW seriously. J CSCW 1(1–2):7–40

Schmidtke JM, Cummings A (2017) The effects of virtualness on teamwork behavioral components: the role of shared mental models. Hum Resour Manag Rev 27(4):660–677

Schneier CE, Goktepe JR (1983) Issues in emergent leadership: the contingency model of leadership, leader sex, leader behavior. Small Groups Soc Interact 1:413–421

Scott CPR, Wildman JL (2015) Culture, communication, and conflict: a review of the global virtual team literature. Springer, New York, pp 13–32

Scrumwise Inc (2017) The easiest scrum tool you’ll find. https://www.scrumwise.com/

See M (2018) 18 international collaboration: are the challenges worth the benefits? J Anim Sci 96(suppl–3):2–2

Siebdrat F, Hoegl M, Ernst H (2014) Subjective distance and team collaboration in distributed teams. J Prod Innov Manag 31(4):765–779

Slack (2017) Where work happens. https://slack.com/

Šmite D, Wohlin C, Gorschek T, Feldt R (2010) Empirical evidence in global software engineering: a systematic review. Empir Softw Eng 15(1):91–118

Sole D, Edmondson A (2002) Situated knowledge and learning in dispersed teams. Br J Manag 13(S2):S17–S34

Solomon C (2016) Trends in global virtual teams. https://www.rw-3.com/resource-center/2016-survey-report-trends-in-global-virtual-teams

Srivastava A, Bartol KM, Locke EA (2006) Empowering leadership in management teams: effects on knowledge sharing, efficacy, and performance. Acad Manag J 49(6):1239–1251

Stahl GK, Maznevski ML, Voigt A, Jonsen K (2010) Unraveling the effects of cultural diversity in teams: a meta-analysis of research on multicultural work groups. J Int Bus Stud 41(4):690–709

Staples DS, Zhao L (2006) The effects of cultural diversity in virtual teams versus face-to-face teams. Group Decis Negot 15(4):389–406

Steinmacher I, Chaves AP, Gerosa MA (2013) Awareness support in distributed software development: a systematic review and mapping of the literature. Comput Supported Coop Work 22(2–3):113–158

Straub D, Loch K, Evaristo R, Karahanna E, Srite M (2002) Toward a theory-based measurement of culture. J Glob Inf Manag 10(1):13–23

Strauss A (1988) The articulation of project work: an organizational process. Sociol Q 29:163–178

Swigger K, Alpaslan F, Brazile R, Monticino M (2004) Effects of culture on computer-supported international collaborations. Int J Hum Comput Stud 60(3):365–380

Tang JC, Zhao C, Cao X, Inkpen K (2011) Your time zone or mine? A study of globally time zone-shifted collaboration. In: Proceedings of CSCW’11. ACM, New York, pp 235–244

Tangirala S, Alge BJ (2006) Reactions to unfair events in computer-mediated groups: a test of uncertainty management theory. Organ Behav Hum Decis Process 100(1):1–20

Taras V, Kirkman BL, Steel P (2010) Examining the impact of culture’s consequences: a three-decade, multilevel, meta-analytic review of Hofstede’s cultural value dimensions. J Appl Psychol 95(3):405

Teasley S, Covi L, Krishnan MS, Olson JS (2000) How does radical collocation help a team succeed? In: Proceedings of CSCW’00. ACM, New York, pp 339–346

Tenopir C, Allard S, Douglass K, Aydinoglu AU, Wu L, Read E, Manoff M, Frame M (2011) Data sharing by scientists: practices and perceptions. PLoS ONE 6(6):e21101

Tenzer H, Pudelko M, Harzing AW (2014) The impact of language barriers on trust formation in multinational teams. J Int Bus Stud 45(5):508–535

Tran H, Zdun U et al (2017) Systematic review of software behavioral model consistency checking. ACM Comput Surv 50(2):17

Treinen JJ, Miller-Frost SL (2006) Following the sun: case studies in global software development. IBM Syst J 45(4):773–783

Trello Inc (2017) Trello. https://trello.com/

Tress G, Tress B, Fry G (2007) Analysis of the barriers to integration in landscape research projects. Land Use Policy 24(2):374–385

Triandis HC, Singelis TM (1998) Training to recognize individual differences in collectivism and individualism within culture. Int J Intercult Relat 22(1):35–47

Triandis HC, Bontempo R, Villareal MJ, Asai M, Lucca N (1988) Individualism and collectivism: cross-cultural perspectives on self-ingroup relationships. J Pers Soc Psychol 54(2):323

Umphress EE, Smith-Crowe K, Brief AP, Dietz J, Watkins MB (2007) When birds of a feather flock together and when they do not: status composition, social dominance orientation, and organizational attractiveness. J Appl Psychol 92(2):396

Vaccaro A, Veloso F, Brusoni S (2009) The impact of virtual technologies on knowledge-based processes: an empirical study. Res Policy 38(8):1278–1287

Van den Bulte C, Moenaert RK (1998) The effects of R&D team co-location on communication patterns among R&D, marketing, and manufacturing. Manag Sci 44(11, part 2):S1–S18

van Solingen R, Basili V, Caldiera G, Rombach HD (2002) Goal question metric (GQM) approach. In: Marciniak JJ (ed) Encyclopedia of software engineering. https://doi.org/10.1002/0471028959.sof142

Van Weijen D (2012) The language of (future) scientific communication. Res Trends 31(11):2012

Wakefield RL, Leidner DE, Garrison G (2008) Research note—a model of conflict, leadership, and performance in virtual teams. Inf Syst Res 19(4):434–455

Walsh JP, Maloney NG (2007) Collaboration structure, communication media, and problems in scientific work teams. J Comput Mediat Commun 12(2):712–732

Walther JB, Bunz U (2005) The rules of virtual groups: trust, liking, and performance in computer-mediated communication. J Commun 55(4):828–846

Warkentin ME, Sayeed L, Hightower R (1997) Virtual teams versus face-to-face teams: an exploratory study of a web-based conference system. Decis Sci 28(4):975–996

Watson WE, Kumar K, Michaelsen LK (1993) Cultural diversity’s impact on interaction process and performance: comparing homogeneous and diverse task groups. Acad Manag J 36(3):590–602

Watson-Manheim MB, Chudoba KM, Crowston K (2002) Discontinuities and continuities: a new way to understand virtual work. Inf Technol People 15(3):191–209

Watson-Manheim MB, Chudoba KM, Crowston K (2012) Perceived discontinuities and constructed continuities in virtual work. Inf Syst J 22(1):29–52

Weinel M, Bannert M, Zumbach J, Hoppe HU, Malzahn N (2011) A closer look on social presence as a causing factor in computer-mediated collaboration. Comput Hum Behav 27(1):513–521

Wiersema MF, Bantel KA (1992) Top management team demography and corporate strategic change. Acad Manag J 35(1):91–121

Williams K, O’Reilly C III (1998) Demography and diversity in organisations: a review of 40 years of research. In: Staw BM, Cummings LL (eds) Research in organisational behaviour. JAI Press, Greenwich

Wilson JM, Boyer O’Leary M, Metiu A, Jett QR (2008) Perceived proximity in virtual work: explaining the paradox of far-but-close. Organ Stud 29(7):979–1002

Zander L, Zettinig P, Mäkelä K (2013) Leading global virtual teams to success. Org Dyn 42(3 SI):228–237

Zellmer-Bruhn ME, Gibson CB (2013) How does culture matter. In: Yuki M, Brewer M (eds) Culture and group processes, p 166. https://books.google.com/books?hl=en&lr=&id=DtI8BAAAQBAJ&oi=fnd&pg=PA166&dq=Zellmer-Bruhn+ME,+Gibson+CB+(2013)+How+does+culture+matter.+In:+Culture+and+group+processes,+p+166&ots=wE-qqLV173&sig=svs8MQKVi40vMB_fixB86FyRmdQ#v=onepage&q&f=false

Zolin R, Hinds PJ, Fruchter R, Levitt RE (2004) Interpersonal trust in cross-functional, geographically distributed work: a longitudinal study. Inf Organ 14(1):1–26

Author information

Authors and affiliations.

Barnard College, New York, NY, USA

Sarah Morrison-Smith

University of Florida, Gainesville, FL, USA

Corresponding author

Correspondence to Sarah Morrison-Smith.

Ethics declarations

Conflict of interest.

The authors declare no conflict of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Tables 2, 3, 4, 5, 6, 7 and 8.

About this article

Morrison-Smith, S., Ruiz, J. Challenges and barriers in virtual teams: a literature review. SN Appl. Sci. 2, 1096 (2020). https://doi.org/10.1007/s42452-020-2801-5

Received: 09 September 2019

Accepted: 22 April 2020

Published: 20 May 2020

DOI: https://doi.org/10.1007/s42452-020-2801-5

  • Collaboration
  • Virtual teams
  • Literature review

COMMENTS

  1. Designing Immersive Virtual Environments for Human Behavior Research

    Introduction The Influence of Surrounding Environments on Behavior: Research Limitations. Our surrounding physical environment can influence behavior (Waterlander et al., 2015) as it "affords" (per Gibson, 1979) the activities of the broader social, political, and cultural world. By understanding how our surrounding environment affects occupants, researchers can identify evidence-based ...

  2. (PDF) Using Technologies as Virtual Environments for ...

    This systematic review aims to identify applications that use technologies to represent virtual environments and support the teaching and learning of Computer Science subjects. A protocol was ...

  3. Immersive Virtual Environment Technology to Supplement Environmental

    The paper also describes a relatively simple workflow for creating and displaying 360° virtual environments of built and natural settings and presents two freely-available and customizable applications that scientists from a variety of disciplines, including public health, can use to advance their research into human preferences, perceptions ...

  4. Effects of virtual learning environments: A scoping review of

    The purpose of this scoping review is to isolate and investigate the existing data and research that identifies if the synchronous face-to-face visual presence of a teacher in a virtual learning environment (VLE) is a significant factor in a student's ability to maintain good mental health. While the present research on this explicit interaction among VLE implementation and student mental ...

  5. Virtual, mixed, and augmented reality: a systematic review for

    2.1 Immersion "Immersion" and "presence" are important concepts for research in immersive systems. Nilsson et al. note that "the term immersion continues to be applied inconsistently within and across different fields of research connected with the study of virtual reality and interactive media." This observation is confirmed by our review of the literature.

  6. Effects of immersive virtual nature on nature connectedness: A

    Terms such as "immersive virtual environment," "natural setting*," and "contact with nature" were searched in Scopus, Web of Science, Google Scholar, Medline, and GreenFILE (22–28 November 2021). Papers in English, describing experi- ... the existing literature and recommendations for future research. Keywords Nature connectedness ...

  7. Effects of exposure to immersive computer-generated virtual ...

    Previous research has shown that exposure to immersive virtual nature environments is able to induce positive affective and physiological effects. However, research on the effects on cognitive ...

  8. Full article: Immersive virtual reality for science learning: Design

    The advanced visualisation and interactive capabilities make immersive virtual reality (IVR) attractive for educators to investigate its educational benefits. This research reviewed 64 studies published in 2016-2020 to understand how science educators designed, implemented, and evaluated IVR-based learning.

  9. Augmented reality and virtual reality displays: emerging ...

    With rapid advances in high-speed communication and computation, augmented reality (AR) and virtual reality (VR) are emerging as next-generation display platforms for deeper human-digital ...

  10. How Virtual Reality Technology Has Changed Our Lives: An Overview of

    Virtual reality (VR) refers to a computer-generated, three-dimensional virtual environment that users can interact with, typically accessed via a computer that is capable of projecting 3D information via a display, which can be isolated screens or a wearable display, e.g., a head-mounted display (HMD), along with user identification sensors.

  11. Immersive Environments and Virtual Reality: Systematic Review and

    Today, virtual reality and immersive environments are lines of research which can be applied to numerous scientific and educational domains. Immersive digital media needs new approaches regarding ...

  12. Immersive virtual reality as a pedagogical tool in education: a

    The adoption of immersive virtual reality (I-VR) as a pedagogical method in education has challenged the conceptual definition of what constitutes a learning environment. High fidelity graphics and immersive content using head-mounted-displays (HMD) have allowed students to explore complex subjects in a way that traditional teaching methods cannot. Despite this, research focusing on learning ...

  13. Social Interaction With Agents and Avatars in Immersive Virtual

    Immersive virtual reality technologies are used in a wide range of fields such as training, education, health, and research. Many of these applications include virtual humans that are classified into avatars and agents. An overview of the applications and the advantages of immersive virtual reality and virtual humans is presented in this survey, as well as the basic concepts and terminology.

  14. A systematic review of immersive virtual reality applications for

    Immersion describes the involvement of a user in a virtual environment during which his or her awareness of time and the real world often becomes disconnected, thus providing a sense of "being" in the task environment instead. ... [45] point out that a mapping study reviews a broader topic and classifies the primary research papers within ...

  15. Effects of virtual learning environments: A scoping review of

    Abstract. The purpose of this scoping review is to isolate and investigate the existing data and research that identifies if the synchronous face-to-face visual presence of a teacher in a virtual learning environment (VLE) is a significant factor in a student's ability to maintain good mental health. While the present research on this ...

  16. The use of virtual reality in environment experiences and the

    Research comparing environmental experiences in a real environment with a two-dimensional (2D) video of the same environment and a 3D virtual simulation of that environment is rare (e.g., Palanica et al., 2019) and we have found no examples of this type of research within the context of environmental restoration and stress. Additionally, no ...

  17. Immersive virtual reality: An effective strategy for reducing stress in

    VR platforms are becoming increasingly affordable and accessible as an intervention, and there has been an increasing amount of literature evaluating the effectiveness of immersive VR for a range of mental health disorders (Valmaggia et al., 2016). It has been recognised that immersive VR effectively reduces stress levels in individuals through the use of natural virtual environments (Anderson ...

  18. PDF Using Technologies as Virtual Environments for Computer Teaching: A

    Keywords: computer science learning, virtual environment, virtual reality, augmented reality, mixed reality, systematic review. 1. Introduction. Educational institutions have been reviewing the use of traditional teaching methods and are focusing on more productive ways to increase students' intellectual experience (Martín-Gutiérrez et al., 2015).

  19. Virtual Research Environments: An Overview and a Research Agenda

    Abstract. Virtual Research Environments are innovative, web-based, community-oriented, comprehensive, flexible, and secure working environments conceived to serve the needs of modern science. We ...

  20. The Impact of Virtual Reality in Education: A Comprehensive Research

    This research paper aims to investigate the use of VR technologies in the field of education, exploring their potential benefits, challenges, and implications. ... By immersing students in a virtual environment, VR can provide unique and interactive learning experiences that go beyond traditional methods. VR allows students to step into a ...

  21. Full article: Teaching and learning using virtual labs: Investigating

    While research suggests that Virtual Labs may improve academic performance, the impact on the students' independence is limited. ... a virtual environment, ... Afacan, Y., & Sürer, E. (2021). Usability of virtual reality for basic design education: A comparative study with paper-based design. International Journal of Technology and Design ...

  22. (PDF) The Influence of Virtual Learning Environments in Students

    This paper focuses mainly on the relation between the use of a virtual learning environment (VLE) and students' performance. Therefore, virtual learning environments are characterised and a study ...

  23. Challenges and barriers in virtual teams: a literature review

    Virtual teams (i.e., geographically distributed collaborations that rely on technology to communicate and cooperate) are central to maintaining our increasingly globalized social and economic infrastructure. "Global Virtual Teams" that include members from around the world are the most extreme example and are growing in prevalence (Scott and Wildman in Culture, communication, and conflict ...