
AR, VR, MR, XR, Metaverse and More, what does it all mean in 2024? | by Muadh Al Kalbani | Samsung Internet Developers


An artistic arrangement of stars in the shape of a person wearing an XR headset and holding controllers.
Photo by julien Tromeur on Unsplash

The field of immersive technologies is ever-evolving and extremely fast-paced due to the significant technological advances in hardware and software in recent years. These advances are significantly increasing what immersive technologies can offer to developers and end users, but they have also resulted in a great deal of ambiguity and confusion over terms such as Augmented Reality (AR), Virtual Reality (VR), Mixed Reality (MR), Metaverse and more.

In this post I try to demystify some of that confusion around the different terms for immersive “realities”: where they come from, how they are classified and why they keep changing over time. My goal is to provide a basic understanding of what each of these terms means that you can fall back on in this ever-evolving field. Before we jump in, there are a couple of disclaimers that I need to make:

  1. As much as I want to, I will not be covering the historical timeline of immersive technologies since the beginning of time. While I fully appreciate the great minds behind creative pieces of work such as Star Wars, Star Trek, The Matrix and others that gave us glimpses into an immersive future, and at times predicted it to a high degree of accuracy, this article will be focused on the science behind these terms and how they came to life based on research and scientific developments in this field (sorry Neo, an absolute childhood hero of mine). To this end, you won’t find the term “coined” being used much here because identifying precedence in presenting these terms isn’t the focus of this post.
  2. Reaching consensus on how to classify these terms and technologies is a near-impossible task, and it isn’t my goal here, because revising terms and pushing the boundaries of our understanding of immersive technologies will be a constant effort by all stakeholders invested in this field. And that is great, as we shouldn’t ever stop pushing those boundaries and having these conversations.

Now that we’ve touched on the whirlwind pace of advancement in immersive technologies and the resulting maze of terminology, let’s delve into the heart of the matter: understanding the core definitions that underpin this ever-evolving landscape. To navigate this, it’s essential to establish a solid foundation by unravelling the nuances of terms such as AR, VR, and MR.

In 1994 and 1995, two papers titled “A Taxonomy of Mixed Reality Visual Displays” (by Paul Milgram and Fumio Kishino) and “Augmented Reality: A class of displays on the reality-virtuality continuum” (by Paul Milgram, Haruo Takemura, Akira Utsumi and Fumio Kishino) introduced what is known as the “Reality-Virtuality Continuum”, along with a taxonomy that classifies different display types that blend real and digital worlds. You may have seen the illustration of this continuum below, or redesigned versions of it, at some point during your immersive journey.

A depiction of the Reality-Virtuality Continuum that spans from real environments (left) to virtual reality environments (right), passing through Augmented Reality and Augmented Virtuality, both of which are highlighted as Mixed Reality environments. Captions defining these terms are repeated in the article below.
Reality-Virtuality Continuum as introduced by Milgram et al., in 1994

The continuum spans between two extremes:

  1. Reality: our real, physical world.
  2. Virtual Reality (VR): a completely virtual, synthetic, computer-generated environment.

In between these extremes lie the following realities:

  • Augmented Reality (AR): overlaying virtual objects on a real environment (where the majority of the environment is real).
  • Augmented Virtuality (AV): overlaying real objects on a virtual environment (where the majority of the environment is virtual).
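
To make the continuum concrete, here is a minimal TypeScript sketch that classifies an experience by the fraction of the user’s view that is computer-generated. The numeric 0.5 boundary is my own illustrative assumption standing in for “the majority of the environment is real/virtual”; it is not part of Milgram et al.’s work.

```typescript
// Points on the Reality-Virtuality Continuum.
type ContinuumPoint = "Reality" | "AR" | "AV" | "VR";

// Classify an experience by the fraction of the view that is
// computer-generated (0 = fully real, 1 = fully virtual).
// The 0.5 cut-off between AR and AV is an illustrative assumption.
function classify(virtualFraction: number): ContinuumPoint {
  if (virtualFraction <= 0) return "Reality";
  if (virtualFraction >= 1) return "VR";
  return virtualFraction < 0.5 ? "AR" : "AV";
}

console.log(classify(0.2)); // "AR": mostly real, some virtual overlays
console.log(classify(0.8)); // "AV": mostly virtual, some real elements
```
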

AV is usually the least well-known category on the continuum, and I almost always get the “what is that?” question whenever the notion of AV shows up. This confusion around AV is actually warranted: ever since the introduction of the continuum, most research and development efforts have focused on complete VR or AR, as they presented the most exciting opportunities and use cases. Accordingly, most devices at the time were geared towards either full VR or AR, not towards producing true AV experiences (e.g. segmenting and showing your real arms and hands in a full VR environment). This resulted in a lack of AV use cases and prototypes, and AV became a less-adopted term among developers and users.

So far, the premise of the continuum is simple and straightforward: you move from the left-hand side of the spectrum to the right-hand side as you add more and more virtual elements to your view of the real world, and you eventually reach complete VR when your entire view is virtual.

In this continuum, MR encompasses everything between AR and AV, so, perhaps surprisingly, it’s not actually a separate category in this classification. As per the diagram above, MR is defined by Milgram et al. as combining real and virtual elements in a single display, meaning that any immersive experience where you can see real and virtual elements on/through a single display or device is MR. At this point you might be thinking: how can we classify, for example, a simple model-viewer AR experience on a mobile phone and an advanced, “smarter”, spatial experience using a state-of-the-art headset under the same “MR” umbrella? Surely they’re different? There are two answers to that question:

  1. Regardless of complexity, it’s important to keep in mind that both experiences are viewed through some kind of display/monitor/screen. Granted, the displays may differ in complexity, but they’re still displays.
  2. This is where the second, and arguably more important, part of Milgram et al.’s work comes into play…

The continuum is arguably the most popular illustration in AR/VR/MR literature, and while it’s valuable in simplifying the concept of MR, it also often leads to overlooking the main contribution of this work: a taxonomy for classifying different displays and realities. To put it in the words of the authors, when referring to the continuum:

“…its one dimensionality is too simple to highlight the various factors which distinguish one AR/AV system from another” — Milgram, Paul, et al. “Augmented reality: A class of displays on the reality-virtuality continuum”.

In addition to the continuum, Milgram et al. introduced a taxonomy that classifies different display types and realities based on three key parameters, which is essential to bringing our understanding of these terms home (brace yourself for 1994–1995 lingo with some 2024 contextualisation):

  1. Extent of World Knowledge: the extent to which the device/system recognises its real environmental conditions and is able to react to alterations in these conditions. A simple AR experience on a mobile phone may be able to detect simple planes and walls in a room (low extent of world knowledge), but a more advanced AR/VR headset would be able to track hands, eyes, handle occlusion and re-arrange virtual objects based on real environment changes (high extent of world knowledge).
  2. Reproduction Fidelity: the quality with which the display/system can recreate the virtual and real components of an environment/experience. This covers both virtual and real parts of a scene (i.e., how realistic the virtual elements are, and the quality of the real environment in the case of pass-through displays or mobile phone cameras, for example). It’s important to note here that the quality of virtual content is context-dependent and is usually controlled by the developer; some aim for super-realistic content while others may willingly opt for lower-quality experiences. The quality of the real environment is usually bounded by the limitations of current technologies (i.e., how good the cameras, depth sensors etc. are).
  3. Extent of Presence Metaphor: the degree to which the user is supposed to feel immersed and present in the environment/experience. This is where the device type comes into play: a simple model-viewer AR experience on a mobile phone or a 360 video on YouTube will have low presence, whereas an MR game on a headset that is aware of your surroundings would have high presence.
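
One way to picture the taxonomy is as a data structure with the three parameters as fields. Here is a minimal TypeScript sketch; the low/medium/high scale and the example ratings are my own illustrative assumptions (Milgram et al. describe these parameters as continuous axes, not discrete levels):

```typescript
// The three classification parameters from Milgram et al.'s taxonomy,
// graded on a simple discrete scale for illustration.
type Level = "low" | "medium" | "high";

interface DisplayTaxonomy {
  extentOfWorldKnowledge: Level;   // how much the system understands its real surroundings
  reproductionFidelity: Level;     // how well it recreates virtual and real content
  extentOfPresenceMetaphor: Level; // how immersed the user is meant to feel
}

// A simple model-viewer AR experience on a phone (example ratings assumed):
const phoneModelViewer: DisplayTaxonomy = {
  extentOfWorldKnowledge: "low",   // basic plane/wall detection only
  reproductionFidelity: "medium",  // camera feed plus a simple 3D model
  extentOfPresenceMetaphor: "low", // viewed on a handheld screen
};

// A modern surroundings-aware MR headset (example ratings assumed):
const mrHeadset: DisplayTaxonomy = {
  extentOfWorldKnowledge: "high",   // hand/eye tracking, occlusion, scene changes
  reproductionFidelity: "high",     // high-quality pass-through and rendering
  extentOfPresenceMetaphor: "high", // head-mounted and surroundings-aware
};

console.log(phoneModelViewer, mrHeadset);
```

Note how both examples fall under the same “MR” umbrella on the continuum, yet the taxonomy separates them cleanly along its three parameters.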

By coupling the continuum and the taxonomy, you can now effectively classify any display type or experience that blends reality and virtuality, and, more importantly, understand the strengths and limitations of what it can do in terms of real-world knowledge, realism and presence.

So why do these terms keep changing? I have some bad news and some great news. The bad news is that terms will most likely keep changing; the great news is that terms changing (or evolving) is actually really great news! For some context, early AR/VR developments used to be confined to research and academic labs for very specialised applications such as military, medical imaging and remote exploration. We have come a long way since then, with immersive technologies now becoming much more inclusive, more commonplace, more affordable, more socially acceptable and, most importantly, invested in by many stakeholders. Here are some examples you may have heard of or come across over the years that illustrate why immersive terms sometimes change, or why entirely new ones at times become more popular:

  • Extended Reality (XR): became a popular umbrella term to unify different realities on the continuum, making it easier for the consumers of the technology to understand and use.
  • Assisted Reality: became an adopted term due to the use of AR for providing on-site instructions and real-time data for maintenance and site/equipment monitoring applications to “assist” workers in the field.
  • Diminished Reality: becoming more popular due to its use in improving accessibility by removing virtual, audio and/or real elements from an immersive experience to lower immersion and allow some level of customisation. It can also be used for user protection in immersive environments (e.g. blurring explicit content).
  • Metaverse: popularised due to the merging of VR with various other emerging technologies such as blockchain/cryptocurrency, haptics, spatial audio and others to reimagine social interaction in virtual spaces.

This evolution of terms does not necessarily invalidate the continuum and taxonomy, because the underlying technology (or “display”) remains the same; it simply showcases a great level of maturity and ownership from the different stakeholders involved in using and developing immersive technologies. As users, developers, industry leaders, researchers and scientists, we are now steering this ship together: its applications, its acceptance and what we should call it at different points in time.

The work of Milgram and co. has always been (and still is) the true north for me when it comes to classifying immersive technologies throughout my journey as a user and developer. The reason is that they anchored their classification to an absolute constant of immersive technologies: the display. This is key to forming a baseline understanding of these terms (and potential future ones), because the taxonomy is not concerned with user input, the impact of the different technologies on user perceptions, or the coupling of these environments/realities with other technologies (it may be argued that a classification of realities should take those parameters into account, but that’s a philosophical battle for another day; see works [1]-[2], which provide great insights on this topic in their revisions of the continuum).

Immersive experiences are always viewed through a certain display. This could be a simple mobile/PC screen or the most advanced headset, but there is always a display/screen/mirror through which the computer-generated content is viewed. The strength of the taxonomy and its accompanying continuum lies in this simplicity.
