Video Clips of Augmented Reality System in Operation
The links below lead to video files showing the Augmented Reality system in operation. The files are in MPEG format, captured at a frame rate of 15 frames/second.
A note about the videos available here: our method relies on tracking features
in the scene and uses those features to create an affine coordinate system in which the
virtual objects are represented. These clips come from two implementations of the
system. The first used the corners of two black rectangles as the tracked feature points;
the second used green colored markers. If there are green markers in the image, you are
watching the second implementation. High-contrast rectangular areas still appear in
those images, but they were not used as tracking features. A sketch of the affine
representation itself follows.
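The following is a minimal sketch of that affine representation, in Python with NumPy. The function names and arguments are ours, chosen for illustration, not the original system's API: four non-coplanar tracked points define an affine basis; a virtual point is expressed once in that basis, and in each video frame it is reprojected as the same affine combination of the basis points' tracked 2D image locations, which holds under an affine camera model.

```python
import numpy as np

def affine_coords(p, basis):
    """Express a 3D point p in the affine frame spanned by four
    non-coplanar basis points: an origin b0 and three others."""
    b0, b1, b2, b3 = basis
    M = np.column_stack([b1 - b0, b2 - b0, b3 - b0])  # 3x3 basis matrix
    return np.linalg.solve(M, p - b0)                 # (a1, a2, a3)

def reproject(coords, image_basis):
    """Project a point with affine coordinates `coords` into the image,
    given the tracked 2D image locations of the four basis points.
    Under an affine camera, projection is linear, so the same affine
    combination that defines the point in 3D holds among the 2D
    projections of the basis points."""
    q0, q1, q2, q3 = image_basis
    A = np.column_stack([q1 - q0, q2 - q0, q3 - q0])  # 2x3
    return q0 + A @ np.asarray(coords)
```

A virtual object is authored once via `affine_coords`; per video frame, only the tracked 2D locations of the feature points are needed to redraw it. This is the appeal of the affine formulation: no camera calibration or Euclidean pose estimation enters the loop.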
- Overall view 1 (640 kbytes), Overall view
2 (954 kbytes)
- This shows the overall view of the augmented reality system. The frame with the
two black rectangles is used to define the affine reference frame. The monitor shows
the augmented view of the scene: initially no object is shown, and then a globe
appears within the frame. The image on the video monitor may not be very clear; for video
clips of just the augmented view, follow the links below.
- Basic operation
- This shows the basic operation of the augmented reality system. A virtual object is
positioned on the frame. It appears to stay fixed to the real object as that object is
moved around in front of the video camera. Slight movements of the virtual object are due
to inaccuracies in feature tracking and delays in the system.
The arrangement of the feature points in the previous video segments was chosen purely for
convenience: it allows automatic location of the object on the L-frame. This
is not a requirement of the method. These two segments illustrate augmenting a scene
where the feature points are placed in a more arbitrary arrangement.
- Construction example (839 kbytes) This is a two-dimensional
example from construction: a blueprint is overlaid on an area of
a wall to give an augmented view of the interior of the wall. Distortions of the
blueprint due to our affine approximation can be seen, as can improper
occlusions of foreground objects by the virtual blueprint.
- Animation in affine space (585 kbytes)
- An important feature of an augmented reality system would be the ability to animate the
virtual objects. This clip shows a virtual cube whose translation is animated in the
affine coordinate frame; a small sketch of this follows.
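Animating in affine space amounts to varying an object's affine coordinates over time and reprojecting with the current tracked basis. A tiny illustration building on the hypothetical `reproject` sketch above (the velocity value is arbitrary):

```python
import numpy as np

def animated_position(base_coords, t, velocity=np.array([0.05, 0.0, 0.0])):
    """Translate a virtual point within the affine frame by offsetting
    its affine coordinates; the result is fed to reproject() together
    with the current frame's tracked basis points."""
    return np.asarray(base_coords) + t * velocity
```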
- Handling occlusions
- This method of augmenting reality uses the computer graphics system to resolve hidden
surfaces in the virtual objects and to properly handle virtual objects occluding other
virtual objects. Because of the way the virtual scene is merged with the live
video scene, a virtual object drawn at a particular pixel location will always occlude the
live video at that pixel location. By defining real objects in the affine coordinate
system, real objects that are closer to the viewer in 3D space can correctly occlude a
virtual object, as illustrated in the sketch below.
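One standard way to get this behavior, sketched here as an assumption rather than a description of the original implementation, is a depth-only pass: models of the real objects (expressed in the same affine coordinates as the virtual ones) are rendered with color writes disabled so they populate only the depth buffer. In Python with PyOpenGL:

```python
from OpenGL.GL import (GL_DEPTH_TEST, GL_FALSE, GL_TRUE,
                       glColorMask, glEnable)

def draw_augmented_frame(draw_real_object_models, draw_virtual_objects):
    """Depth-only occlusion pass (illustrative names, not the original
    system's API). Real-object stand-ins write depth but no color, so
    the live video already on screen shows through; virtual objects
    drawn afterwards are clipped wherever real geometry is nearer."""
    glEnable(GL_DEPTH_TEST)
    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE)  # depth only
    draw_real_object_models()
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE)      # normal drawing
    draw_virtual_objects()
```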
-
- Dealing with latency
- Latency is as much of a problem in augmented reality systems as it is in virtual reality
systems. Other than simply using faster equipment, some researchers are
investigating predictive methods to help mitigate latency effects. Most of these
efforts use models of the human operator together with position measurements to predict
forward in time. Our system does not have position measurements available. Instead, we
experimented with simple forward prediction on the locations of the feature points the
system tracks: we assumed constant-velocity motion in image space and performed a simple
first-order forward prediction. To filter some of the jitter introduced by noisy
feature trackers, we added Kalman filtering to the output of our color feature trackers.
A sketch of this prediction step follows.
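A minimal sketch of this step in Python with NumPy; the class, the 1/15-second time step, and the noise covariances are illustrative assumptions rather than the original system's values. A constant-velocity Kalman filter smooths the raw tracker output, and the filtered state drives a first-order prediction a few frames ahead:

```python
import numpy as np

class PredictiveTracker:
    """Constant-velocity Kalman filter over a feature's image position,
    plus first-order forward prediction to compensate system latency."""

    def __init__(self, pos, dt=1 / 15.0, q=1.0, r=4.0):
        self.dt = dt
        self.x = np.array([pos[0], pos[1], 0.0, 0.0])  # [x, y, vx, vy]
        self.P = np.eye(4) * 100.0                     # initial uncertainty
        self.F = np.eye(4)                             # motion model
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.eye(2, 4)                          # observe position only
        self.Q = np.eye(4) * q                         # process noise (assumed)
        self.R = np.eye(2) * r                         # tracker noise (assumed)

    def update(self, measured_pos):
        """Standard Kalman predict/update on one raw tracker measurement."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (np.asarray(measured_pos) - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P

    def predict_forward(self, n_frames=3):
        """First-order prediction: position + n_frames * dt * velocity."""
        return self.x[:2] + n_frames * self.dt * self.x[2:]
```

With the 70-90 msec latency reported below, `n_frames=3` corresponds to the three-frame prediction used in the filtering clip.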
- Filtering results (2.1 Mbytes) This segment shows three
sequences. The first is the unfiltered system with no prediction applied. The
second applies three frames of forward prediction. (We measured latencies in the
range of 70-90 msec, or 2 to 3 video frames.) The jitters are due to errors in
velocity computation caused by noise in the tracker output. The last sequence
shows the result of adding a Kalman filter to the feature tracker output prior to
the velocity calculation. The clock tower stays in position much better but still
exhibits some jerky motion.
- Registration test (631 kbytes) This segment shows how we
measured registration error. A real scene was constructed in black, with a nail in
the center; the tip of the nail was painted white. A virtual point
was placed at the tip of the nail, and a tracker was locked onto the tip of the nail in
the live video. As the L-frame was moved, the Euclidean distance between this tracked
location and the reprojected location of the virtual point (using our affine
representation) was calculated as the registration error. With our forward
prediction of three video frames we found a factor of 2 to 3 decrease in registration
error. The error computation is sketched below.
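In terms of the earlier affine sketch, the measurement reduces to a pixel distance; a hypothetical helper (our names, not the system's):

```python
import numpy as np

def registration_error(tracked_px, point_coords, image_basis):
    """Pixel distance between the tracked image location of the nail
    tip and the reprojection of the virtual point placed at that tip;
    reproject() is the affine reprojection sketched earlier."""
    predicted_px = reproject(point_coords, image_basis)
    return np.linalg.norm(np.asarray(tracked_px) - predicted_px)
```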
- Video see-through operation (831 kbytes)
- Because our system uses only the input from video cameras to define its common
coordinate system, switching to a video see-through head-mounted display (HMD) was as
simple as placing two cameras on the HMD. These cameras view the real scene in
stereo, and the augmented view is presented to the user on the display of the HMD.
In this sequence the monitor shows what the user is viewing.
- Haptics in Augmented Reality
- One of the areas that has not been investigated in augmented reality systems is the
incorporation of interaction with the virtual objects. We added haptic interaction
using a PHANToM haptic interface device manufactured by SensAble Technologies.
This interface allowed the user to operate the system in WYSIWYF mode (What You See
Is What You Feel).
- Touching the globe (1.2 Mbytes) Here the user is tracing
the coastline by feel. When the active point of the Phantom is over land, the
user feels a rough sensation; over water, the point sinks into the globe, simulating
the soft surface of the water. The proper registration of sight and touch is
maintained even while the globe is rotating.
- Spinning the globe (514 kbytes) The user has control of the
orientation of the globe. Whenever the user's finger is in contact with the globe, it
spins about its center.
- Cube hockey (2.1 Mbytes) The user is able to knock this
cube around in the workspace. The cube correctly collides with the vertical part of
the L-frame and can rest on top of it. It is also properly occluded by the frame
when it passes behind it. The user feels the weight of the cube when lifting it with the
"magnetic finger" and also senses the momentum of collisions between the cube and
the vertical wall.
- Handling occlusions (again)
- In the examples of haptic interaction with the virtual objects, it is easy to see that
the user's hand and the Phantom do not properly occlude the virtual objects when they
are in front; we do not model the hand or the Phantom in the virtual scene.
(Barely visible in these sequences is a red marker that was added to help the user
identify where the active point of the Phantom was located.)
It might be possible to define a model of the Phantom for the system, but the user's hand
and arm would still be a problem. We decided instead to explore foreground
detection using color statistics: at runtime, any area of the video image whose color is
statistically different from that of a background scene analyzed before operation begins
is assumed to represent foreground motion at the 3D depth of the Phantom. A sketch of
this test follows the clip descriptions below.
- Phantom plane (557 kbytes) This shows the plane on
which detected foreground activity is assumed to take place. Only a small
segment of the plane is shown; it is actually assumed to cover the entire scene.
- Foreground (2.9 Mbytes) The blue areas of the image
are where color statistically different from the empty scene was found. The
computation is performed only within the bounding box of the rendered graphics.
Finally, the augmented image is shown, with the live video displayed in the detected
areas that are closer to the viewer than any virtual object.
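A minimal sketch of the color-statistics test, assuming an independent Gaussian model per pixel; the function names, the threshold, and the diagonal-covariance simplification are our illustrative choices, not the original implementation:

```python
import numpy as np

def background_stats(frames):
    """Per-pixel color mean and variance from frames of the empty scene,
    gathered before operation starts."""
    stack = np.stack(frames).astype(np.float64)        # (N, H, W, 3)
    return stack.mean(axis=0), stack.var(axis=0) + 1e-6

def foreground_mask(frame, mean, var, bbox, thresh=9.0):
    """Flag pixels statistically far from the background model. Only
    the bounding box of the rendered graphics is examined, matching
    the restricted computation described above."""
    x0, y0, x1, y1 = bbox
    roi = frame[y0:y1, x0:x1].astype(np.float64)
    # Squared distance per channel, normalized by variance and summed
    # (a Mahalanobis distance with diagonal covariance).
    d2 = ((roi - mean[y0:y1, x0:x1]) ** 2 / var[y0:y1, x0:x1]).sum(axis=-1)
    mask = np.zeros(frame.shape[:2], dtype=bool)
    mask[y0:y1, x0:x1] = d2 > thresh
    return mask
```

Pixels flagged by the mask would then show the live video wherever the Phantom plane is nearer to the viewer than the virtual geometry at that pixel.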