Stereoscopy (also called stereoscopics or 3D imaging) is a technique for creating or enhancing the illusion of depth in an image by means of stereopsis for binocular vision. The word stereoscopy derives from the Greek "στερεός" (stereos), "firm, solid" + "σκοπέω" (skopeō), "to look", "to see".
Most stereoscopic methods present two offset images separately to the left and right eye of the viewer. These two-dimensional images are then combined in the brain to give the perception of 3D depth. This technique is distinguished from 3D displays that display an image in three full dimensions, allowing the observer to increase information about the 3-dimensional objects being displayed by head and eye movements.
Stereoscopy creates the illusion of three-dimensional depth from given two-dimensional images. Human vision, including the perception of depth, is a complex process which only begins with the acquisition of visual information taken in through the eyes; much processing ensues within the brain, as it strives to make intelligent and meaningful sense of the raw information provided. One of the very important visual functions that occur within the brain as it interprets what the eyes see is that of assessing the relative distances of various objects from the viewer, and the depth dimension of those same perceived objects. The brain makes use of a number of cues to determine relative distances and depth in a perceived scene, including:
- Stereopsis
- Accommodation of the eye
- Overlapping of one object by another
- Subtended visual angle of an object of known size
- Linear perspective (convergence of parallel edges)
- Vertical position (objects higher in the scene generally tend to be perceived as further away)
- Haze, desaturation, and a shift to bluishness
- Change in size of textured pattern detail
(All the above cues, with the exception of the first two, are present in traditional two-dimensional images such as paintings, photographs, and television.)
Stereoscopy is the production of the illusion of depth in a photograph, movie, or other two-dimensional image by presenting a slightly different image to each eye, thereby adding the first of these cues (stereopsis) as well. The two offset 2D images are then combined in the brain to give the perception of 3D depth. Because all points in the image focus at the same plane regardless of their depth in the original scene, the second cue, focus, is still not duplicated, and the illusion of depth is therefore incomplete. Stereoscopy also has two main effects that are unnatural for human vision: first, the mismatch between convergence and accommodation, caused by the difference between an object's perceived position in front of or behind the display or screen and the real origin of that light; and second, possible crosstalk between the eyes, caused by imperfect image separation in some methods.
Although the term "3D" is ubiquitously used, it is also important to note that the presentation of dual 2D images is distinctly different from displaying an image in three full dimensions. The most notable difference is that, in the case of "3D" displays, the observer's head and eye movement will not increase information about the 3-dimensional objects being displayed. Holographic displays and volumetric displays are examples of displays that do not have this limitation. Similar to the technology of sound reproduction, in which it is not possible to recreate a full 3-dimensional sound field merely with two stereophonic speakers, it is likewise an overstatement of capability to refer to dual 2D images as being "3D". The accurate term "stereoscopic" is more cumbersome than the common misnomer "3D", which has been entrenched after many decades of unquestioned misuse. Although most stereoscopic displays do not qualify as real 3D displays, all real 3D displays are also stereoscopic displays because they meet the lower criteria as well.
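The convergence and accommodation mismatch can be illustrated with simple geometry. The following is a minimal sketch, not taken from the source: it assumes a viewer looking straight at a flat screen, an eye separation of 63 mm, and a hypothetical parameter `parallax_mm` giving the on-screen offset of the right-eye image point relative to the left-eye image point. The function names are illustrative only.

```python
import math

def perceived_distance(parallax_mm, viewing_distance_mm, eye_separation_mm=63.0):
    """Distance from the viewer at which the two lines of sight intersect.

    parallax_mm is the on-screen offset of the right-eye image point
    relative to the left-eye image point (positive = uncrossed).
    """
    if parallax_mm >= eye_separation_mm:
        # Lines of sight are parallel (or diverging): the point is at infinity.
        return math.inf
    return eye_separation_mm * viewing_distance_mm / (eye_separation_mm - parallax_mm)

def vergence_accommodation_mismatch(parallax_mm, viewing_distance_mm, eye_separation_mm=63.0):
    """Difference, in diopters, between where the eyes focus and where they converge."""
    z = perceived_distance(parallax_mm, viewing_distance_mm, eye_separation_mm)
    accommodation = 1000.0 / viewing_distance_mm      # eyes focus on the screen
    vergence = 0.0 if math.isinf(z) else 1000.0 / z   # eyes converge on the fused point
    return abs(accommodation - vergence)
```

With zero parallax the point appears on the screen and the mismatch is zero; negative (crossed) parallax places the point in front of the screen, and the mismatch grows as the point moves away from the screen plane.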
Wheatstone originally used his stereoscope (a rather bulky device) with drawings because photography was not yet available, yet his original paper seems to foresee the development of a realistic imaging method:
For the purposes of illustration I have employed only outline figures, for had either shading or colouring been introduced it might be supposed that the effect was wholly or in part due to these circumstances, whereas by leaving them out of consideration no room is left to doubt that the entire effect of relief is owing to the simultaneous perception of the two monocular projections, one on each retina. But if it be required to obtain the most faithful resemblances of real objects, shadowing and colouring may properly be employed to heighten the effects. Careful attention would enable an artist to draw and paint the two component pictures, so as to present to the mind of the observer, in the resultant perception, perfect identity with the object represented. Flowers, crystals, busts, vases, instruments of various kinds, &c., might thus be represented so as not to be distinguished by sight from the real objects themselves.
Stereoscopy is used in photogrammetry and also for entertainment through the production of stereograms. It is useful for viewing images rendered from large multi-dimensional data sets such as those produced by experiments. An early patent for 3D imaging in cinema and television was granted to physicist Theodor V. Ionescu in 1936. Modern industrial three-dimensional photography may use 3D scanners to detect and record three-dimensional information. The three-dimensional depth information can be reconstructed from two images by a computer that matches corresponding pixels in the left and right images. Solving this correspondence problem, a topic in the field of computer vision, aims to create meaningful depth information from two images.
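As a rough illustration of pixel correspondence, the sketch below searches along one image row for the best match using a sum of absolute differences, then converts the resulting disparity to depth. It assumes two rectified, parallel cameras, for which depth equals focal length times baseline divided by disparity. The function names, the window size, and the search range are hypothetical choices for illustration, not a method described in the source.

```python
def match_disparity(left_row, right_row, x, half_window=1, max_disparity=8):
    """Disparity of pixel x in left_row, found by a sum-of-absolute-differences
    search along the same row of right_row."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        cost = 0
        valid = True
        for k in range(-half_window, half_window + 1):
            xl, xr = x + k, x + k - d
            if not (0 <= xl < len(left_row) and 0 <= xr < len(right_row)):
                valid = False  # window falls outside the row for this shift
                break
            cost += abs(left_row[xl] - right_row[xr])
        if valid and cost < best_cost:
            best_cost, best_d = cost, d
    return best_d

def depth_from_disparity(disparity_px, focal_length_px, baseline_mm):
    """Depth in mm for rectified parallel cameras: Z = f * B / d."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity corresponds to a point at infinity
    return focal_length_px * baseline_mm / disparity_px

# A small feature (9, 5, 7) appears 3 pixels further left in the right image.
left_row = [0, 0, 0, 0, 0, 9, 5, 7, 0, 0]
right_row = [0, 0, 9, 5, 7, 0, 0, 0, 0, 0]
d = match_disparity(left_row, right_row, x=6)
```

Practical systems use more robust matching costs and smoothness constraints; this sketch only shows the principle that larger disparity means a nearer point.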
Anatomically, there are 3 levels of binocular vision required to view stereo images:
- Simultaneous perception
- Fusion (binocular 'single' vision)
- Stereopsis
These functions develop in early childhood. In some people, strabismus disrupts the development of stereopsis; however, orthoptic treatment can be used to improve binocular vision. A person's stereoacuity determines the minimum image disparity they can perceive as depth. It is believed that approximately 12% of people are unable to properly see 3D images, due to a variety of medical conditions. According to another experiment, up to 30% of people have very weak stereoscopic vision, which prevents them from perceiving depth based on stereo disparity. For them, the immersion effects of stereo are nullified or greatly decreased.
Traditional stereoscopic photography consists of creating a 3D illusion starting from a pair of 2D images, a stereogram. The easiest way to enhance depth perception in the brain is to provide the eyes of the viewer with two different images, representing two perspectives of the same object, with a minor deviation equal or nearly equal to the perspectives that both eyes naturally receive in binocular vision.
To avoid eyestrain and distortion, each of the two 2D images should be presented to the viewer so that any object at infinite distance is perceived by the eye as being straight ahead, the viewer's eyes being neither crossed nor diverging. When the picture contains no object at infinite distance, such as a horizon or a cloud, the pictures should be spaced correspondingly closer together.
The principal advantages of side-by-side viewers are the lack of diminution of brightness, allowing the presentation of images at very high resolution and in full-spectrum color; simplicity of creation; and the fact that little or no additional image processing is required. Under some circumstances, such as when a pair of images is presented for freeviewing, no device or additional optical equipment is needed.
The principal disadvantage of side-by-side viewers is that large image displays are not practical and resolution is limited by the lesser of the display medium or human eye. This is because as the dimensions of an image are increased, either the viewing apparatus or viewer themselves must move proportionately further away from it in order to view it comfortably. Moving closer to an image in order to see more detail would only be possible with viewing equipment that adjusted to the difference.
Freeviewing is viewing a side-by-side image pair without using a viewing device.
- The parallel viewing method uses an image pair with the left-eye image on the left and the right-eye image on the right. The fused three-dimensional image appears larger and more distant than the two actual images, making it possible to convincingly simulate a life-size scene. The viewer attempts to look through the images with the eyes substantially parallel, as if looking at the actual scene. This can be difficult with normal vision because eye focus and binocular convergence are habitually coordinated. One approach to decoupling the two functions is to view the image pair extremely close up with completely relaxed eyes, making no attempt to focus clearly but simply achieving comfortable stereoscopic fusion of the two blurry images by the "look-through" approach, and only then exerting the effort to focus them more clearly, increasing the viewing distance as necessary. Regardless of the approach used or the image medium, for comfortable viewing and stereoscopic accuracy the size and spacing of the images should be such that the corresponding points of very distant objects in the scene are separated by the same distance as the viewer's eyes, but not more; the average interocular distance is about 63 mm. Viewing much more widely separated images is possible, but because the eyes never diverge in normal use it usually requires some previous training and tends to cause eye strain.
- The cross-eyed viewing method swaps the left and right eye images so that they will be correctly seen cross-eyed, the left eye viewing the image on the right and vice-versa. The fused three-dimensional image appears to be smaller and closer than the actual images, so that large objects and scenes appear miniaturized. This method is usually easier for freeviewing novices. As an aid to fusion, a fingertip can be placed just below the division between the two images, then slowly brought straight toward the viewer's eyes, keeping the eyes directed at the fingertip; at a certain distance, a fused three-dimensional image should seem to be hovering just above the finger. Alternatively, a piece of paper with a small opening cut into it can be used in a similar manner; when correctly positioned between the image pair and the viewer's eyes, it will seem to frame a small three-dimensional image.
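The spacing rule for parallel viewing, and the stronger convergence needed for cross-eyed viewing, can both be expressed as a vergence angle. The sketch below is an illustration under simplifying assumptions (eyes level with and centered on the image pair, 63 mm eye separation); the function name and the signed-separation convention are hypothetical, not from the source.

```python
import math

def vergence_angle_deg(separation_mm, viewing_distance_mm, eye_separation_mm=63.0):
    """Angle between the two lines of sight when fusing a pair of corresponding points.

    separation_mm is the position of the right-eye point minus the position
    of the left-eye point: positive for parallel viewing, negative for
    cross-eyed viewing (where the images are swapped).
    Returns 0 for parallel eyes, a positive angle for converging eyes,
    and a negative angle when the eyes would have to diverge.
    """
    half_offset = (eye_separation_mm - separation_mm) / 2.0
    return math.degrees(2.0 * math.atan(half_offset / viewing_distance_mm))

# Infinity points spaced exactly at the eye separation: eyes parallel.
parallel = vergence_angle_deg(63.0, 400.0)
# Spaced wider than the eyes: the eyes must diverge (negative angle).
too_wide = vergence_angle_deg(80.0, 400.0)
# Cross-eyed viewing of the same pair requires noticeable convergence.
crossed = vergence_angle_deg(-80.0, 400.0)
```

A negative result for the most distant points in a parallel-view pair indicates that the images are spaced too widely for comfortable viewing and should be reduced in size or moved closer together.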
Prismatic, self-masking glasses are now being used by some cross-eyed-view advocates. These reduce the degree of convergence required and allow large images to be displayed. However, any viewing aid that uses prisms, mirrors or lenses to assist fusion or focus is simply a type of stereoscope, excluded by the customary definition of freeviewing.
Stereoscopically fusing two separate images without the aid of mirrors or prisms while simultaneously keeping them in sharp focus without the aid of suitable viewing lenses inevitably requires an unnatural combination of eye vergence and accommodation. Simple freeviewing therefore cannot accurately reproduce the physiological depth cues of the real-world viewing experience. Different individuals may experience differing degrees of ease and comfort in achieving fusion and good focus, as well as differing tendencies to eye fatigue or strain.
An autostereogram is a single-image stereogram (SIS), designed to create the visual illusion of a three-dimensional (3D) scene within the human brain from an external two-dimensional image. In order to perceive 3D shapes in these autostereograms, one must overcome the normally automatic coordination between focusing and vergence.
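One simple way to build a random-dot autostereogram is to repeat a random pattern horizontally and shorten the repeat distance wherever the hidden surface should appear nearer. The sketch below illustrates that idea under stated assumptions: `depth_map` is a hypothetical list of rows of small non-negative integers, `period` is the repeat distance in pixels for the background, and the output is intended for parallel ("look-through") viewing. It is an illustrative sketch, not the algorithm of any particular published autostereogram generator.

```python
import random

def make_autostereogram(depth_map, period=8, levels=2, seed=0):
    """Return rows of pixel values whose horizontal repeat distance
    is `period - depth` at each position."""
    rng = random.Random(seed)
    image = []
    for depth_row in depth_map:
        row = []
        for x, depth in enumerate(depth_row):
            if not 0 <= depth < period:
                raise ValueError("depth must satisfy 0 <= depth < period")
            shift = period - depth
            if x < shift:
                # No earlier pixel to copy yet: seed the pattern randomly.
                row.append(rng.randrange(levels))
            else:
                # Copy the pixel one repeat distance to the left.
                row.append(row[x - shift])
        image.append(row)
    return image

# One row: flat background with a raised strip in the middle.
depth = [[0] * 12 + [2] * 8 + [0] * 4]
row = make_autostereogram(depth, period=8)[0]
```

Because the repeat distance is shorter inside the raised strip, the eyes must converge slightly more there to fuse neighboring repeats, which is perceived as a nearer surface.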
Stereoscope and stereographic cards
The stereoscope is essentially an instrument in which two photographs of the same object, taken from slightly different angles, are simultaneously presented, one to each eye. A simple stereoscope is limited in the size of the image that may be used. A more complex stereoscope uses a pair of horizontal periscope-like devices, allowing the use of larger images that can present more detailed information in a wider field of view.
Some stereoscopes are designed for viewing transparent photographs on film or glass, known as transparencies or diapositives and commonly called slides. Some of the earliest stereoscope views, issued in the 1850s, were on glass. In the early 20th century, 45x107 mm and 6x13 cm glass slides were common formats for amateur stereo photography, especially in Europe. In later years, several film-based formats were in use. The best-known formats for commercially issued stereo views on film are Tru-Vue, introduced in 1931, and View-Master, introduced in 1939 and still in production. For amateur stereo slides, the Stereo Realist format, introduced in 1947, is by far the most common.
With a head-mounted display, the user typically wears a helmet or glasses with two small LCD or OLED displays with magnifying lenses, one for each eye. The technology can be used to show stereo films, images or games, but it can also be used to create a virtual display. Head-mounted displays may also be coupled with head-tracking devices, allowing the user to "look around" the virtual world by moving their head, eliminating the need for a separate controller. Performing this update quickly enough to avoid inducing nausea in the user requires a great amount of computer image processing. If six-axis position sensing (direction and position) is used, the wearer may move about within the limitations of the equipment used. Owing to rapid advancements in computer graphics and the continuing miniaturization of video and other equipment, these devices are beginning to become available at more reasonable cost.
Head-mounted or wearable glasses may be used to view a see-through image imposed upon the real world view, creating what is called augmented reality. This is done by reflecting the video images through partially reflective mirrors. The real world view is seen through the mirrors' reflective surface. Experimental systems have been used for gaming, where virtual opponents may peek from real windows as a player moves about. This type of system is expected to have wide application in the maintenance of complex systems, as it can give a technician what is effectively "x-ray vision" by combining computer graphics rendering of hidden elements with the technician's natural vision. Additionally, technical data and schematic diagrams may be delivered to this same equipment, eliminating the need to obtain and carry bulky paper documents.
Virtual retinal displays
A virtual retinal display (VRD), also known as a retinal scan display (RSD) or retinal projector (RP), not to be confused with a "Retina Display", is a display technology that draws a raster image (like a television picture) directly onto the retina of the eye. The user sees what appears to be a conventional display floating in space in front of them. For true stereoscopy, each eye must be provided with its own discrete display. To produce a virtual display that occupies a usefully large visual angle but does not involve the use of relatively large lenses or mirrors, the light source must be very close to the eye. A contact lens incorporating one or more semiconductor light sources is the form most commonly proposed. As of 2013, the inclusion of suitable light-beam-scanning means in a contact lens is still very problematic, as is the alternative of embedding a reasonably transparent array of hundreds of thousands (or millions, for HD resolution) of accurately aligned sources of collimated light.
There are two categories of 3D viewer technology, active and passive. Active viewers have electronics which interact with a display. Passive viewers filter constant streams of binocular input to the appropriate eye.
A shutter system works by presenting the image intended for the left eye while blocking the right eye's view, then presenting the right-eye image while blocking the left eye, and repeating this so rapidly that the interruptions do not interfere with the perceived fusion of the two images into a single 3D image. It generally uses liquid crystal shutter glasses. Each eye's glass contains a liquid crystal layer which has the property of becoming dark when voltage is applied, being otherwise transparent. The glasses are controlled by a timing signal that allows the glasses to alternately darken over one eye, and then the other, in synchronization with the refresh rate of the screen.
To present stereoscopic pictures, two images are projected superimposed onto the same screen through polarizing filters or presented on a display with polarizing filters. For projection, a silver screen is used so that polarization is preserved. The viewer wears low-cost eyeglasses which also contain a pair of oppositely polarized filters. As each filter only passes light which is similarly polarized and blocks the oppositely polarized light, each eye only sees one of the images, and the effect is achieved.
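The alternation can be sketched as a simple frame schedule. This is an illustrative sketch only: it assumes the display alternates eyes on every refresh and starts with the left eye, and the function names are hypothetical.

```python
def per_eye_rate(refresh_hz):
    """Each eye sees every second frame, so its rate is half the refresh rate."""
    return refresh_hz / 2

def shutter_schedule(refresh_hz, n_frames):
    """Return (start_time_ms, eye_shown, eye_blocked) for each displayed frame."""
    frame_time_ms = 1000.0 / refresh_hz
    schedule = []
    for i in range(n_frames):
        shown = "left" if i % 2 == 0 else "right"   # assumption: left eye first
        blocked = "right" if shown == "left" else "left"
        schedule.append((round(i * frame_time_ms, 3), shown, blocked))
    return schedule

schedule = shutter_schedule(120, 4)
```

Under these assumptions, a 120 Hz display delivers 60 images per second to each eye.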
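How well the filters separate the two images can be estimated for one particular case. The sketch below assumes ideal linear polarizers with the two images polarized at right angles to each other (an assumption; the text above does not specify the polarization type) and applies Malus's law to show how tilting the glasses lets part of the wrong image through. The function name is hypothetical.

```python
import math

def polarized_channel_split(tilt_deg):
    """Fractions of the intended and the unintended image passed by one
    eyeglass filter when the glasses are rotated by tilt_deg.

    Assumes ideal linear polarizers (Malus's law): the intended image passes
    with cos^2(tilt), and the image meant for the other eye leaks with sin^2(tilt).
    """
    t = math.radians(tilt_deg)
    intended = math.cos(t) ** 2
    leaked = math.sin(t) ** 2
    return intended, leaked

upright = polarized_channel_split(0)
tilted = polarized_channel_split(45)
```

With the head upright each eye sees only its own image; at a 45 degree tilt both images pass equally and the stereoscopic separation is lost.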
Interference filter systems
This technique uses specific wavelengths of red, green, and blue for the right eye, and different wavelengths of red, green, and blue for the left eye. Eyeglasses which filter out the very specific wavelengths allow the wearer to see a full color 3D image. It is also known as spectral comb filtering, wavelength multiplex visualization, or super-anaglyph. Dolby 3D uses this principle. The Omega 3D/Panavision 3D system has also used an improved version of this technology. In June 2012 the Omega 3D/Panavision 3D system was discontinued by DPVO Theatrical, who marketed it on behalf of Panavision, citing "challenging global economic and 3D market conditions". Although DPVO dissolved its business operations, Omega Optical continues promoting and selling 3D systems to non-theatrical markets. Omega Optical's 3D system contains projection filters and 3D glasses. In addition to the passive stereoscopic 3D system, Omega Optical has produced enhanced anaglyph 3D glasses. Omega's red/cyan anaglyph glasses use complex metal oxide thin film coatings and high quality annealed glass optics.
Color anaglyph systems
Anaglyph 3D is the name given to the stereoscopic 3D effect achieved by means of encoding each eye's image using filters of different (usually chromatically opposite) colors, typically red and cyan. Anaglyph 3D images contain two differently filtered colored images, one for each eye. When viewed through the "color-coded" "anaglyph glasses", each of the two images reaches one eye, revealing an integrated stereoscopic image. The visual cortex of the brain fuses this into perception of a three dimensional scene or composition.
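A basic red/cyan anaglyph can be composed by taking the red channel from one image and the green and blue channels from the other. The sketch below assumes the red filter is over the left eye (a common convention, but an assumption here) and represents images as hypothetical nested lists of (r, g, b) tuples rather than any particular image library's types.

```python
def make_anaglyph(left, right):
    """Combine two equally sized RGB images into one red/cyan anaglyph.

    The red channel comes from the left-eye image; green and blue
    (together, cyan) come from the right-eye image.
    """
    if len(left) != len(right) or any(len(a) != len(b) for a, b in zip(left, right)):
        raise ValueError("left and right images must have the same dimensions")
    return [
        [(l[0], r[1], r[2]) for l, r in zip(left_row, right_row)]
        for left_row, right_row in zip(left, right)
    ]

left = [[(10, 20, 30), (200, 0, 0)]]
right = [[(40, 50, 60), (0, 100, 150)]]
anaglyph = make_anaglyph(left, right)
```

More refined methods adjust the channel mixing to reduce color rivalry; this sketch shows only the simplest channel substitution.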
The ChromaDepth procedure of American Paper Optics is based on the fact that a prism separates colors by varying degrees. The ChromaDepth eyeglasses contain special view foils, which consist of microscopically small prisms. These displace the image by an amount that depends on its color. If a prism foil is used over one eye but not the other, the two pictures seen are, depending on color, more or less widely separated. The brain produces the spatial impression from this difference. The main advantage of this technology is that ChromaDepth pictures can also be viewed without eyeglasses (thus two-dimensionally) without problems, unlike two-color anaglyphs. However, the choice of colors is limited, since the colors carry the depth information of the picture: if the color of an object is changed, its observed distance changes as well.
The Pulfrich effect is based on the phenomenon of the human eye processing images more slowly when there is less light, as when looking through a dark lens. Because the Pulfrich effect depends on motion in a particular direction to instigate the illusion of depth, it is not useful as a general stereoscopic technique. For example, it cannot be used to show a stationary object apparently extending into or out of the screen; similarly, objects moving vertically will not be seen as moving in depth. Incidental movement of objects will create spurious artifacts, and these incidental effects will be seen as artificial depth not related to actual depth in the scene.
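One common way to reason about the effect, offered here as a hedged sketch rather than a model given in the source, is that the darkened eye sees a moving object slightly late, so the object appears displaced along its direction of motion by speed times delay. Only the horizontal part of that displacement acts as binocular disparity. The delay value and the function name are hypothetical.

```python
def pulfrich_offsets(velocity_x_px_s, velocity_y_px_s, delay_ms):
    """Apparent displacement, in pixels, between what the two eyes see.

    Returns (horizontal_offset, vertical_offset). The horizontal offset
    behaves like stereoscopic disparity; the vertical offset does not
    produce a depth impression.
    """
    delay_s = delay_ms / 1000.0
    return velocity_x_px_s * delay_s, velocity_y_px_s * delay_s

moving_sideways = pulfrich_offsets(300, 0, 10)
stationary = pulfrich_offsets(0, 0, 10)
moving_up = pulfrich_offsets(0, 300, 10)
```

The stationary and vertically moving cases produce no horizontal offset, which matches the limitations described above.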
Stereoscopic viewing is achieved by placing the two images of a pair one above the other. Special viewers are made for the over/under format that tilt the right eye's line of sight slightly up and the left eye's slightly down. The most common one with mirrors is the View Magic. Another with prismatic glasses is the KMQ viewer. A recent usage of this technique is the openKMQ project.
Other display methods without viewers
Autostereoscopic display technologies use optical components in the display, rather than worn by the user, to enable each eye to see a different image. Because headgear is not required, it is also called "glasses-free 3D". The optics split the images directionally into the viewer's eyes, so the stereoscopic effect is achieved only from a limited range of head positions. Automultiscopic displays provide multiple views of the same scene, rather than just two. Each view is visible from a different range of positions in front of the display. This allows the viewer to move left-right in front of the display and see the correct view from any position. The technology includes two broad classes of displays: those that use head-tracking to ensure that each of the viewer's two eyes sees a different image on the screen, and those that display multiple views so that the display does not need to know where the viewers' eyes are directed. Examples of autostereoscopic display technologies include lenticular lens, parallax barrier, volumetric display, holography and light field displays.
Laser holography, in its original "pure" form of the photographic transmission hologram, is the only technology yet created which can reproduce an object or scene with such complete realism that the reproduction is visually indistinguishable from the original, given the original lighting conditions. It creates a light field identical to that which emanated from the original scene, with parallax about all axes and a very wide viewing angle. The eye differentially focuses objects at different distances and subject detail is preserved down to the microscopic level. The effect is exactly like looking through a window. Unfortunately, this "pure" form requires the subject to be laser-lit and completely motionless—to within a minor fraction of the wavelength of light—during the photographic exposure, and laser light must be used to properly view the results. Most people have never seen a laser-lit transmission hologram. The types of holograms commonly encountered have seriously compromised image quality so that ordinary white light can be used for viewing, and non-holographic intermediate imaging processes are almost always resorted to, as an alternative to using powerful and hazardous pulsed lasers, when living subjects are photographed.
Although the original photographic processes have proven impractical for general use, the combination of computer-generated holograms (CGH) and optoelectronic holographic displays, both under development for many years, has the potential to transform the half-century-old pipe dream of holographic 3D television into a reality; so far, however, the large amount of calculation required to generate just one detailed hologram, and the huge bandwidth required to transmit a stream of them, have confined this technology to the research laboratory.
Volumetric displays use some physical mechanism to display points of light within a volume. Such displays use voxels instead of pixels. Volumetric displays include multiplanar displays, which have multiple display planes stacked up, and rotating panel displays, where a rotating panel sweeps out a volume.
Other technologies have been developed to project light dots in the air above a device. An infrared laser is focused on the destination in space, generating a small bubble of plasma which emits visible light.
Integral imaging is an autostereoscopic or multiscopic 3D display, meaning that it displays a 3D image without the use of special glasses on the part of the viewer. It achieves this by placing an array of microlenses (similar to a lenticular lens) in front of the image, where each lens looks different depending on viewing angle. Thus rather than displaying a 2D image that looks the same from every direction, it reproduces a 4D light field, creating stereo images that exhibit parallax when the viewer moves.
Wiggle stereoscopy is an image display technique achieved by quickly alternating display of left and right sides of a stereogram. Found in
Stereo photography techniques
It is necessary to take two photographs from different horizontal positions to get a true stereoscopic image pair. This can be done with two separate side-by-side cameras; with one camera moved from one position to another between exposures; with one camera and a single exposure by means of an attached mirror or prism arrangement that presents a stereoscopic image pair to the camera lens; or with a stereo camera incorporating two or more side-by-side lenses.
As part of a wider 3-D craze that swept the US in the 1950s, stereoscopic photography enjoyed a surge of popularity and a new generation of stereoscopic cameras appeared on the market. More compact and convenient than their pre-World War II predecessors, they adopted the increasingly popular 135 film (35 mm) format that allowed the use of Kodachrome color film, which produced color transparencies ("slides") instead of prints on paper. The relative novelty of Kodachrome's vivid colors and the realism of 3-D were each attractive individually, but the astonishingly lifelike effect of the two combined proved irresistible to many consumers. The Stereo Realist camera, introduced in 1947, was the pioneer. Already advertised with celebrity endorsements and well-established when the surge arrived in the 1950s, it was widely copied but maintained its lead. Its 5P (five film perforations per image) format was adopted as a standard by most of its competitors, including Kodak.
The new cameras were marketed with corresponding two-lensed Realist-format slide viewers, which typically had a built-in light source and adjustable optics. With only these two items the owner could capture, relive and share multicolored and stereoscopically preserved memories. For group viewing and perhaps even greater realism, a polarized stereoscopic slide projector and silver screen could be added to the system. The popularity of stereoscopic photography waned along with the 1950s 3-D fad, but not so quickly or completely. Subsequent decades found new users replenishing the ranks of loyal devotees, and even today, despite the general transition from film to digital and from slide viewing and projection to slide scanning and video display, some of this sturdy equipment is still in use by a small core of enthusiasts of all ages.
The 1980s saw a minor revival of stereoscopic photography when several point-and-shoot stereo cameras were introduced. Most of these cameras suffered from poor optics and plastic construction, and were designed to produce lenticular prints, a format which never gained wide acceptance, so they never gained the popularity of the 1950s stereo cameras.
The beginning of the 21st century marked the coming of the age of digital photography. Stereo lenses were introduced which could turn an ordinary film camera into a stereo camera by using a special double lens to take two images and direct them through a single lens to capture them side by side on the film. Although current digital stereo cameras cost hundreds of dollars, cheaper models also exist, for example those produced by the company Loreo. It is also possible to create a twin camera rig by mounting two cameras on a bracket, spaced a short distance apart, together with a "shepherd" device that synchronizes the shutter and flash so that both cameras take pictures at the same time. Newer cameras are even being used to shoot "step video" 3D slide shows with many pictures, almost like a 3D motion picture if viewed properly. A modern camera can take ten pictures per second, with images that greatly exceed HDTV resolution.
If anything is in motion within the field of view, it is necessary to take both images at once, either through use of a specialized two-lens camera, or by using two identical cameras, operated as close as possible to the same moment.
A single camera can also be used if the subject remains perfectly still (such as an object in a museum display). Two exposures are required. The camera can be moved on a sliding bar for offset, or with practice, the photographer can simply shift the camera while holding it straight and level. This method of taking stereo photos is sometimes referred to as the "Cha-Cha" or "Rock and Roll" method. It is also sometimes referred to as the "astronaut shuffle" because it was used to take stereo pictures on the surface of the moon using normal monoscopic equipment.
For the most natural-looking stereo, most stereographers move the camera about 65 mm, roughly the distance between the eyes, but some experiment with other distances. A good rule of thumb is to shift sideways 1/30th of the distance to the closest subject for 'side by side' display, or just 1/60th if the image is also to be used for color anaglyph or anachrome image display. For example, when enhanced depth beyond natural vision is desired and a photo of a person in front of a house is being taken, and the person is thirty feet away, then the camera should be moved 1 foot between shots.
The stereo effect is not significantly diminished by slight pan or rotation between images. In fact, slight rotation inwards (also called 'toe-in') can be beneficial. Bear in mind that both images should show the same objects in the scene (just from different angles): if a tree is on the edge of one image but out of view in the other image, then it will appear in a ghostly, semi-transparent way to the viewer, which is distracting and uncomfortable. Therefore, either the images are cropped so they completely overlap, or the cameras are 'toed-in' so that the images completely overlap without having to discard any part of them. However, too much 'toe-in' can cause 'keystoning' and eye strain.
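The rule of thumb can be written as a small helper. This is a minimal sketch; the function name and the `display` labels are hypothetical, and the result is in whatever unit the distance is given in.

```python
def recommended_baseline(nearest_distance, display="side_by_side"):
    """Camera shift between exposures from the 1/30 (side by side)
    or 1/60 (anaglyph or anachrome) rule of thumb."""
    divisors = {"side_by_side": 30, "anaglyph": 60}
    if display not in divisors:
        raise ValueError("display must be 'side_by_side' or 'anaglyph'")
    return nearest_distance / divisors[display]

# Person thirty feet away, side-by-side display: shift the camera one foot.
shift_feet = recommended_baseline(30)
# Same scene intended for anaglyph display: half that shift.
shift_feet_anaglyph = recommended_baseline(30, display="anaglyph")
```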
Digital stereo bases (baselines)
Consumer (non-professional) 3D digital cameras, used for both video and stills, are available with a range of stereo bases (the distance between the two camera lenses):
- ? mm Inlife-Handnet HDC-810
- 10 mm Panasonic 3D Lumix H-FT012 lens (for the GH2, GF2, GF3, GF5, GF6 cams and also for the hybrid W8 cam).
- 12 mm DXG-5D8 cam and the clones Medion 3D and Praktica DMMC-3D.
- 20 mm Sony Blogie 3D.
- 23 mm Loreo 3D Macro lens.
- 25 mm LG Optimus 3D, LG Optimus 3D MAX (smartphones) and the Cyclopital3D close-up macro adapter (for the W1 and W3 Fujifilm cams).
- 28 mm Sharp Aquos SH80F and SHI12 (smartphones) and the Toshiba Camileo z100 camcorder.
- 30 mm Panasonic 3D1 camera.
- 32 mm HTC EVO 3D smartphone.
- 35 mm JVC TD1, DXG-5G2V, VTech Kidizoom 3D, GoPro HD Hero kit 3D, Nintendo 3D, Vivitar 790 HD (only for anaglyph stills and video) camcorders.
- 40 mm Aiptek I2 (also the Viewsonic clone), Aiptek I2P Aiptek IS2 and Aiptek IH3 3D cams.
- 50 mm Loreo for full frame or non digital cams, and the 3D FUN cam of 3dInlife (also the clones Phenix PHC1, Phenix SDC821 and Rollei Powerflex 3D).
- 55 mm SVP dc-3D-80 cam (parallel & anagliph, stills & video).
- 60 mm Vivitar 3D cam (only for anagliph pictures).
- 65 mm Takara Tomy 3D ShotCam.
- 75 mm Fujifilm W3 cam.
- 77 mm Fujifilm W1 cam.
- 88 mm Loreo 3D lens for digital cams.
- 140 mm Cyclopital3D base extender for the JVC TD1 and Sony TD10.
- 200 mm Cyclopital3D base extender for the Panasonic AG-3DA1.
- 225 mm Cyclopital3D base extender for the Fujifilm W1 and W3 cams.
Base line selection
For general purpose stereo photography, where the goal is to duplicate natural human vision and give a visual impression as close as possible to actually being there, the correct baseline (distance between where the right and left images are taken) would be the same as the distance between the eyes. When images taken with such a baseline are viewed using a viewing method that duplicates the conditions under which the picture is taken then the result would be an image pretty much the same as what would be seen at the site the photo was taken. This could be described as "ortho stereo."
An example would be the Realist format that was so popular in the late 1940s to mid-1950s and is still being used by some today. When these images are viewed using high quality viewers, or seen with a properly set up projector, the impression is, indeed, very close to being at the site of photography.
The baseline used in such cases will be about 50mm to 80mm. This is what is generally referred to as a "normal" baseline, used in most stereo photography. There are, however, situations where it might be desirable to use a longer or shorter baseline. The factors to consider include the viewing method to be used and the goal in taking the picture. Note that the concept of baseline also applies to other branches of stereography, such as stereo drawings and computer generated stereo images, but it involves the point of view chosen rather than actual physical separation of cameras or lenses.
Longer base line for distant objects "Hyper Stereo"
If a stereo picture is taken of a large, distant object such as a mountain or a large building using a normal base, it will appear to be flat. This is in keeping with normal human vision: it would look flat if one were actually there. But if the object looks flat, there seems to be little point in taking a stereo picture, as it will simply seem to be behind a stereo window, with no depth in the scene itself, much like looking at a flat photograph from a distance.
One way of dealing with this situation is to include a foreground object to add depth interest and enhance the feeling of "being there", and this is the advice commonly given to novice stereographers. Caution must be used, however, to ensure that the foreground object is not too prominent, and appears to be a natural part of the scene, otherwise it will seem to become the subject with the distant object being merely the background. In cases like this, if the picture is just one of a series with other pictures showing more dramatic depth, it might make sense just to leave it flat, but behind a window.
For making stereo images featuring only a distant object (e.g., a mountain with foothills), the camera positions can be separated by a larger distance (called the "interaxial" or stereo base, often mistakenly called "interocular") than the adult human norm of 62–65mm. This will effectively render the captured image as though it was seen by a giant, and thus will enhance the depth perception of these distant objects, and reduce the apparent scale of the scene proportionately. However, in this case care must be taken not to bring objects in the close foreground too close to the viewer, as they will show excessive parallax and can complicate stereo window adjustment.
There are two main ways to accomplish this. One is to use two cameras separated by the required distance, the other is to shift a single camera the required distance between shots.
The shift method has been used with cameras such as the Stereo Realist to take hypers, either by taking two pairs and selecting the best frames, or by alternately capping each lens and recocking the shutter.
It is also possible to take hyperstereo pictures from an airplane using an ordinary single lens camera. One must be careful, however, about movement of clouds between shots.
It has even been suggested that a version of hyperstereo could be used to help pilots fly planes.
In such situations, where an ortho stereo viewing method is used, a common rule of thumb is the 1:30 rule. This means that the baseline will be equal to 1/30 of the distance to the nearest object included in the photograph.
This technique can be applied to 3D imaging of the Moon: one picture is taken at moonrise, the other at moonset, as the face of the Moon is centered towards the center of the Earth and the diurnal rotation carries the photographer around the perimeter, though the results are rather poor, and much better results can be obtained using alternative techniques.
This is why high quality published stereos of the Moon are done using libration, the slight "wobbling" of the Moon on its axis relative to the Earth. Similar techniques were used late in the 19th century to take stereo views of Mars and other astronomical subjects.
Limitations of hyperstereo
Vertical alignment can become a big problem, especially if the terrain on which the two camera positions are placed is uneven.
Movement of objects in the scene can make syncing two widely separated cameras a nightmare. When a single camera is moved between two positions even subtle movements such as plants blowing in the wind and the movement of clouds can become a problem. The wider the baseline, the more of a problem this becomes.
Pictures taken in this fashion take on the appearance of a miniature model, taken from a short distance, and those not familiar with such pictures often cannot be convinced that it is the real object. This is because we cannot see depth when looking at such scenes in real life and our brains aren't equipped to deal with the artificial depth created by such techniques, and so our minds tell us it must be a smaller object viewed from a short distance, which would have depth. Though most eventually realize it is, indeed, an image of a large object from far away, many find the effect bothersome. This doesn't rule out using such techniques, but it is one of the factors that need to be considered when deciding whether or not such a technique should be used.
In movies and other forms of "3D" entertainment, hyperstereo may be used to simulate the viewpoint of a giant, with eyes a hundred feet apart. The miniaturization would be just what the photographer (or designer in the case of drawings/computer generated images) had in mind. On the other hand, in the case of a massive ship flying through space the impression that it is a miniature model is probably not what the film makers intended!
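The size of the miniaturization effect can be estimated with simple arithmetic. The sketch below assumes ortho viewing and assumes that the scene appears shrunk by the ratio of the viewer's eye separation to the camera baseline; the function name `apparent_scale` and the 0.065 m default (from the 62–65mm adult norm mentioned earlier) are illustrative choices.

```python
def apparent_scale(baseline_m, eye_separation_m=0.065):
    """Approximate scale at which a hyperstereo scene appears under ortho viewing.

    Assumption: apparent size shrinks by eye separation / camera baseline.
    """
    if baseline_m <= 0 or eye_separation_m <= 0:
        raise ValueError("distances must be positive")
    return eye_separation_m / baseline_m


FEET_TO_M = 0.3048
scale = apparent_scale(100 * FEET_TO_M)  # a giant with eyes 100 feet apart
print(f"scene appears at roughly 1:{1 / scale:.0f} scale")
```

With a baseline equal to the eye separation the function returns 1.0, which corresponds to the "normal" ortho case.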
Hyper stereo can also lead to cardboarding, an effect that creates stereos in which different objects seem well separated in depth, but the objects themselves seem flat. This is because parallax is quantized.
Illustration of the limits of parallax multiplication (refer to the image at lower right; an ortho viewing method is assumed). The line represents the Z axis, so imagine that it is lying flat and stretching into the distance. If the camera is at X, point A is on an object at 30 feet, point B is on an object at 200 feet, and point C is on the same object but 1 inch behind B. Point D is on an object 250 feet away. With a normal baseline, point A is clearly in the foreground, with B, C, and D all at stereo infinity. With a one-foot baseline, which multiplies the parallax, there will be enough parallax to separate all four points, though the depth in the object containing B and C will still be subtle. If this object is the main subject, we may consider a baseline of 6 feet 8 inches, but then the object at A would need to be cropped out.
Now imagine that the camera is at point Y, so that the object at A is at 2,000 feet, point B is on an object at 2,170 feet, and C is a point on the same object 1 inch behind B. Point D is on an object at 2,220 feet. With a normal baseline, all four points are now at stereo infinity. With a 67-foot baseline, the multiplied parallax allows us to see that all three objects are on different planes, yet points B and C, on the same object, appear to be on the same plane and all three objects appear flat. This is because there are discrete units of parallax, so at 2,170 feet the parallax between B and C is zero, and zero multiplied by any number is still zero.
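To get a feel for the magnitudes involved, the angular parallax of each point can be computed directly. This is a rough sketch under simple pinhole geometry, not a model of how parallax is recorded or perceived; the function names are illustrative. It only shows how much smaller the parallax difference between B and C is than that between B and D for the camera at Y.

```python
import math

ARCSEC_PER_RAD = 180 * 3600 / math.pi


def angular_parallax(baseline_ft, distance_ft):
    """Angle (radians) subtended by the baseline at a point straight ahead."""
    return 2 * math.atan(baseline_ft / (2 * distance_ft))


def parallax_difference_arcsec(baseline_ft, near_ft, far_ft):
    """Difference in angular parallax between two depths, in arcseconds."""
    diff = angular_parallax(baseline_ft, near_ft) - angular_parallax(baseline_ft, far_ft)
    return diff * ARCSEC_PER_RAD


INCH = 1 / 12  # one inch expressed in feet

# Camera at Y with a 67 ft baseline:
# B at 2,170 ft, C one inch behind B, D at 2,220 ft.
b_to_c = parallax_difference_arcsec(67, 2170, 2170 + INCH)
b_to_d = parallax_difference_arcsec(67, 2170, 2220)
print(f"B-C: {b_to_c:.2f} arcsec, B-D: {b_to_d:.0f} arcsec")
```

The B-C difference comes out hundreds of times smaller than the B-D difference, which is consistent with B and C appearing to lie on the same plane while the objects themselves separate.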
A practical example
In the red-cyan anaglyph example below, a ten-meter baseline atop the roof ridge of a house was used to image the mountain. The two foothill ridges are about four miles (6.5 km) distant and are separated in depth from each other and the background. The baseline is still too short to resolve the depth of the two more distant major peaks from each other. Owing to various trees that appeared in only one of the images the final image had to be severely cropped at each side and the bottom.
In the wider image (below), taken from a different location, a single camera was walked about one hundred feet (30 m) between pictures. The images were converted to monochrome before combination.
Shorter baseline for ultra closeups "Macro stereo"
When objects are taken from closer than about 6 1/2 feet a normal base will produce excessive parallax and thus exaggerated depth when using ortho viewing methods. At some point the parallax becomes so great that the image is difficult or even impossible to view. For such situations, it becomes necessary to reduce the baseline in keeping with the 1:30 rule.
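The "about 6 1/2 feet" figure follows from applying the 1:30 rule to a normal base. A small sketch, assuming a 65 mm normal base and using illustrative function names:

```python
def min_distance_for_base(baseline_mm, ratio=30):
    """Closest subject distance (mm) that a given baseline suits under the 1:ratio rule."""
    return baseline_mm * ratio


def reduced_base(nearest_mm, ratio=30):
    """Baseline (mm) suggested by the 1:ratio rule for a close subject."""
    return nearest_mm / ratio


MM_PER_FOOT = 304.8
print(min_distance_for_base(65) / MM_PER_FOOT)  # about 6.4 feet for a 65 mm base
print(reduced_base(500))                        # subject at 0.5 m: roughly 16.7 mm
```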
When still life scenes are stereographed, an ordinary single lens camera can be moved using a slide bar or similar method to generate a stereo pair. Multiple views can be taken and the best pair selected for the desired viewing method.
For moving objects, a more sophisticated approach is used. In the early 1970s, Realist Incorporated introduced the Macro Realist, designed to stereograph subjects 4 to 5 1/2 inches away, for viewing in Realist format viewers and projectors. It featured a 15mm base and fixed focus. It was invented by Clarence G. Henning.
In recent years cameras have been produced which are designed to stereograph subjects 10" to 20" using print film, with a 27mm baseline. Another technique, usable with fixed base cameras such as the Fujifilm FinePix Real 3D W1/W3 is to back off from the subject and use the zoom function to zoom to a closer view, such as was done in the image of a cake. This has the effect of reducing the effective baseline. Similar techniques could be used with paired digital cameras.
Another way to take images of very small objects, "extreme macro", is to use an ordinary flatbed scanner. This is a variation on the shift technique in which the object is turned upside down and placed on the scanner, scanned, moved over and scanned again. This produces stereos of a range of objects, from as large as about 6" across down to as small as a carrot seed. This technique goes back to at least 1995. See the article Scanography for more details.
In stereo drawings and computer generated stereo images a smaller than normal baseline may be built into the constructed images to simulate a "bug's eye" view of the scene.
Baseline tailored to viewing method
The distance from which the picture is viewed calls for a particular separation between the cameras. This separation is called the stereo base or stereo base line and follows from the ratio of the viewing distance to the distance between the eyes (usually about 2.5 inches). In any case, the farther away the screen is viewed from, the more the image will pop out; the closer it is viewed from, the flatter it will appear. Personal anatomical differences can be compensated for by moving closer to or farther from the screen.
To provide close emulation of natural vision for images viewed on a computer monitor, a fixed stereo base of 6 cm might be appropriate. This will vary depending on the size of the monitor and the viewing distance. For hyper stereo, a ratio smaller than 1:30 could be used. For example if a stereo image is to be viewed on a computer monitor from a distance of 1000 mm there will be an eye to view ratio of 1000/63 or about 16. To set the cameras the appropriate distance apart for the desired effect, the distance to the subject (say a person at a distance from the cameras of 3 meters) is divided by 16 which yields a stereo base of 188 mm between the cameras.
However, images optimized for a small screen viewed from a short distance will show excessive parallax when viewed with more ortho methods, such as a projected image or a head mounted display, possibly causing eyestrain and headaches, or doubling, so pictures optimized for this viewing method may not be usable with other methods.
Where images may also be used for anaglyph display, a narrower base, say 40 mm, will allow for less ghosting in the display.
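The computer monitor example in this section can be written out as a short calculation. This is a sketch of that arithmetic only, with illustrative function names and the 63 mm eye separation used in the example as the default.

```python
def eye_to_view_ratio(viewing_distance_mm, eye_separation_mm=63):
    """Ratio of viewing distance to eye separation."""
    return viewing_distance_mm / eye_separation_mm


def stereo_base_for_viewing(subject_distance_mm, viewing_distance_mm, eye_separation_mm=63):
    """Camera separation (mm): subject distance divided by the eye-to-view ratio."""
    ratio = eye_to_view_ratio(viewing_distance_mm, eye_separation_mm)
    return subject_distance_mm / ratio


ratio = eye_to_view_ratio(1000)             # monitor viewed from 1000 mm
base = stereo_base_for_viewing(3000, 1000)  # person 3 meters from the cameras
print(round(ratio, 1), round(base))         # 15.9 189
```

The result is 189 mm rather than 188 mm because the ratio is not rounded to 16 before dividing.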
Variable base for "geometric stereo"
As mentioned previously, the goal of the photographer may be a reason for using a baseline that is larger than normal. Such is the case when, instead of trying to achieve a close emulation to natural vision, a stereographer may be trying to achieve geometric perfection. This approach means that objects are shown with the shape they actually have, rather than the way they are seen by humans.
Objects at 25 to 30 feet, instead of having the subtle depth that one being there would see, or what would be recorded with a normal baseline, will have the much more dramatic depth that would be seen from 7 to 10 feet. So instead of seeing objects as one would with eyes 2 1/2" apart, they would be seen as they would appear if one's eyes were 12" apart. In other words, the baseline is chosen to produce the same depth effect, regardless of the distance from the subject. As with true ortho, this effect is impossible to achieve in a literal sense, since different objects in the scene will be at different distances and will thus show different amounts of parallax, but the geometric stereographer, like the ortho stereographer, attempts to come as close as possible.
Achieving this could be as simple as using the 1:30 rule to find a custom base for every shot, regardless of distance, or it could involve using a more complicated formula.
This could be thought of as a form of hyperstereo, but less extreme. As a result, it has all of the same limitations of hyperstereo. When objects are given enhanced depth, but not magnified to take up a larger portion of the view, there is a certain miniaturization effect. Of course, this may be exactly what the stereographer has in mind.
While geometric stereo neither attempts nor achieves a close emulation of natural vision, there are valid reasons for this approach. It does, however, represent a very specialized branch of stereography.
Precise stereoscopic baseline calculation methods
Recent research has led to precise methods for calculating the stereoscopic camera baseline. These techniques consider the geometry of the display/viewer and scene/camera spaces independently and can be used to reliably calculate a mapping of the scene depth being captured to a comfortable display depth budget. This frees up the photographer to place their camera wherever they wish to achieve the desired composition and then use the baseline calculator to work out the camera inter-axial separation required to produce the desired effect.
This approach means there is no guesswork in the stereoscopic setup once a small set of parameters has been measured. It can be applied to both photography and computer graphics, and the methods can be easily implemented in a software tool.
Multi-rig stereoscopic cameras
The precise methods for camera control have also allowed the development of multi-rig stereoscopic cameras, where different slices of scene depth are captured using different inter-axial settings; the images of the slices are then composited together to form the final stereoscopic image pair. This allows important regions of a scene to be given better stereoscopic representation while less important regions are assigned less of the depth budget. It provides stereographers with a way to manage composition within the limited depth budget of each individual display technology.
Stereo Window
For any branch of stereoscopy the concept of the stereo window is important. If a scene is viewed through a window, the entire scene would normally be behind the window: if the scene is distant, it would be some distance behind the window; if it is nearby, it would appear to be just beyond the window. An object smaller than the window itself could even go through the window and appear partially or completely in front of it. The same applies to a part of a larger object that is smaller than the window.
The goal of setting the stereo window is to duplicate this effect.
To truly understand the concept of window adjustment it is necessary to understand where the stereo window itself is. In the case of projected stereo, including "3D" movies, the window would be the surface of the screen. With printed material the window is at the surface of the paper. When stereo images are seen by looking into a viewer the window is at the position of the frame. In the case of Virtual Reality the window seems to disappear as the scene becomes truly immersive.
The entire scene can be moved backwards or forwards in depth, relative to the stereo window, by horizontally sliding the left and right eye views relative to each other. Moving either or both images away from the center will bring the whole scene away from the viewer, whereas moving either or both images toward the center will move the whole scene toward the viewer. Any objects in the scene that have no horizontal offset, will appear at the same depth as the stereo window.
There are several considerations in deciding where to place the scene relative to the window.
First, in the case of an actual physical window, the left eye will see less of the left side of the scene and the right eye will see less of the right side of the scene, because the view is partly blocked by the window frame. This principle is known as "less to the left on the left" or 3L, and is often used as a guide when adjusting the stereo window where all objects are to appear behind the window. When the images are moved further apart, the outer edges are cropped by the same amount, thus duplicating the effect of a window frame.
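The sliding-and-cropping adjustment can be sketched in a few lines. This is a minimal illustration, assuming parallel (not cross-eyed) viewing and images stored as lists of pixel rows; the function name `shift_stereo_window` and the sign convention for `shift` are assumptions made for the example.

```python
def shift_stereo_window(left, right, shift):
    """Crop a stereo pair to move the scene relative to the stereo window.

    left, right: images as lists of rows (each row a list of pixels), same size.
    shift > 0 pushes the scene back: `shift` columns are cropped from the left
    edge of the left image and the right edge of the right image ("less to the
    left on the left"). shift < 0 pulls the scene forward by cropping the
    opposite edges.
    """
    if len(left) != len(right) or any(len(a) != len(b) for a, b in zip(left, right)):
        raise ValueError("left and right images must have the same size")
    width = len(left[0]) if left else 0
    k = abs(shift)
    if width and k >= width:
        raise ValueError("shift must be smaller than the image width")
    if shift >= 0:
        new_left = [row[k:] for row in left]
        new_right = [row[:width - k] for row in right]
    else:
        new_left = [row[:width - k] for row in left]
        new_right = [row[k:] for row in right]
    return new_left, new_right


# A feature (9) sits at column 3 in the left view and column 2 in the right
# view, so it starts out in front of the window.
left = [[0, 0, 0, 9, 0]]
right = [[0, 0, 9, 0, 0]]
new_left, new_right = shift_stereo_window(left, right, 1)
print(new_left, new_right)  # the feature now has no horizontal offset
```

After the one-column shift the feature occupies the same column in both views, so it sits at the depth of the window, and both images remain the same width.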
Another consideration involves deciding where individual objects are placed relative to the window. It would be normal for the frame of an actual window to partly overlap or "cut off" an object that is behind the window. Thus an object behind the stereo window might be partly cut off by the frame or side of the stereo window. So the stereo window is often adjusted to place objects that are cut off by the window behind it. If an object, or part of an object, is not cut off by the window then it could be placed in front of it, and the stereo window may be adjusted with this in mind. This effect is how swords, bugs, flashlights, etc. often seem to "come off the screen" in 3D movies.
If an object which is cut off by the window is placed in front of it, the resulting effect is somewhat unnatural and is usually considered undesirable; this is often called a "window violation". This can best be understood by returning to the analogy of an actual physical window. An object in front of the window would not be cut off by the window frame but would, rather, continue to the right and/or left of it. This can't be duplicated in stereography techniques other than Virtual Reality, so the stereo window will normally be adjusted to avoid window violations. There are, however, circumstances where they could be considered permissible.
A third consideration is viewing comfort. If the window is adjusted too far back, the right and left images of distant parts of the scene may be more than 2.5" apart, requiring that the viewer's eyes diverge in order to fuse them. This results in image doubling and/or viewer discomfort. In such cases a compromise is necessary between viewing comfort and the avoidance of window violations.
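For an image shown directly on a screen, the 2.5" limit can be checked by converting pixel parallax into physical separation. The sketch below assumes the image fills the display width; the function names and example figures are hypothetical.

```python
def separation_inches(parallax_px, image_width_px, display_width_in):
    """Physical on-screen distance between a point's left and right images."""
    return parallax_px * display_width_in / image_width_px


def requires_divergence(parallax_px, image_width_px, display_width_in,
                        eye_separation_in=2.5):
    """True if the separation exceeds the eye separation, forcing the eyes to diverge."""
    return separation_inches(parallax_px, image_width_px, display_width_in) > eye_separation_in


# The same 1920 px wide pair, with 40 px of parallax in the far background:
print(requires_divergence(40, 1920, 20))   # 20 inch wide monitor: False
print(requires_divergence(40, 1920, 240))  # 20 foot wide screen: True
```

The same pair that is comfortable on a small monitor can exceed the limit on a large screen, which is why the window setting depends on the intended display.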
In stereo photography, window adjustment is accomplished by shifting/cropping the images. In other forms of stereoscopy, such as drawings and computer generated images, the window is built into the design of the images as they are generated. It is by design that in CGI movies certain images are behind the screen whereas others are in front of it.