Infra-Red/UV Video Image
Segmentation Technique Theory
Mickael Maddison, May 2010
Currently the movie and photography industries utilize techniques often referred to as “chroma-key”, “luma-key” or “thermo-key” to film subjects for the purpose of removing the subject from the background of the image. Once the subject is removed from the background of the image, the subject can then be superimposed on alternative backgrounds. For example, filming an actor in front of a “green screen” and using the chroma-key technique to remove the actor from the green background would allow the video editor to place the actor on an image of the moon, without ever having to go to the moon.
Existing techniques for image segmentation require very careful and often expensive settings, lighting and filming techniques, in addition to powerful post-production processing, to achieve a quality result. This document proposes the use of the near-infrared spectrum and optional ambient UV to replace the background, allowing for a much simpler means of extracting the desired image from an infrared background. Using a selection of isolated IR wavelengths, together with CCD or CMOS digital camera technologies adapted to capture and record these isolated wavelengths while at the same time recording the standard RGB (Red Green Blue) or RGBY (Red Green Blue Yellow) visible light, would allow software and devices to be produced that remove subject(s) from backgrounds with a higher degree of accuracy while requiring far less effort and processing.
* Adding in the detection of UV spectrum will also allow for additional processing options.
Uses: Cameras equipped with combined RGB/RGBY and IR/UV CMOS or CCD sensors would have a wide variety of uses.
- Standard Video capture and recording.
- Image Segmentation.
- Capturing and recording Near-infrared and UV used for special effects/artistic purposes.
- Capturing a wide range of light spectrum useful for night-vision image capture.
- Reconnaissance and security systems.
- Scientific research requiring combined access to visible and non-visible spectrum images.
- Other techniques and uses not yet considered or developed.
1 – CCD: Existing CCD and CMOS sensors already have the capability to record near-infrared wavelengths. Most cameras use a special filter to block the infrared wavelengths from being captured. Cameras that do not have this filter in place store the infrared information in the RGB image. Currently, CCD and CMOS type sensors capture RGB light by using a special “Bayer Color Filter” as seen here:
Each square represents one “pixel” of information captured by the sensor. Processing techniques may vary, but in effect a square of 4 pixels is combined to create a single pixel of “true” color.
The following is an example of how a new filter could be designed to allow for the capture of the additional non-visible spectrum using the existing sensors:
In this sample image, instead of using a pattern of 4 pixels, the pattern is spread over 9 pixels. The 4 existing RGB pixels are captured in addition to 5 additional pixels as represented by the white and various shades of grey boxes. The optimal configuration is subject to analysis, but for example it could be laid out like this:
Red = Red, Blue = Blue, Green1/2 = Green1/2
White = wide-spectrum UV
lightest grey = 840nm IR, second-lightest grey = 900nm IR
third-lightest grey = 950nm, darkest grey = 1000nm OR wide-spectrum IR.
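The 9-pixel pattern described above can be sketched in code. This is only a minimal illustration: the arrangement of the nine filters inside the 3x3 tile (`TILE`) and the function name `split_channels` are assumptions made for the example, since the optimal configuration is still subject to analysis. The sketch takes a raw sensor frame and separates it into one small image per filter.

```python
# Hypothetical 3x3 filter tile. The position of each filter within the
# tile is an assumption for illustration, not a proposed final layout.
TILE = [
    ["R",     "G1",     "IR840"],
    ["G2",    "B",      "IR900"],
    ["IR950", "IR1000", "UV"],
]


def split_channels(raw):
    """Split a raw frame (list of rows of intensities) into one
    sub-image per filter, keyed by the filter name in TILE."""
    height, width = len(raw), len(raw[0])
    if height % 3 or width % 3:
        raise ValueError("frame dimensions must be multiples of 3")
    channels = {name: [] for tile_row in TILE for name in tile_row}
    for tile_y in range(0, height, 3):
        # One output row per filter for this row of tiles.
        out_rows = {name: [] for name in channels}
        for tile_x in range(0, width, 3):
            for dy in range(3):
                for dx in range(3):
                    out_rows[TILE[dy][dx]].append(raw[tile_y + dy][tile_x + dx])
        for name, values in out_rows.items():
            channels[name].append(values)
    return channels


# Usage: a 6x6 frame holds 2x2 tiles, so each channel is a 2x2 image.
frame = [[y * 6 + x for x in range(6)] for y in range(6)]
planes = split_channels(frame)
```

Each resulting plane has one third of the sensor's resolution in each direction, which is the trade-off of spreading the pattern over 9 pixels instead of 4.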
As a future consideration, there are also emerging technologies that could utilize the optical properties of carbon nanotubes to capture information on specific wavelengths. Research has shown that a single carbon nanotube connected to a pair of electrodes can measure IR radiation effectively. This technology may be a long way from practical use; however, it provides an ongoing opportunity to continue developing and refining the technology.
2 – File Format: In addition to the filter, a new image/video file format would be created to store the captured information in a useful format. Many cameras have built-in processors that convert the RAW pixel data from the CCD into consumer file formats such as mpeg, jpeg, tiff, etc. A new processor may be developed to provide traditional file formats plus a masking file, or for more advanced use the camera may save all the data together to allow for more advanced processing and usage of the recorded data.
The new image/video RAW format would have more information available and would need to store unique information for each displayed pixel to be effective for external image processing.
3 - Software: Image and video editing software would require modifications and/or filters to be developed that would make full use of the extended information available through the new file format and/or the processing of the masking file with the video file. This may be modifications and additions to versions of existing industry-standard software. In addition, research may deem entirely new software should be developed to make full and wider ranging use of the data.
Some image processing systems have experimented using non-visible light to increase the accuracy and quality of the visibly produced image. For the purpose of image segmentation, the various wavelengths of non-visible light would be used to create highly detailed “mask(s)” useful for cropping the background from the desired image(s).
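As a rough sketch of how such a mask might be derived, the example below converts a single IR channel into a soft mask (one opacity value per pixel). It assumes the background is evenly lit in IR so that it reads brighter than the subject; the function names and the `low`/`high` cut-off values are hypothetical and would need tuning for real lighting.

```python
def ir_alpha(ir_value, low=60, high=200):
    """Map one IR intensity (0-255) to subject opacity (0.0-1.0).

    Assumption: the IR-lit background is bright. Readings at or above
    `high` count as background (0.0), readings at or below `low` count
    as subject (1.0), and values in between blend linearly.
    """
    if ir_value <= low:
        return 1.0
    if ir_value >= high:
        return 0.0
    return (high - ir_value) / (high - low)


def ir_matte(ir_image, low=60, high=200):
    """Build a soft mask from a 2D list of IR intensities."""
    return [[ir_alpha(value, low, high) for value in row] for row in ir_image]


# Usage: dark subject pixel, edge pixel, bright background pixel.
matte = ir_matte([[0, 130, 255]])
```

The partial values in between are what would let fine detail such as hair blend into a new background instead of being cut with a hard edge.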
4 - Non-Visible Illumination: To make the most of the technology, a wide range of electronic devices would be developed as the technology is adopted by the relevant industries. Some examples of devices that would be developed and produced:
- IR floodlights – wavelength specific floodlights to provide a suitable non-visible background.
- IR spotlights – wavelength specific spotlights that could be used to segment multiple objects within a single image by using multiple wavelengths.
- IR backdrops – currently most greenscreen type applications use light shining evenly onto a controlled, smooth surface. An IR backdrop could be a “screen” that actually emits the light from its surface.
- IR/RGB backdrops – Using modifications of technologies that are hitting the market today, large LCD screens using LED backlight technology could be redeveloped to emit a combination of visible and non-visible light, allowing for a fully visible background while at the same time providing the non-visible background needed for image segmentation.
- IR absorbing and reflecting materials – these could be used to achieve a variety of effects. For example, the current industry uses chroma-suits to allow the segmentation of parts of a subject such as a body-less person.
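The idea of segmenting multiple objects with wavelength-specific spotlights could look something like the sketch below. It assumes each object is lit by a spotlight of a different IR wavelength and that the camera delivers one intensity image per wavelength; the function name, channel names and `threshold` value are hypothetical.

```python
def label_by_wavelength(channels, threshold=128):
    """Label each pixel with the IR channel that responds most strongly.

    channels: dict mapping a wavelength name to a 2D list of
    intensities, all the same size. A pixel whose strongest response
    is below `threshold` is labelled None (no spotlight hit it).
    """
    names = list(channels)
    first = channels[names[0]]
    labels = []
    for y in range(len(first)):
        row = []
        for x in range(len(first[0])):
            best = max(names, key=lambda name: channels[name][y][x])
            row.append(best if channels[best][y][x] >= threshold else None)
        labels.append(row)
    return labels


# Usage: two spotlights, three pixels in one row.
labels = label_by_wavelength({
    "IR840": [[200, 10, 50]],
    "IR950": [[20, 220, 60]],
})
```

Each distinct label can then be turned into its own mask, so that several subjects are cut out of the same frame separately.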
Benefits of the Technology
- Subjects could be photographed or filmed against a variety of backgrounds and still be easily removed from the scene.
- Subjects could be filmed against a background that has similar colors and textures to the scene they would be placed into. For example, an actor could be filmed against a projection of a mountain scene similar to the one that will be added during post production. This would allow for complex images such as hair to more easily be segmented with minimal artifacts.
- A single background could be used for full color filming, even if the background color matches the color of subjects being segmented from the scene.
- Shadows appearing on the background may have little or no effect on the IR mask, allowing lighting of the subject to be tailored to the final scene rather than to achieving the best separation of color from the green screen.
- Artistic photography, such as family portraits, would be able to eliminate the need to have a wide variety of backgrounds available to photograph subjects against. Instead, the photographer would use a generic or projected image as the background. After (or during) the photography session, the photographer could select the actual background image from an unlimited selection of background images. This allows the photographer to use a single quality photograph for any number of scenes.
Patent and related technology research:
Live Action Compositing example http://www.scribd.com/doc/654481/Live-Action-Compositing
Using human-IR heat for image segmentation http://nae-lab.org/project/thermo-key/
This is not for IR - it is a device that uses a mapped background
Practical Example of Theory
Sony DCR DVD203 Digital Camcorder used in photograph mode.
IR narrow-beam floodlight (wide beam would be much more effective)
Blue floor mat for background
Doll with hair
A Dark room
Computer system with Adobe Photoshop Elements 8.0
Step 1 - photograph still image of subject in Nightshot plus mode which uses NIR to enhance image visibility.
Step 2 - photograph still image of subject in normal RGB mode (sorry for poor quality photo).
Step 3 - Open these two images into a single layered file in Photoshop (Elements).
Step 4 - On the Nightshot (Mask) layer, convert to B/W and increase contrast. In this case, with a blue background used, I adjusted the blue level to get the best differential I could considering the poor lighting. You can see in this step that the segmentation opportunity is pretty good but not nearly perfect. The beam from the IR spotlight is a little too intense and focused. Developing proper Infra-Red lighting would not be difficult, at least in this scene.
Step 5 - Due to inadequate lighting I will cut excess dark regions (which would normally appear as a fairly even white background with proper IR lighting) and delete to pure white to match the area around the subject. As part of this step I have increased the contrast to show clearly the mask that was created. Proper lighting and some minor filtering would remove the manual portions of this step making it easy to automate.
Step 6 - Using magic wand (which would be an automated part of the process) I select all the white area and invert the selection to have the mask selected.
Step 7 - switch to the layer with the RGB image and copy the subject and paste into a new layer.
Step 8 - Turn off the visibility on all layers except the cutout of the subject. You now have a fairly good cutout of the subject to work with.
Step 9 - Insert background and adjust cutout layer as desired.
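The manual steps above could be automated along these lines. This is only a sketch under stated assumptions: the Nightshot image and the RGB image are already aligned and the same size, the IR-lit background appears near white in the Nightshot image, and images are represented as plain 2D lists of (r, g, b) tuples. The function names and the `threshold` value are hypothetical, and the simple averaging in `to_gray` stands in for the B/W conversion and contrast adjustment done by hand in Photoshop.

```python
def to_gray(pixel):
    """Convert an (r, g, b) pixel to a single brightness value."""
    r, g, b = pixel
    return (r + g + b) // 3


def composite(mask_shot, rgb_shot, background, threshold=128):
    """Cut the subject out of rgb_shot and place it on background.

    mask_shot, rgb_shot, background: 2D lists of (r, g, b) tuples,
    all the same size.
    """
    result = []
    for mask_row, rgb_row, bg_row in zip(mask_shot, rgb_shot, background):
        out_row = []
        for mask_px, rgb_px, bg_px in zip(mask_row, rgb_row, bg_row):
            # Steps 4-6: bright (white) mask pixels are background,
            # everything else belongs to the subject.
            is_background = to_gray(mask_px) >= threshold
            # Steps 7-9: keep the subject pixel, otherwise show the
            # new background.
            out_row.append(bg_px if is_background else rgb_px)
        result.append(out_row)
    return result


# Usage: one white (background) and one dark (subject) mask pixel.
out = composite(
    [[(250, 250, 250), (10, 10, 10)]],
    [[(1, 2, 3), (4, 5, 6)]],
    [[(9, 9, 9), (8, 8, 8)]],
)
```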
Nightshot Plus mode captures IR data and incorporates it into an RGB image. Because of this, the Mask layer is not nearly as good in quality as it would be if an actual IR layer were saved in addition to the RGB layer. This limitation also requires the Mask layer of the subject to be shot in (visible) darkness to generate a strong contrast. The RGB layer is then shot in RGB mode with visible lights enabled. If I had 2 Nightshot cameras of the same type, split the image between the 2 cameras, and had a filter on the RGB mode camera to remove all IR data and a filter on the Nightshot mode camera to remove all RGB data, I believe I would get a much more accurate mask.
For example, in unedited images you can see that the hair sticking up off the head is visible. With even IR lighting and no conflicting RGB data being stored on the mask layer, this should result in an even sharper, more accurate mask layer. The sharper, more accurate mask layer could then be used to cut out these fine details from the RGB subject layer, providing a nice crisp image to work with.
Also of note: based on previous experience doing still-frame image compositing, the process outlined in this document was very quick and simple to do. With proper camera(s), proper lighting, suitable software, and a good studio environment, the results should be far better than this simple test.
Some strange things are happening in the telecommunications industry, but fortunately the end result should be good for us all. Reading the CBC's "Background" on Voice over IP (VoIP), I found a quote very similar to my previous statements about the convergence of Voice and the Internet. "Bell Canada, the industry's largest player, intends to channel all of its phone traffic through the internet within 2 years."
This statement is huge for the telecom industry; the transition from the Internet running over voice lines to voice running over Internet (data) lines is gaining momentum. With the advancements in wireless Internet technologies such as WiMax, eventually your Internet, telephone, mobile phone, and audio/visual broadcasting will all share one common transport system.
The strange part of the transition is our CRTC, which currently sets limitations on the big monopolies like Telus. The CRTC currently forces these big companies to share their infrastructure with smaller companies, and works to ensure we all have fair and reasonably priced access to the telecommunications network. The CRTC has decided that VoIP is still subject to the same rules and limitations, which means companies like Telus will have a harder time adopting and growing a VoIP network than companies that are just getting involved.
It's funny to see a case where a government regulatory commission such as the CRTC has the effect of stimulating technology rather than hindering the development of it. The smaller companies just getting their feet wet in telecom through VoIP are willing to invest a lot more time and money to gain a market share. The big companies like Telus seem to have little interest in adopting the new technologies, other than to retain market share and prevent these smaller companies from pulling the rug out from under them. It's going to be an interesting battle...
For at least the second time in less than a year, I've ended up with a high tech device that I hadn't realized I wanted. The first was the ever famous iPod. Honestly, had I not been so fortunate as to win a 40GB iPod by attending an online seminar with 3ware (thanks again!), I doubt that I would have purchased any type of mp3 player to this day. Last week, a simple series of events took place that led me to the purchase of a digital camcorder.
I've always thought of video cameras as fun toys, but more annoying than useful for general 'family' use. That hasn't changed; well, at least not yet! What led me to Future Shop to purchase a Sony Handycam wasn't a family function. The story, if you'd be so inclined to follow along, doesn't involve spotting alien ships flying over the city. I didn't capture breaking news or a candid display from a Hollywood star that got off track. All that happened is that I realized my Sony digital camera was better suited to its core function of taking photographs than it was to shooting video clips.
If you've read my blog here before or caught me chatting on The X with Arjun Singh, you're probably aware that I'm involved in a process of recording music that I've written over the years. As part of my recording process, I 'stumbled' onto the desire to provide videos for anyone who might be so kind as to check out my music site and want to listen to my tracks.
The first video was put together using my digital camera's short video clip feature, and it came out fine for what I put into it. When it came time to put together a video for "Drink", I was inclined to do a little more with it. Part way through shooting the footage I'd need to create a music video, I realized that the quality of the product just wasn't going to satisfy my desire for a good quality picture, suitable for playback through a DVD Player or VCR. Fortunately my ever supportive wife came to the rescue and encouraged me to get what I needed. And that I did.
I picked up a Sony Handycam DVD203 and a couple of mini DVDRW's and was back filming that night. After not reading the directions and finding that the camera is essentially easy to use, I must say I'm incredibly impressed and happy that this weapon has become part of my technology arsenal. My brother and I had a blast filming scenes all over his town, and without much effort, much of what we shot has made it to my video.
Now I find myself thinking a little bigger for the next video. The filming budget is still, well, nonexistent really. The music comes first - but I'm encouraged that I've got a tool that can provide me with a flexibility that I had not expected.
If I haven't mentioned it before, Technology is Fun!
Today Arjun Singh once again invited me to his Radio show on The X. Today was different though. I was almost nervous going on the air today. It felt a little more like the first time I was on the radio... well, for all of a couple minutes anyway. In all honesty, I enjoy talking on the airwaves.
What made the world a different place for me today was that Arjun played 2 of the songs off my soon to be released album during his show. "Kamloops Internet Radio" isn't really about the music; we talk primarily about the Internet, geek toys, and computer related issues. However, the show started off today with the first song I've completed, "When I'm Free". Part way through the show, "Drink" was played to provide the world with a brief respite from our rantings. Today marks the first time any of my music has been played on the airwaves, an exciting milestone.
There were no fans screaming outside the studio waiting for autographs. No limo picked me up after the show. No callers phoned the station and begged for copies of the songs. There wasn't any fanfare about it, but that's not the point anyway. Today I achieved something I've wanted to do for a long time. I can now honestly say I've recorded songs that have been played on the radio.
Thanks Arjun for your support, and for continually inviting me to guest on your show. Thanks TRU, formerly known as UCC, for having a small but growing radio station (they welcome donations to help them get the upgrade they need for full community broadcasting), and thanks to all the friends and family who continue to support my many projects.
P.S. see www.mickaelmaddison.com for Music information
Well, today I learned of Skype In. I've mentioned Skype before; a great alternative Voice over IP phone system and online messenger. So now, much like my Primus or many people's Vonage Voice over IP phones, Skype has become a solution.
Skype is different than the two mentioned above in that it's currently strictly computer based. With Primus or Vonage, you don't even need a computer to use their systems. With Skype, you need a PC or PDA with the Skype software installed. While for some this may be a limitation, for myself, I'm starting to find that this may actually be a better solution.
Currently, I run my Voice over IP lines throughout the house as Line 2. It works very well, but I find the quality lacking, and although the costs are good, they're not amazing. What attracts me to Skype is that its audio quality seems a lot stronger, it's encrypted, and Skype-to-Skype calls are always free. Also, I believe long distance calls to most areas are €0.017 (1.7 euro cents), which is approximately $0.03 Canadian. So calling anywhere I need to call is quite inexpensive and is essentially painless.
Now that phone numbers are available for anyone on a regular phone line to be able to call me on my Skype connection, it's starting to look like a fairly good solution. I've signed up to try it, so once I know if the incoming call setup works as well as one expects, I'll let you know. My only real reservation here is the technical support and features. Skype hasn't earned a reputation for being the fastest to resolve issues, but then, at about 1/5th the price of my current VOIP, perhaps the odd time that there are bugs in the system I can route through to my Cell phone.
From a geek standpoint, once the Skype IN and Skype Out systems are both fully PDA compatible, my PDA becomes one more step closer to handling everything I need. Currently, whenever I'm away from the office, if there's an openly available wireless network, I can connect from my PDA and use Skype or MSN to chat with my business associates, customers, family and friends. It's not a novelty, it's a very effective tool, at least from my chair.