Infra-Red/UV Video Image
Segmentation Technique Theory
Mickael Maddison, May 2010
Currently the movie and photography industries utilize techniques often referred to as “chroma-key”, “luma-key” or “thermo-key” to film subjects for the purpose of removing the subject from the background of the image. Once the subject is removed from the background of the image, the subject can then be superimposed on alternative backgrounds. For example, filming an actor in front of a “green screen” and using the chroma-key technique to remove the actor from the green background would allow the video editor to place the actor on an image of the moon without ever having to go to the moon.
Existing techniques for image segmentation require very careful and often expensive settings, lighting and filming techniques, in addition to powerful post-production processing, to achieve a quality result. This document proposes using the near-infrared spectrum, and optionally ambient UV, in place of a visible-color backdrop, allowing for a much simpler means of extracting the desired image from an infrared background. CCD or CMOS digital camera sensors would be adapted to capture and record a selection of isolated IR wavelengths while at the same time recording the standard RGB (Red Green Blue) or RGBY (Red Green Blue Yellow) visible light. Software and devices could then be produced that remove subject(s) from backgrounds with a higher degree of accuracy while requiring far less effort and processing.
* Adding detection of the UV spectrum would also allow for additional processing options.
Uses: Cameras equipped with combined RGB/RGBY and IR/UV CMOS or CCD sensors would have a wide variety of uses.
- Standard Video capture and recording.
- Image Segmentation.
- Capturing and recording Near-infrared and UV used for special effects/artistic purposes.
- Capturing a wide range of light spectrum useful for night-vision image capture.
- Reconnaissance and security systems.
- Scientific research requiring combined access to visible and non-visible spectrum images.
- Other techniques and uses not yet considered or developed.
1 – CCD: Existing CCD and CMOS sensors already have the capability to record near-infrared wavelengths. Most cameras use a special filter to block the infrared wavelengths from being captured. Cameras that do not have this filter in place store the infrared information in the RGB image. Currently, CCD and CMOS type sensors capture RGB light by using a special “Bayer Color Filter” as seen here:
Each square represents one “pixel” of information captured by the sensor. Processing techniques may vary, but in effect a square of 4 pixels is combined to create a single pixel of “true” color.
The following is an example of how a new filter could be designed to allow for the capture of the additional non-visible spectrum using the existing sensors:
In this sample image, instead of using a pattern of 4 pixels, the pattern is spread over 9 pixels. The 4 existing RGB pixels are captured in addition to 5 additional pixels as represented by the white and various shades of grey boxes. The optimal configuration is subject to analysis, but for example it could be laid out like this:
Red = Red, Blue = Blue, Green1/2 = Green1/2
White = wide-spectrum UV
lightest grey = 840nm IR, second-lightest grey = 900nm IR
third-lightest grey = 950nm, darkest grey = 1000nm OR wide-spectrum IR.
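To make the 9-pixel idea concrete, here is a minimal Python sketch of how raw sensor values might be sorted into one plane per filter. The position of each filter inside the 3x3 tile, the channel names and the function `split_mosaic` are all assumptions made for illustration; the optimal layout is still subject to analysis.

```python
# Hypothetical 3x3 filter tile. The channels come from the list above,
# but the position of each filter within the tile is an assumption.
PATTERN = (
    ("R", "G1", "UV"),
    ("G2", "B", "IR840"),
    ("IR900", "IR950", "IR1000"),
)


def split_mosaic(raw):
    """Sort raw sensor values into one plane per filter channel.

    raw is a list of rows whose height and width are multiples of 3.
    Returns {channel_name: 2-D list} with one value per 3x3 tile.
    """
    height, width = len(raw), len(raw[0])
    if height % 3 or width % 3:
        raise ValueError("sensor dimensions must be multiples of 3")
    planes = {name: [] for row in PATTERN for name in row}
    for tile_y in range(0, height, 3):
        tile_row = {name: [] for name in planes}
        for tile_x in range(0, width, 3):
            for dy in range(3):
                for dx in range(3):
                    name = PATTERN[dy][dx]
                    tile_row[name].append(raw[tile_y + dy][tile_x + dx])
        for name in planes:
            planes[name].append(tile_row[name])
    return planes
```

Each plane ends up at one third of the sensor resolution in each direction. A real camera would probably interpolate between tiles, much as Bayer processing does, but that step is left out here.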
As a future consideration, there are also technologies coming that could utilize the optical properties of carbon nanotubes to capture information on specific wavelengths. Research has shown that a single carbon nanotube connected to a pair of electrodes can measure IR radiation effectively. This technology may be a long way from practical use; however, it provides an ongoing opportunity to continue developing and refining the technology.
2 – File Format: In addition to the filter, a new image/video file format would be created to store the captured information in a useful format. Many cameras have built-in processors that convert the RAW pixel data from the CCD into consumer file formats such as mpeg, jpeg, tiff, etc. A new processor may be developed to provide traditional file formats + a masking file, or for more advanced use the camera may save all the data together to allow for more advanced processing and usage of the recorded data.
The new image/video RAW format would have more information available and would need to store unique information for each displayed pixel to be effective for external image processing.
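As a rough illustration of what "unique information for each displayed pixel" could look like on disk, the Python sketch below packs nine samples per pixel into a fixed-size binary record. The channel order, the 16-bit sample size and the helper names `pack_pixel` and `unpack_pixel` are all assumptions; no such format has been defined.

```python
import struct

# Hypothetical per-pixel record: nine little-endian unsigned 16-bit samples.
# Channel names and their order are assumptions, not a defined format.
CHANNELS = ("R", "G1", "G2", "B", "UV", "IR840", "IR900", "IR950", "IR1000")
PIXEL_FORMAT = "<9H"


def pack_pixel(samples):
    """Serialize a {channel: value} dict into an 18-byte record."""
    return struct.pack(PIXEL_FORMAT, *(samples[name] for name in CHANNELS))


def unpack_pixel(record):
    """Read an 18-byte record back into a {channel: value} dict."""
    return dict(zip(CHANNELS, struct.unpack(PIXEL_FORMAT, record)))
```

Keeping the non-visible samples next to the visible ones means editing software can rebuild either a normal RGB image or a mask from the same record, without needing a separate masking file.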
3 - Software: Image and video editing software would require modifications and/or filters to be developed that would make full use of the extended information available through the new file format and/or the processing of the masking file with the video file. This may be modifications and additions to versions of existing industry-standard software. In addition, research may deem entirely new software should be developed to make full and wider ranging use of the data.
Some image processing systems have experimented using non-visible light to increase the accuracy and quality of the visibly produced image. For the purpose of image segmentation, the various wavelengths of non-visible light would be used to create highly detailed “mask(s)” useful for cropping the background from the desired image(s).
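A minimal sketch of how such a mask might be computed from a single IR plane follows. It assumes the background is strongly lit in IR and the subject appears darker in that band. The function `ir_matte` and its two level parameters are hypothetical and would need tuning for a real sensor.

```python
def ir_matte(ir_plane, subject_level, background_level):
    """Turn an IR plane into per-pixel opacity between 0.0 and 1.0.

    Assumes the background is IR-bright and the subject IR-dark.
    Values at or below subject_level become 1.0 (fully subject);
    values at or above background_level become 0.0 (fully background).
    Values in between are blended linearly, which keeps soft edges
    such as hair partially transparent.
    """
    if background_level <= subject_level:
        raise ValueError("background_level must exceed subject_level")
    span = background_level - subject_level
    matte = []
    for row in ir_plane:
        matte.append(
            [min(1.0, max(0.0, (background_level - value) / span)) for value in row]
        )
    return matte
```

For example, `ir_matte([[250, 150, 50]], 50, 250)` gives `[[0.0, 0.5, 1.0]]`: the bright pixel is background, the dark pixel is subject, and the one in between is half transparent.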
4 - Non-Visible Illumination: To make the most of the technology, a wide range of electronic devices would be developed as the technology is adopted by the relevant industries. Some examples of devices that would be developed and produced:
- IR floodlights – wavelength specific floodlights to provide a suitable non-visible background.
- IR spotlights – wavelength specific spotlights that could be used to segment multiple objects within a single image by using multiple wavelengths.
- IR backdrops – currently most greenscreen type applications use light shining evenly onto a controlled, smooth surface. An IR backdrop could be a “screen” that actually emits the light from its surface.
- IR/RGB backdrops – Using modifications of technologies that are hitting the market today, large LCD screens using LED backlight technology could be redeveloped to emit a combination of visible and non-visible light, allowing for a fully visible background while at the same time providing the non-visible background needed for image segmentation.
- IR absorbing and reflecting materials – these could be used to achieve a variety of effects. For example, the current industry uses chroma-suits to allow the segmentation of parts of a subject such as a body-less person.
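The multi-wavelength spotlight idea in the list above can be sketched in Python as follows, assuming each object is lit by a spotlight of a different wavelength and that one plane per wavelength has been captured. The function `label_by_wavelength` and the `floor` parameter are hypothetical.

```python
def label_by_wavelength(planes, floor):
    """Assign each pixel to the IR wavelength with the strongest response.

    planes maps a wavelength label to a 2-D list of sensor values; all
    planes must have the same dimensions. Pixels whose strongest response
    is below floor are labelled None (lit by no spotlight).
    """
    names = list(planes)
    first = planes[names[0]]
    labels = []
    for y in range(len(first)):
        row = []
        for x in range(len(first[0])):
            best = max(names, key=lambda name: planes[name][y][x])
            row.append(best if planes[best][y][x] >= floor else None)
        labels.append(row)
    return labels
```

Each distinct label can then be treated as its own mask, so that several objects are cut out of a single frame.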
Benefits of the Technology
- Subjects could be photographed or filmed against a variety of backgrounds and still be easily removed from the scene.
- Subjects could be filmed against a background that has similar colors and textures to the scene they would be placed into. For example, an actor could be filmed against a projection of a mountain scene similar to the one that will be added during post production. This would allow for complex images such as hair to more easily be segmented with minimal artifacts.
- A single background could be used for full color filming, even if the background color matches the color of subjects being segmented from the scene.
- Shadows appearing on the background may have little or no effect on the IR mask, allowing lighting of the subject to be tailored to the final scene rather than to achieving the best separation of color from the green screen.
- Artistic photography, such as family portraits, would be able to eliminate the need to have a wide variety of backgrounds available to photograph subjects against. Instead, the photographer would use a generic or projected image as the background. After (or during) the photography session, the photographer could select the actual background image from an unlimited selection of background images. This allows the photographer to use a single quality photograph for any number of scenes.
Patent and related technology research:
Live Action Compositing example http://www.scribd.com/doc/654481/Live-Action-Compositing
Using human-IR heat for image segmentation http://nae-lab.org/project/thermo-key/
This is not for IR - it is a device that uses a mapped background
Practical Example of Theory
Sony DCR DVD203 Digital Camcorder used in photograph mode.
IR narrow-beam floodlight (wide beam would be much more effective)
Blue floor mat for background
Doll with hair
A Dark room
Computer system with Adobe Photoshop Elements 8.0
Step 1 - photograph still image of subject in Nightshot plus mode which uses NIR to enhance image visibility.
Step 2 - photograph still image of subject in normal RGB mode (sorry for poor quality photo).
Step 3 - Open these two images into a single layered file in Photoshop (Elements).
Step 4 - On the nightshot (Mask) layer, convert to B/W and increase contrast. In this case, with a blue background used, I adjust the blue level to get the best differential I can considering the poor lighting. You can see in this step that the segmentation opportunity is pretty good but not nearly perfect. The beam from the IR spotlight is a little too intense and focused. Developing proper Infra-Red lighting would not be difficult, at least in this scene.
Step 5 - Due to inadequate lighting I will cut excess dark regions (which would normally appear as a fairly even white background with proper IR lighting) and delete to pure white to match the area around the subject. As part of this step I have increased the contrast to show clearly the mask that was created. Proper lighting and some minor filtering would remove the manual portions of this step making it easy to automate.
Step 6 - Using magic wand (which would be an automated part of the process) I select all the white area and invert the selection to have the mask selected.
Step 7 - switch to the layer with the RGB image and copy the subject and paste into a new layer.
Step 8 - Turn off the visibility on all layers except the cutout of the subject. You now have a fairly good cutout of the subject to work with.
Step 9 - Insert background and adjust cutout layer as desired.
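The manual steps above could be automated roughly as in the Python sketch below. It assumes the nightshot frame and the RGB frame are perfectly aligned and that the background shows up brighter than the subject in the nightshot frame. The function names and the `threshold` value are hypothetical, and the manual cleanup from Step 5 is left out.

```python
def to_gray(pixel):
    """Average the three channels of an (r, g, b) tuple (Step 4, B/W)."""
    r, g, b = pixel
    return (r + g + b) // 3


def cut_out(nir_image, rgb_image, background, threshold):
    """Composite the subject of rgb_image over a new background.

    All three images are same-sized 2-D lists of (r, g, b) tuples.
    A pixel counts as subject when its nightshot brightness is below
    threshold, i.e. it is not part of the bright IR-lit background.
    """
    result = []
    for nir_row, rgb_row, bg_row in zip(nir_image, rgb_image, background):
        out_row = []
        for nir_px, rgb_px, bg_px in zip(nir_row, rgb_row, bg_row):
            # Steps 4-6: grayscale, hard threshold, keep the non-white area.
            is_subject = to_gray(nir_px) < threshold
            # Steps 7-9: copy the subject pixel or show the new background.
            out_row.append(rgb_px if is_subject else bg_px)
        result.append(out_row)
    return result
```

The hard threshold plays the role of the magic wand selection. With even IR lighting, a softer blend at the edges would likely handle fine detail such as hair better.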
Nightshot Plus mode captures IR data and incorporates it into an RGB image. Because of this, the Mask layer is not nearly as good quality as it would be if an actual IR layer were saved in addition to the RGB layer. This limitation also requires the Mask layer of the subject to be shot in (visible) darkness to generate a strong contrast. The RGB layer is then shot in RGB mode with visible lights enabled. If I had 2 nightshot cameras of the same type, split the image into the 2 cameras, and had a filter on the RGB mode camera to remove all IR data and a filter on the nightshot mode camera to remove all RGB data, I believe I would get a much more accurate mask.
For example, in unedited images you can see that the hair sticking up off the head is visible. With even IR lighting and no conflicting RGB data being stored on the mask layer, this should result in an even sharper, more accurate mask layer. The sharper, more accurate mask layer could then be used to cut out these fine details from the RGB subject layer, providing a nice crisp image to work with.
Also of note: based on my previous experience doing still-frame image compositing, the process outlined in this document was very quick and simple to do. With proper camera(s), proper lighting, suitable software, and a good studio environment, the results should be far better than this simple test.
I just read the most interesting story about a city manager in "Tuttle" Oklahoma that really needs a geek. According to the transcript this fellow was prepared to call in the FBI, due to his own ignorance, to investigate an attack on their webserver/network that never took place. It is unfortunate that the manager did not consult with a local geek in the first place, as it would have saved him a lot of criticism, but then I suppose if he flies off the handle then it's really his own fault.
The interesting thing about this article is how clearly it underlines the lack of understanding about the technologies that build our global computer network. Governments, businesses and individuals are all directly and indirectly using a massive array of software and hardware every day, and the actual number of people that know how it works could very well be as low as 1%.
And those of us who would have instantly recognized that this incident, where a simple default apache/CentOS webpage was displayed in place of the city website, was no attack seem to be too few to keep up with the demand for our skills.
I'm always amazed when I am referred to a business to help with their network, how often things are really in a terrible mess. I recall a time recently where I was called in to resolve some computer problems, and the client was so anxious to have it fixed that I was greeted by 2 staff members who held the doors open and escorted me to their computers. It felt as if I were a famous musician being escorted to the stage after my flight was delayed.
Anyhow, back to the story about the city manager from Tuttle: I feel for this guy. His initial reaction, although aggressive, was honest. He felt he had been attacked, thought he knew the source of the problem, and went straight for it. There's no doubt that geeks around the world are laughing as they read the article and transcript, and I'm sure his face turned to a deep blush once he realized his error.
Cheers to all the geeks, and those who must tolerate our lot.
I love performance computers. I love having a great desktop computer that I can tweak and upgrade to handle all the crazy things I'm always forcing it to do such as running dozens of windows during my workday, music and music video recording and editing, as well as the occasional online game. While I love having a PC, I long for a portable machine that can combine the best features of a laptop with a true PC.
The last few months have been a bit of a struggle with my ECS Laptop. When it's running well, it's proven to be a fast, powerful machine. The problem with the ECS is that it seems to be a little too flaky and a lot too noisy for a guy like me who relies on his machines 10+hrs per day. When the hard-drive started flaking out for the second time in 2 years, and with the video card creating a red mask over CPU/Video intensive applications, I had little choice but to replace the machine.
I did a lot of looking and contemplating over what to get next. The ECS, still being on par with machines currently available for sale, will be sent back for repairs and possible upgrades but is not likely to make it back to my own desk. The big decision was a tough one; Do I replace it with a simple, inexpensive portable machine, or do I look for another powerhouse machine that can participate in my more demanding tasks?
After too much time online checking out offerings from sites such as TigerDirect, Acer, HP, IBM, Compaq and AlienWare, I ended up settling on an Inspiron 9400 from Dell. I selected this machine based primarily on 2 factors. #1, if a machine performs poorly, I'm not likely to use it extensively. #2 Fast Support and Repair times.
With an Intel Centrino Duo-Core processor, 1GB DDR2 RAM, 256MB Video Card, 17" widescreen display and 3 years on-site support, I felt I had a machine that I would be able to push hard without becoming frustrated with poor performance, and if the system should fail me as machines have been known to, Dell will fix it fast.
Today I am already pleased with the choice. I've been working on this machine extensively since it arrived, and I can say the performance has been exactly as expected. The 17" display has even allowed me to handle some of my tasks more efficiently, as I can easily fit IE and Firefox side-by-side and browse and compare information from 2 sites at once. This is very handy for many of my daily tasks including programming, systems administration, billing etc.
So far the Dell experience has also been quite good. While there does seem to be a problem with their online order tracking system that didn't get resolved during the 10 days I was waiting for my system to arrive, the phone support was superb. When the machine arrived with a French keyboard, Dell made arrangements to have a replacement sent. In the meantime, I set the system to think the keyboard is US style, so as long as I don't look at the keys while typing, nothing seems out of place. Once the new machine arrives, all I must do is eject the hard disk from the French machine, insert it into the replacement, and I should be all set.
Without going overboard, I believe I've been able to balance my need for performance, portability, reliability and cost effectively. I suppose time will tell.
A few days ago I received my new laptop. A new laptop is a good opportunity for reviewing and changing my selection of software, and the first thing on my list has been my Email software.
For the last few years, I've been using an email client called "The Bat", provided by "Ritlabs". When I first started using this software, I was switching away from PMMail. I had selected PMMail as an alternative to Netscape Communicator a couple of years before when I was looking for software that could help me organize my huge volumes of email. Over time I outgrew PMMail, and began searching again for an alternative. A colleague introduced me to The Bat, which provided me with a powerful alternative and an array of easy-to-use filtering tools that I needed. Today, I find my email client needs changing, and while I would prefer to use an entirely web-based solution, I haven't found one that has the same features I get from a PC based email client.
Having come to respect and use Mozilla Firefox as an equal to Internet Explorer for website browsing, I have been considering trying out Thunderbird. Downloading and installing the software was simple; the only real issues I faced were exporting the 30,000 messages from The Bat and setting up my filtering.
Fortunately, unlike some of the less friendly email programs around, The Bat does provide the ability to export messages in a number of formats, including the UNIX mbox format used by Thunderbird. Once I got that sorted out and had Thunderbird ready to use, the testing began.
Over the last 3 days, I can say that it's been performing well. It has a very simple and seemingly very effective anti-spam system built in. After only a day of marking incoming spam as "junk", I have had only a single false-positive which was quickly resolved by checking out the "SPAM" folder I had created. My assortment of inboxes seems to be pine-sol fresh most of the time. The message filtering tools, while not nearly as powerful as those found in The Bat, are flexible enough that I've been able to get most things working suitably.
It hasn't taken much effort to find a couple of bugs; fortunately they haven't been critical or caused significant trouble. There seems to be a bug, which is actually one of the same bugs I found in The Bat, that causes the view settings for a folder to get stuck organizing the messages by name or status rather than my preferred "Date". Attempts to change this seem to fail on both programs. The other bug is that when selecting "View All" messages, I only get a portion of the messages. Using the "Search" tool gets past this easily.
Overall, considering that the software comes without any price tag, I think Thunderbird has a chance of staying on my machine as my new program of choice. I knew that selecting open source software would have some issues, but I have also seen what I've come to love about much of the open source software I use: flexibility. The software actually seems to be set up in a way that works well with me.
Only time will tell. Now back to installing software.
Hockey fans throughout North America are wondering what the future holds for what we often tout as the greatest game on earth. If you've been privileged enough to see any of the recent NHL games and you've been a hockey fan for more than a few years, chances are you've noticed the penalties have become dominating.
It's almost like a different game. When I watch hockey I am expecting to see the players skating hard, passing, checking and getting to the net and occasionally slipping one past the goalie. Hockey is known to be a fast sport, and from the times of The Great One, a great skater could stand out in the game. Over time, between a shift to defensive style play and various rule changes, the key to the game seems to be doing whatever you can to aggravate your opponent and cause him to take a penalty by retaliating. If you're able to play dirty, but just go far enough not to get a penalty yourself, you might just put your team into a Power Play, where your chances of scoring are often significantly greater.
There's no denying that hockey has a long history of being a tough physical game. Players often slam each other into the boards. There are frequent fights. Broken fists, wrists, jaws, lips aren't a daily event, but they happen often enough. What seems to have changed over the years is the dirtiness of it all. This dirtiness has resulted in efforts to reduce player injury (especially the serious ones), which equates to 2 referees rather than the original 1, and a lot more penalties.
While one might hope that taking a hard stance on the dirty play might result in fewer injuries and a better overall game, the effect hasn't been welcome. Fans are constantly bickering that the refs are winning or losing the games for the teams, and the amount of 5 on 5 play has dropped.
I would love to see a game dominated by the 5 on 5 play again. Perhaps the solution is to continue awarding as many penalties as players deserve but keep the play at 5 on 5. Or maybe they need to further increase the size of the goal so that a power play more often results in a goal against; making it far more important to stay out of the penalty box. Maybe the players just need to fight it through and what happens, happens. Whatever your angle is, if you're a hockey fan you've probably formed some opinion on what needs to be done to get the game back on track.