Contributions
This page details the impact of MERL on MELCO's business in four areas: product features, system components, licensing, and standards contributions. In each of these areas, there is continuing revenue from MERL technology that had its initial impact in previous years.
A dream of MERL is to create a new high-volume product for MELCO. We have not yet achieved this, but we have contributed important new features to a number of products.
A large part of MELCO's business is in the form of large custom systems for business or government. MERL has contributed components to a number of such systems.
Another way that MERL can impact MELCO is by contributing to standards. This may or may not lead to direct revenue via licensing. In either case, it allows MELCO to keep closely in touch with important standards and to shape these standards for maximum benefit to MELCO.
A final way that MERL can impact MELCO is by licensing MERL IP to third parties and obtaining direct revenue as a result.
The following subsections detail what MERL's impact on MELCO has been and summarize how this impact was achieved. Notably, there are several distinct models of how impact can be achieved, ranging from work specifically requested by MELCO to finding an application in MELCO for a technology developed independently by MERL.
Recommendation Engine for Diaprism
In August 2003, the Information Systems business unit (ISHon) began selling recommendation software as a companion product of its Diaprism commercial database hardware product. The heart of the software is a recommendation algorithm developed by MERL that can make predictions (e.g., of customer interests) based on information in a database. The recommendation software is only a small part of the Diaprism product line, but such new features are essential to keep the product line fresh.
Data modeling, data mining, optimization and control are key aspects of MERL's research. We expect to contribute to MELCO business in a number of ways in this area. Current MERL work includes: dimensionality reduction for data modeling (page 118), various kinds of data mining (page 120), optimization for elevator control (page 119), and man-machine collaborative optimization (page 121).
Details:
In 2001, an MRL researcher working on fundamental questions in computer vision developed a new way to do Singular Value Decomposition (SVD). SVD is used to analyze the fundamental structure of high dimensional data. Suppose that a database contains 1,000 100x100-pixel pictures of faces and you want to know what features of these pictures (e.g., the shape of the face or the size of the nose) are the best ways to tell one picture from another. (This is a question that is fundamental to good face recognition systems.) You can view this data as a 10,000 by 1,000 matrix of intensity values (one row for each pixel and one column for each picture). Using SVD you can determine, for instance, what the six most important features in the data are.
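As a rough illustration of the idea, the sketch below finds only the single most important feature (the dominant singular value and its vectors) of a tiny pixel-by-picture matrix, using power iteration. It is plain Python written for this page and is not MERL code; the function name top_singular_triplet, the toy data and the iteration count are assumptions chosen for illustration.

```python
import math

def top_singular_triplet(a, iters=100):
    """Estimate the largest singular value of `a` (a list of rows) and its
    left/right singular vectors by power iteration on A^T A."""
    rows, cols = len(a), len(a[0])
    v = [1.0 / math.sqrt(cols)] * cols            # arbitrary starting direction
    for _ in range(iters):
        u = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]  # u = A v
        w = [sum(a[i][j] * u[i] for i in range(rows)) for j in range(cols)]  # w = A^T u
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:                           # start vector was in the null space of A
            break
        v = [x / norm for x in w]
    u = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    if sigma > 0.0:
        u = [x / sigma for x in u]
    return sigma, u, v

# Toy data: 4 "pixels" (rows) by 3 "pictures" (columns). Every picture is a
# scaled copy of one intensity pattern, so a single feature explains the data.
pattern = [1.0, 2.0, 3.0, 4.0]
scales = [1.0, 1.0, 2.0]
matrix = [[p * s for s in scales] for p in pattern]

sigma, pixel_feature, picture_weights = top_singular_triplet(matrix)
```

Here pixel_feature is the intensity pattern (normalized to unit length) and picture_weights says how strongly each picture expresses it. A full SVD would return further features in decreasing order of importance; for the face example in the text the matrix would have 10,000 rows and 1,000 columns.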
SVD is extremely powerful and is used in many applications. However, as traditionally calculated, it has two problems. First, it is quite costly to compute on the kinds of huge matrices it is usually applied to, and it must be entirely recomputed whenever any change is made in the matrix (e.g., when a new face is added to, or removed from, the database above). Second, SVD requires all the data to be present in the matrix. This is often not the case due to incomplete data collection. When there is missing data, it has to be imputed (estimated) by some method before SVD can be applied. This process can also be costly and can introduce errors.
Faced with these problems in the application he was attacking, the MRL researcher invented a new approach to SVD called Incremental Imputative SVD (IISVD). IISVD deals with incomplete data in an efficient and elegant way and can incorporate new data incrementally without having to reprocess the entire matrix. This latter feature leads to a dramatic reduction in computation time in a variety of well-known SVD applications.
The researcher immediately began to look for applications where IISVD would have the greatest benefit and began to look at the problem of making recommendations. As an example, he began to experiment with some data about movie preferences that had been collected by the University of Wisconsin. This data specifies for thousands of people and thousands of movies, some information about which movies each person likes and dislikes. This data is extremely sparse, because information only exists for a few dozen movies for each person. The job of a recommendation engine is to fill in the missing data and estimate how people would rate movies they have not yet seen. The highly rated movies can then be recommended to them.
SVD does a reasonable job with this task, but slowly, due to the enormous amount of data that must be processed whenever new information is added to the database. (A real application might have to deal with hundreds of thousands of people.) Using IISVD, good results can be obtained so rapidly that new people or preferences can be added and recommendations can be updated almost instantly.
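To make the recommendation task concrete, the sketch below fills in missing ratings with a much simpler stand-in for IISVD: a rank-1 model fitted by alternating least squares over the observed entries only. It is not MERL's algorithm and does nothing incrementally. The function names fit_rank1 and predict and the tiny ratings table are invented for illustration.

```python
def fit_rank1(ratings, n_users, n_items, iters=100):
    """Fit ratings[(user, item)] ~ p[user] * q[item] by alternating least
    squares over the observed entries only (missing entries are ignored)."""
    p = [1.0] * n_users
    q = [1.0] * n_items
    for _ in range(iters):
        for u in range(n_users):                  # refit each user factor
            num = sum(r * q[i] for (uu, i), r in ratings.items() if uu == u)
            den = sum(q[i] ** 2 for (uu, i) in ratings if uu == u)
            if den > 0.0:
                p[u] = num / den
        for i in range(n_items):                  # refit each item factor
            num = sum(r * p[u] for (u, ii), r in ratings.items() if ii == i)
            den = sum(p[u] ** 2 for (u, ii) in ratings if ii == i)
            if den > 0.0:
                q[i] = num / den
    return p, q

def predict(p, q, user, item):
    """Estimated rating for a (user, item) pair that may never have been observed."""
    return p[user] * q[item]

# Sparse toy table: 3 people, 3 movies, only 6 of the 9 ratings are known.
observed = {
    (0, 0): 2.0, (0, 1): 1.0,
    (1, 0): 4.0, (1, 2): 4.0,
    (2, 1): 1.0, (2, 2): 2.0,
}
p, q = fit_rank1(observed, n_users=3, n_items=3)
estimate = predict(p, q, user=0, item=2)   # a movie person 0 has not rated
```

The known ratings in this toy table are consistent with a single underlying taste pattern, so the fitted model reproduces them and supplies estimates for the three missing cells. Note that this batch fit has to be rerun from scratch when a rating is added, which is exactly the cost that an incremental method avoids.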
A prototype movie recommender was implemented and shown to a number of people in Japan in the fall of 2001. It was seen by a researcher at MELCO's Johosoken laboratory who made the connection to ISHon's Diaprism database product line. At the request of Johosoken, MERL upgraded the prototype, adding a number of new features. Based on the revised prototype, Johosoken created code for inclusion in the Diaprism product line. This code was beta tested with a customer in the fall of 2002 and became a MELCO product in 2003.
Road Recognition in the "Heli-Tele" System
In the fall of 2003, the Japanese government signed a contract with MELCO's Kobe works for the delivery of a system called "Heli-Tele". The system uses a MERL component to automatically locate roads in aerial photographs taken by helicopters. This information is then used to align the photographs with pre-existing maps, so that they can be shown as overlays on these maps. MERL's technology is only a small part of the system as a whole; however, it provides a key part of the functionality.
Computer-vision-based image analysis is a major area of research at MERL. This has led to a wide range of technologies including tracking (page 54), face detection (page 58), face recognition (page 56), face-based surveillance video analysis (page 55), and human activity detection (page 51).
Details: One of MELCO's many businesses is the design and construction of earth observation satellites. Knowing that MELCO is interested in producing the software systems used for analyzing satellite data, MERL has investigated how its technology might be used in such systems. One such investigation was done by an MTL researcher in 2002. Extending his earlier graduate work, he built a prototype system for identifying roads in satellite images.
This prototype was shown to various people in MELCO in late 2002 and early 2003. It did not generate much interest from people working on satellite data, but it was seen by a researcher at MELCO's Johosoken laboratory who was working with the Kobe works in an effort to obtain the contract for the Heli-Tele system.
In the Heli-Tele system, helicopters with cameras are used to survey an area, e.g., in a disaster situation. The helicopters are equipped with GPS units and can report the approximate orientation and position (within 20 meters) from which each image is taken. This allows approximate alignment of the images with map data. However, the GPS data is not enough to get exact alignment. Upon seeing the MERL demonstration, the Johosoken researcher had the idea that exact alignment could be achieved by matching roads recognized in an image with the roads in the map data. To test this idea, he asked MERL to adapt its prototype to aerial images instead of satellite images.
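The alignment idea can be sketched in a few lines. The example below assumes the only error is a translation of up to 20 m on a 1 m grid (rotation and scale are ignored) and simply tries every shift, keeping the one that places the most detected road points on map roads. The function name best_offset and all of the data are invented for illustration; this is not the Heli-Tele implementation.

```python
def best_offset(detected, map_points, max_shift=20):
    """Try every integer shift within +/- max_shift and return the (dx, dy)
    that moves the most detected road points onto known map road points."""
    road = set(map_points)
    best, best_hits = (0, 0), -1
    for dx in range(-max_shift, max_shift + 1):
        for dy in range(-max_shift, max_shift + 1):
            hits = sum((x + dx, y + dy) in road for x, y in detected)
            if hits > best_hits:
                best, best_hits = (dx, dy), hits
    return best

# Toy map on a 1 m grid: an east-west road crossing a north-south road.
map_points = [(x, 0) for x in range(0, 41)] + [(10, y) for y in range(1, 41)]

# Roads "seen" in one image, displaced by a made-up GPS error of (-6, +3) m.
true_visible = [(x, 0) for x in range(5, 26)] + [(10, y) for y in range(1, 16)]
detected = [(x - 6, y + 3) for x, y in true_visible]

correction = best_offset(detected, map_points)   # shift that undoes the GPS error
```

The example uses two crossing roads on purpose: a single straight road would match equally well at any shift along its own direction, so it could not pin down the position.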
It turns out that the technology needed to recognize roads many pixels wide in aerial images is entirely different from the technology needed to recognize roads a fraction of a pixel wide in satellite images; however, the MTL researcher knew how to do this task equally well and produced a prototype system in the summer of 2003. This prototype was used in demonstrations that were critical to MELCO's efforts to win the Heli-Tele contract and was delivered as part of an initial version of the system in January 2004. MERL is currently working on an improved road recognition module, which will be delivered as part of the final system in August of 2004.
MPEG-2/4 Transcoding
In February 2004, MELCO's Koriyama works began shipping a PC-based product (the BC-5600) for converting MPEG-2 encoded video into MPEG-4 encoded video. The prime purpose of the system is to take multiple MPEG-2 streams from surveillance cameras and convert (transcode) them into compact, low-resolution MPEG-4 streams so they can be communicated effectively over the Internet. The transcoding is done entirely in software using a module developed jointly by MERL and Johosoken that incorporates software designed and written by MERL (page 88).
The BC-5600 is just one of what we expect will be several applications of MERL's transcoding module in MELCO. Another business unit is planning to use MERL's transcoding in a surveillance-related product in the coming year. In addition, we are discussing a number of potential applications in the area of video storage and transmission for entertainment.
Details: The standard approach to transcoding is to use decoding hardware to decode the source video and then use encoding hardware to create output in the target encoding. However, this is both costly and inflexible due to the need for special purpose hardware.
In 1999, a pair of MTL researchers began a project on software-only transcoding, based on pioneering work done by one of them several years earlier. In 2000, this work resulted in key methods for doing spatial resolution reduction (making a picture with a small number of pixels from one with a large number of pixels) for MPEG-2 and MPEG-4. These methods are fast in part because they operate directly on the compressed form of the video without having to decompress it. This work was considered groundbreaking by the scientific community and led to several awards.
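The text does not describe MERL's methods in detail, so the sketch below only illustrates the general principle that some resolution reduction can be read straight from transform coefficients. MPEG-2 stores pictures as 8x8 DCT blocks, and the first (DC) coefficient of a block is proportional to the block's average brightness, so an 8:1 reduced picture can be formed from DC coefficients alone. The code ignores quantization and motion compensation, and the function name dct_coefficient and the test block are assumptions made for illustration.

```python
import math

N = 8  # MPEG-2 applies its DCT to 8x8 blocks

def dct_coefficient(block, u, v):
    """Orthonormal 2-D DCT-II coefficient (u, v) of an N x N block."""
    cu = math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)
    cv = math.sqrt(1.0 / N) if v == 0 else math.sqrt(2.0 / N)
    total = 0.0
    for x in range(N):
        for y in range(N):
            total += (block[x][y]
                      * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                      * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
    return cu * cv * total

# An arbitrary 8x8 block of pixel intensities.
block = [[(3 * x + 5 * y) % 256 for y in range(N)] for x in range(N)]

dc = dct_coefficient(block, 0, 0)
mean_from_dc = dc / N                                  # one output pixel, no inverse DCT
mean_from_pixels = sum(map(sum, block)) / (N * N)      # the same value via full decoding
```

A transcoder working this way never has to reconstruct the 64 pixels of the block. Reductions by smaller factors, such as 2:1, need more than the DC coefficient and are not shown here.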
The work also led to a practical PC-based software-only transcoder for converting high resolution MPEG-2 into lower resolution MPEG-4. (The methods of compression used by MPEG-2 and MPEG-4 are very similar. However, MPEG-4 is more flexible and allows for various low resolution and/or low frame rate video encodings.) This transcoder was presented to various people in MELCO in late 2000 and early 2001 and generated considerable interest.
In recent years, many large surveillance systems (from MELCO and others) have been built around cameras that produce MPEG-2 encoded output. This is highly effective as a means for communicating video over a dedicated digital network. However, the large bit-rate needed for MPEG-2 means that it is not practical to send the video out over the Internet for remote monitoring. For remote viewing, one would like to have lower resolution video in MPEG-4; however, MELCO was hesitant to go in this direction because hardware transcoding is expensive and cannot conveniently deal with the multiple, varying frame rate video streams one typically encounters in surveillance situations. This particular need for a superior method of transcoding was not known to MERL, but it was known to key groups in MELCO, which became interested in MERL's transcoder as soon as they heard about it.
At the request of MELCO, MERL focused its transcoder work on surveillance-related applications starting in 2002 and produced a product-ready software module in 2003. At MELCO's further request, we are currently maintaining this module and investigating ways to extend it to other compression standards, other kinds of computational hardware, and other applications.
Link Quality Adaptation in ZigBee
In March 2004, MERL's contribution "Link-Quality-Indicator-Based Routing Protocol" was included in draft v0.8 of the ZigBee standard (page 69). This is a fundamental contribution to the standard specifying how to implicitly assemble a group of ZigBee nodes into a network and how to route messages over this network.
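The general idea of routing on link quality, rather than on hop count alone, can be sketched as a shortest-path search in which poor links are expensive. The example below is a centralized toy, not the protocol in the ZigBee draft: the cost rule (1 divided by a quality value between 0 and 1), the function name best_route and the three-node network are all assumptions made for illustration.

```python
import heapq

def best_route(links, source, target):
    """Dijkstra's shortest path where each link costs 1 / quality, so that
    weak links are penalised. `links` maps (a, b) -> quality in (0, 1]."""
    graph = {}
    for (a, b), quality in links.items():
        cost = 1.0 / quality
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))   # links assumed symmetric
    queue = [(0.0, source, [source])]
    done = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return path, cost
        if node in done:
            continue
        done.add(node)
        for neighbour, step in graph.get(node, []):
            if neighbour not in done:
                heapq.heappush(queue, (cost + step, neighbour, path + [neighbour]))
    return None, float("inf")

# A can reach C directly over a poor link, or via B over two good links.
links = {("A", "C"): 0.2, ("A", "B"): 0.9, ("B", "C"): 0.9}
path, cost = best_route(links, "A", "C")
```

With these made-up numbers the two-hop route through B is chosen, because two good links cost less in total than one poor link. A hop-count metric would have picked the direct link instead.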
MERL continues to work actively on the ZigBee standard and ZigBee applications. This includes further proposals to the ZigBee standard (pages 70-72), work on an SCP/ZigBee bridge (page 78), and applications of ZigBee to sensor networks.
Details: MELCO was a founding member of the ZigBee Alliance in 2002. The goal of this alliance is to create a standard for low cost, medium data rate (20-200 Kbits/sec), short-range (30 meters) wireless communication over ad hoc networks. The prime target is monitoring and control applications in home automation; however, there are industrial and sensor network applications as well. A basic motivation is to produce something that is cheaper and better than Bluetooth.
ZigBee is relevant to many product lines within MELCO including home appliances, home entertainment, and industrial systems. However, the primary initial force behind MELCO's support of the alliance was the Semiconductor business unit (Hanpon), whose goal is to make ZigBee interface chips. This goal remains strong in Renesas Technology Corporation, which was formed by the merger of most of MELCO's and Hitachi's semiconductor businesses in 2003.
MTL has a long history of involvement in home networking including work with standards like IEEE 1394, HAVi, HomeAPI and SCP. MTL also has a long history of working with Hanpon/Renesas. This included working with Hanpon on an SCP/PLC interface chip in 2001-03 (page 77).
As a result, it was only natural that Hanpon turned to MERL for help when Hanpon joined with other companies to initiate the ZigBee standard. MERL jumped into ZigBee standard activities at the end of 2002 and was able to make proposals shortly thereafter. Because MERL was there at the right time with the right knowledge, we were able to get our contribution accepted as part of the fundamental basis of ZigBee.
Point-Based Rendering in MPEG-4
In March 2004, MERL's contribution "Point based Rendering for MPEG-4 AFX" (page 87) was included in the draft of the Animation Framework eXtension (AFX) which is part of the Synthetic Natural Hybrid Coding (SNHC) section of the MPEG-4 standard. MPEG-4 has long included the ability to support polygon graphics. MERL's contribution adds support for point-based graphics.
In addition to working on point-based graphics in the context of the MPEG-4 standard, MERL is participating in the MPEG 3D (Multi-View Video Coding) discussion, which may mature into a standard. MRL is also working on technology for live 3D TV (page 86). In addition, key technology for capturing point-based graphics data (page 57) is being used as a basis for 3D face recognition (page 56).
Details: In the fall of 1999, an MRL researcher began work on point-based graphics in collaboration with a researcher at the Swiss Federal Institute of Technology (ETH) in Zurich. This led to major papers on rendering images based on point-based graphics at the SIGGRAPH conferences in 2000 and 2001, which are some of the most cited papers in this new subfield of graphics. MERL's focus then switched to scanning technologies for obtaining point-based graphics data from real objects. This led to further groundbreaking work in 2001-2003.
To date, most graphics is polygon-based. The 3D shape of an object is modeled as surfaces created by combining large numbers of (usually very small) triangles edge to edge. The appearance of the object is modeled by applying (usually very small) "textures" (image segments) to the faces of the triangles. An image of the object from a particular viewpoint is created by determining which location(s) on which textured triangle(s) correspond to each pixel in the desired image.
In contrast, point-based graphics models an object as a cloud of points (analogous to pixels in a 2D image), each of which has a 3D position and a color. An image from a particular viewpoint is created by determining which point(s) correspond to each pixel in the desired image. The key potential advantages of point-based graphics are that the data can be captured directly from a real object, in analogy with taking a photograph, and, it is hoped, that processing could be faster because the data structures involved are simpler and more uniform.
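A minimal sketch of the point-based idea follows. It assumes the simplest possible camera (looking straight down the z axis, one point per pixel at most, no splatting or hole filling) and uses single letters as colors. The function name render and the four sample points are invented for illustration and do not come from MERL's renderer.

```python
def render(points, width, height, background="."):
    """Orthographic view down the z axis: each point lands on pixel (x, y)
    and the point nearest the viewer (smallest z) wins."""
    image = [[background] * width for _ in range(height)]
    depth = [[float("inf")] * width for _ in range(height)]
    for x, y, z, colour in points:
        col, row = int(round(x)), int(round(y))
        if 0 <= col < width and 0 <= row < height and z < depth[row][col]:
            depth[row][col] = z
            image[row][col] = colour
    return ["".join(row) for row in image]

# Each point is (x, y, z, colour); there are no triangles or textures.
points = [
    (1.0, 1.0, 5.0, "R"),   # far red point
    (1.0, 1.0, 2.0, "G"),   # nearer green point that hides the red one
    (3.0, 0.0, 4.0, "B"),
    (9.0, 9.0, 1.0, "W"),   # outside the 4x3 image, so it is skipped
]
picture = render(points, width=4, height=3)
```

The whole scene is one flat list of identical records, which is the uniformity the text refers to. A practical renderer would also have to fill the gaps between points when the object is viewed up close.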
The work on point-based graphics at MERL was initially spurred by the desire of the semiconductor business unit (Hanpon) to get into the business of 3D graphics rendering chips. However, Hanpon eventually decided not to move in that direction. As a result, the MRL researcher looked for other ways that MERL's point-based graphics technology could benefit MELCO.
In collaboration with MTL researchers, he introduced these ideas into the MPEG 3D discussion in early 2003. Later that year, they introduced the ideas into MPEG-4, leading to the accepted contribution noted above. In support of the contribution, reference code was implemented and delivered to MPEG-4 in the summer of 2004.
Face Detection in the D-506i Cell Phone
In May 2004, MELCO started to ship a new camera-equipped NTT DoCoMo cell phone (model D-506i). This phone uses a fast face detection algorithm from MERL. When the user takes a picture with the phone, the face detector can be used to automatically generate thumbnail images of the faces in the picture. These can be stored as part of address book entries. Using caller ID, the picture of a person pops up when he or she calls. This is only one small feature of the phone, but it is an advertised point of differentiation between the D-506i and phones from other manufacturers.
The D-506i is just one of what we hope will be many applications of fast face detection to MELCO business. In particular, the face detector and the basic approach underlying it have become a key foundation for a significant part of MERL's computer vision efforts. This includes extending the approach to the temporal domain to recognize actions in video (page 51), a fast and accurate method of face recognition (page 58) and analysis of surveillance data based on the faces shown in it (page 55).
Details: In late 2000, MRL embarked on an effort to apply its computer vision expertise to the observation of people in video, with the goal of contributing to MELCO business in the areas of surveillance and access control. This led in 2001 to the staffing of a computer vision application group in MTL as a complement to the existing research group in MRL. In 2002 it led to a major 3-year project on "Computer Human Observation".
The development of MERL's face detection algorithm began in 2001 as a collaboration between a newly hired researcher at MRL and a newly hired researcher at MTL based on work they had done in the previous year. Before their work, there were a number of accurate methods for finding faces in images, but they were computationally expensive and therefore slow. The key advance of their work was combining high accuracy on full frontal faces with by far the world's fastest speed.
The algorithm operates in two parts. First, computation-intensive machine learning techniques operating on large amounts of data are used to 'train' a classifier that can determine whether or not a particular part of an image is a face. Second, the trained classifier is applied to images. The image features used by the classifier are extremely simple, making it possible to evaluate the classifier very rapidly using a lightweight program. This allows real-time location of faces in video.
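The text does not say which image features are used. As an assumed example of how a very simple feature can be evaluated quickly, the sketch below uses one widely known family: differences between the sums of adjacent rectangles, computed from a running-sum table (an integral image) so that any rectangle costs four lookups regardless of its size. The function names and the toy image are invented for illustration, and the training step is not shown.

```python
def integral_image(img):
    """ii[r][c] = sum of img over rows < r and columns < c
    (the table has one extra row and column of zeros)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of any rectangle from four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def two_rect_feature(ii, top, left, height, width):
    """Brightness of the right half minus the left half of a window
    (width is assumed to be even)."""
    half = width // 2
    return (rect_sum(ii, top, left + half, height, half)
            - rect_sum(ii, top, left, height, half))

# Toy 3x4 image: dark on the left, bright on the right.
img = [
    [1, 1, 9, 9],
    [1, 1, 9, 9],
    [2, 2, 8, 8],
]
ii = integral_image(img)
feature = two_rect_feature(ii, 0, 0, 3, 4)
```

A trained classifier of this kind would compare many such feature values against learned thresholds for each candidate window. Because the table is built once per image, the per-window work stays small, which is what makes a lightweight evaluation program plausible on a phone.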
The initial version of the face detector was demonstrated to MELCO in the summer of 2001. In the ensuing year, the algorithm was redesigned, fundamentally improving the operation of both the classifier and the training algorithm. The system was also extended to operate on profile faces.
This improved face detector was demonstrated to MELCO in the summer of 2002. One of the people who saw this demonstration was a researcher at MELCO's Johosoken laboratory. He became interested in the detector and began to experiment with it.
The Johosoken researcher made the initial connection to the cell phone business unit. After discussion with MERL, he reimplemented MERL's classifier evaluation code for the cell phone platform and created a prototype application. His code, combined with a classifier trained by MERL, supports the D-506i.
The Renesas SCP/PLC Chip
In May 2004, Renesas Technology Corporation began production of the M603S, a single-chip Power Line Communication (PLC) IC for cost-effective smart home networking. The chip combines ITRAN Communications Ltd's IT800 power line modem and Renesas's M16C microprocessor. The M16C supports Microsoft's UPnP-compatible Simple Control Protocol (SCP) by means of microcode provided by MERL (page 77).
MERL continues involvement in a range of projects related to home networking. This includes work on an SCP/ZigBee bridge (page 78), contributions to the ZigBee standard (pages 70-72) and work on broadband power line communications for HomePlug Audio/Video.
Details: One promising approach to low data rate networking of control information inside homes is to use PLC over existing electrical wiring. Microsoft's SCP is a lightweight alternative to the Internet Protocol (IP) for power-line communication in support of Microsoft's Universal Plug and Play (UPnP) home networking standard.
Motivated by the prospect of high volume sales of interface chips, interest in various kinds of home networking by MELCO's semiconductor business unit (Hanpon) goes back many years. This interest continues unabated following the merger of most of MELCO's semiconductor operations with most of Hitachi's semiconductor operations to form Renesas in 2003.
MTL has a long history of working with Hanpon/Renesas. One of the labs that were eventually merged into MTL was founded in 1993 to support a digital TV chip-set project. MTL also has a long history of involvement in home networking including work with standards like IEEE 1394, HAVi and HomeAPI. This included working with Hanpon on an interface chip for the Japanese PLC standard Echonet in 2000.
As a result, it was only natural that Hanpon turned to MERL for help when it put together a project with Microsoft and ITRAN to create an SCP/PLC chip. In 2001, MERL experimented with SCP at Hanpon's request. In 2002-03, MERL created the microcode for the M603S at Hanpon/Renesas's request. This was done in close collaboration with a semiconductor lab in Japan, which did all the hardware design of the chip, and with Microsoft, which provided the highest-level code supporting SCP. MERL wrote the main body of microcode, which provides the support environment expected by Microsoft's code.
Symbol Spreading in MBOA Standard
In May 2004, MERL's contribution "Symbol Spreading Technique" was included in draft v0.8 of the Multi-Band OFDM Alliance (MBOA) standard for Ultra Wide Band (UWB) wireless communication. By using multiple OFDM subcarriers to encode each symbol, MERL's technology modifies the MBOA system to exploit the frequency diversity inherent in a UWB channel thereby reducing transmission errors caused by interference and fading (page 67).
MERL's Symbol Spreading work is just one of a range of efforts at MERL on wireless communications in general (pages 66-78) and UWB communications in particular (pages 66-67). A particular focus of our work is on contributions to standards.
Details: UWB was developed about 15 years ago, mainly for military purposes. The prime attraction of UWB to the military is that it is a spread-spectrum system with a large spreading factor. This results in a low power spectral density (which makes it difficult to detect) and a high interference rejection capability (which makes it almost impossible to jam). Furthermore, the different frequency components inherent in a UWB signal provide a high frequency diversity and thus high reliability.
Recently, UWB has been approved for civilian use in the USA at extremely low power levels. At these levels, UWB can either support very high data rates (> 100Mbit/s) for short range communications (< 10m), or low data rates for longer ranges. High data rate communications are attractive for so-called Personal Area Networks (PANs), e.g., for communicating high data rate information such as video within a home or office.
In late 2002, MERL began an initiative within MELCO on UWB. In collaboration with MELCO's Johosoken laboratory, MERL pursued a number of efforts including participating in the IEEE 802.15.3a PAN standard effort. The goal of 802.15.3a is short range (3-10 meter) high data rate (100-400 megabits per second) communication using UWB.
Two major groups have emerged within 802.15.3a. At MERL's suggestion, MELCO joined with six other companies in June 2003 to found one of these groups, the Multi-Band OFDM Alliance (MBOA), which has since grown to more than 150 members. (It is difficult at best to influence an important standard by going it alone. One needs allies.) MERL then focused its work within 802.15.3a on the MBOA proposal.
As sometimes happens, the 802.15.3a process became deadlocked due to a 60/40 split between the MBOA and the other major power group. (A 3/4 majority is needed for either approach to succeed.) As a result, the MBOA decided to form a special interest group by itself and put forward a standard before the end of 2004 along with compliant products. As noted above, MERL has been successful in making an impact on the MBOA standard.
The essence of the MBOA proposal is to divide the available spectrum into several subbands, each 500 MHz wide, and use high-speed pseudo-random switching between subbands to allow multiple simultaneous transmissions with minimal interference. Within the current subband, Orthogonal Frequency Division Multiplexing (OFDM) is used to communicate 100 streams of data simultaneously using 100 different subfrequencies. The frequencies are close together and interfere with each other, but the modulations used for the individual data streams are chosen to be mathematically 'orthogonal', so the data streams can nevertheless be reliably decoded by a receiver.
A problem with the above approach is that each individual OFDM subfrequency is subject to fading and interference from other sources. This can lead to errors in individual data streams even when other streams are communicated clearly. The essence of MERL's proposal is to mix the data streams together so that multiple subfrequencies are used for each data stream (and each subfrequency combines information from several data streams). This allows reliable communication of every data stream even if communication on a few subfrequencies is blocked.
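The benefit of mixing streams can be seen in a toy case with two data streams and two subfrequencies. The sketch below is not the MBOA technique itself, whose details are not given here. It assumes one-bit symbols (+1 or -1), a 2x2 rotation as the mixing rule with an arbitrarily chosen angle, a noise-free channel, and a receiver that knows which subfrequency was lost. The names spread and despread are invented for illustration.

```python
import itertools
import math

# Hypothetical mixing weights: a rotation whose two weights differ, so that
# every symbol pair gives a different value on each subfrequency.
THETA = math.atan(0.5)
A, B = math.cos(THETA), math.sin(THETA)

def spread(s0, s1):
    """Put a mixture of both symbols on each of the two subfrequencies."""
    return [A * s0 + B * s1, -B * s0 + A * s1]

def despread(received, usable):
    """Pick the symbol pair whose mixture best matches the subfrequencies
    that are still usable (exhaustive search over the four candidates)."""
    best_err, best_pair = None, None
    for pair in itertools.product((-1, 1), repeat=2):
        ref = spread(*pair)
        err = sum((received[k] - ref[k]) ** 2 for k in range(2) if usable[k])
        if best_err is None or err < best_err:
            best_err, best_pair = err, pair
    return best_pair

sent = (1, -1)
on_air = spread(*sent)
on_air[1] = 0.0                       # subfrequency 1 is wiped out by fading
decoded = despread(on_air, usable=[True, False])
```

Without mixing, losing subfrequency 1 would lose the second symbol entirely. With mixing, both symbols survive on the remaining subfrequency, at the price of smaller spacing between the possible received values and therefore less margin against noise.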
