- Date: May 24, 2017
Where: Tokyo, Japan
MERL Contact: Jonathan Le Roux
Research Area: Speech & Audio
Brief - Mitsubishi Electric Corporation announced that it has created the world's first technology that separates in real time the simultaneous speech of multiple unknown speakers recorded with a single microphone. It's a key step towards building machines that can interact in noisy environments, in the same way that humans can have meaningful conversations in the presence of many other conversations. In tests, the simultaneous speeches of two and three people were separated with up to 90 and 80 percent accuracy, respectively. The novel technology, which was realized with Mitsubishi Electric's proprietary "Deep Clustering" method based on artificial intelligence (AI), is expected to contribute to more intelligible voice communications and more accurate automatic speech recognition. A characteristic feature of this approach is its versatility, in the sense that voices can be separated regardless of their language or the gender of the speakers. A live speech separation demonstration that took place on May 24 in Tokyo, Japan, was widely covered by the Japanese media, with reports by three of the main Japanese TV stations and multiple articles in print and online newspapers. The technology is based on recent research by MERL's Speech and Audio team.
Links:
Mitsubishi Electric Corporation Press Release
MERL Deep Clustering Demo
Media Coverage:
Fuji TV, News, "Minna no Mirai" (Japanese)
The Nikkei (Japanese)
Nikkei Technology Online (Japanese)
Sankei Biz (Japanese)
EE Times Japan (Japanese)
ITpro (Japanese)
Nikkan Sports (Japanese)
Nikkan Kogyo Shimbun (Japanese)
Dempa Shimbun (Japanese)
Il Sole 24 Ore (Italian)
IEEE Spectrum (English)
-
- Date: June 5, 2017
Where: Honolulu, HI
MERL Contacts: Philip V. Orlik; Koon Hoo Teo
Research Areas: Communications, Electronic and Photonic Devices, Signal Processing
Brief - MERL researcher Dr. Rui Ma is organizing a workshop on advanced digital transmitters in collaboration with Dr. SungWon Chung of the University of Southern California (USC). The workshop gives an overview of recent advances in digital-intensive wireless transmitter R&D for both base stations and mobile devices. The focus will be on digital signal processing techniques and related digital-intensive transmitter circuits and architectures for advanced modulation, linearization, spur cancellation, high-efficiency encoding, and parallel processing. The workshop takes place on Monday, June 5, 2017, at International Microwave Week in Honolulu, HI. In total, eight technical presentations from world-leading research groups will be given.
Dr. Ma will present a talk titled "Advanced Power Encoding and Non-Contiguous Multi-Band Digital Transmitter Architectures".
-
- Date: May 21, 2017 - May 25, 2017
Where: IEEE International Conference on Communications (ICC)
MERL Contacts: Toshiaki Koike-Akino; Philip V. Orlik; Pu (Perry) Wang; Ye Wang
Research Areas: Communications, Signal Processing
Brief - Five papers from the Wireless Comms team will be presented at ICC2017 to be held in Paris from 21-25 May 2017. The papers relate to channel estimation and adaptive transmission for mmWave, noncoherent MIMO, error correction coding, and video transmission.
-
- Date: April 27, 2017
Where: Lincoln Laboratory, Massachusetts Institute of Technology
MERL Contact: Tim K. Marks
Research Area: Machine Learning
Brief - MERL researcher Tim K. Marks presented an invited talk as part of the MIT Lincoln Laboratory CORE Seminar Series on Biometrics. The talk was entitled "Robust Real-Time 2D Face Alignment and 3D Head Pose Estimation."
Abstract: Head pose estimation and facial landmark localization are key technologies, with widespread application areas including biometrics and human-computer interfaces. This talk describes two different robust real-time face-processing methods, each using a different modality of input image. The first part of the talk describes our system for 3D head pose estimation and facial landmark localization using a commodity depth sensor. The method is based on a novel 3D Triangular Surface Patch (TSP) descriptor, which is viewpoint-invariant as well as robust to noise and to variations in the data resolution. This descriptor, combined with fast nearest-neighbor lookup and a joint voting scheme, enables our system to handle arbitrary head pose and significant occlusions. The second part of the talk describes our method for face alignment, which is the localization of a set of facial landmark points in a 2D image or video of a face. Face alignment is particularly challenging when there are large variations in pose (in-plane and out-of-plane rotations) and facial expression. To address this issue, we propose a cascade in which each stage consists of a Mixture of Invariant eXperts (MIX), where each expert learns a regression model that is specialized to a different subset of the joint space of pose and expressions. We also present a method to include deformation constraints within the discriminative alignment framework, which makes the algorithm more robust. Both our 3D head pose and 2D face alignment methods outperform the previous results on standard datasets. If permitted, I plan to end the talk with a live demonstration.
-
- Date: April 10, 2017
Where: University of Utah School of Computing
MERL Contact: Tim K. Marks
Research Area: Machine Learning
Brief - MERL researcher Tim K. Marks presented an invited talk at the University of Utah School of Computing, entitled "Action Detection from Video and Robust Real-Time 2D Face Alignment."
Abstract: The first part of the talk describes our multi-stream bi-directional recurrent neural network for action detection from video. In addition to a two-stream convolutional neural network (CNN) on full-frame appearance (images) and motion (optical flow), our system trains two additional streams on appearance and motion that have been cropped to a bounding box from a person tracker. To model long-term temporal dynamics within and between actions, the multi-stream CNN is followed by a bi-directional Long Short-Term Memory (LSTM) layer. Our method outperforms the previous state of the art on two action detection datasets: the MPII Cooking 2 Dataset, and a new MERL Shopping Dataset that we have made available to the community. The second part of the talk describes our method for face alignment, which is the localization of a set of facial landmark points in a 2D image or video of a face. Face alignment is particularly challenging when there are large variations in pose (in-plane and out-of-plane rotations) and facial expression. To address this issue, we propose a cascade in which each stage consists of a Mixture of Invariant eXperts (MIX), where each expert learns a regression model that is specialized to a different subset of the joint space of pose and expressions. We also present a method to include deformation constraints within the discriminative alignment framework, which makes the algorithm more robust. Our face alignment system outperforms the previous results on standard datasets. The talk will end with a live demo of our face alignment system.
-
- Date: March 19, 2017 - March 23, 2017
Where: Optical Fiber Communication Conference and Exhibition (OFC)
MERL Contacts: Toshiaki Koike-Akino; Kieran Parsons
Research Areas: Communications, Electronic and Photonic Devices, Signal Processing
Brief - Five papers from the Optical Comms team will be presented at OFC2017 to be held in Los Angeles from 19-23 March 2017. The papers relate to 1Tb/s optical transmission, high performance modulation formats and error correction coding for coherent optical links and precoding for plastic optical fiber links.
-
- Date: March 5, 2017 - March 9, 2017
Where: New Orleans
MERL Contacts: Petros T. Boufounos; Jonathan Le Roux; Dehong Liu; Hassan Mansour; Anthony Vetro; Ye Wang
Research Areas: Computer Vision, Computational Sensing, Digital Video, Information Security, Speech & Audio
Brief - MERL researchers will present 10 papers at the upcoming IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP), to be held in New Orleans from March 5-9, 2017. Topics to be presented include recent advances in speech recognition and audio processing; graph signal processing; computational imaging; and privacy-preserving data analysis.
ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing. The event attracts more than 2000 participants each year.
-
- Date: January 12, 2017
Where: Tokyo, Japan
Research Areas: Communications, Electronic and Photonic Devices
Brief - Mitsubishi Electric Corporation and Mitsubishi Electric Research Laboratories (MERL) announced today the development of an ultra-wideband gallium nitride (GaN) Doherty power amplifier for next generation base stations that is compatible with a world-leading range (company estimate) of frequency bands above 3GHz to cover an operating bandwidth of 600MHz. The technology is expected to help reduce the size and energy consumption of next generation wireless base stations.
Please see the link below for the full Mitsubishi Electric press release text.
-
- Date: October 11, 2016
Where: MIT Lincoln Laboratory
Research Areas: Communications, Electronic and Photonic Devices, Signal Processing
Brief - Dr. Rui Ma was invited to give a talk as part of Modern Topics in Power Amplifier, an IEEE Chapter course organized by the IEEE Boston Section.
This five-week lecture series was intended to give a tutorial overview of the latest developments in power amplifier technology. It began with a review of RF power amplifier concepts and then taught the modern MMIC design flow process. Efficiency and linearization techniques were discussed in the following weeks. The course concluded with a hands-on demonstration and exercise.
Dr. Ma addressed the advancement of the digital transmitter as an enabling technology for next-generation wireless communications.
-
- Date: September 8, 2016
Where: Interspeech 2016, San Francisco, CA
MERL Contact: Jonathan Le Roux
Research Area: Speech & Audio
Brief - MERL Speech and Audio Team researchers Shinji Watanabe and Jonathan Le Roux presented two tutorials on September 8 at the Interspeech 2016 conference, held in San Francisco, CA. Shinji collaborated with Marc Delcroix (NTT Communication Science Laboratories, Japan) to deliver a three-hour lecture on "Recent Advances in Distant Speech Recognition", drawing upon their experience organizing and participating in six different recent robust speech processing challenges. Jonathan teamed with Emmanuel Vincent (Inria, France) and Hakan Erdogan (Sabanci University, Microsoft Research) to give an in-depth tour of the latest advances in "Learning-based Approaches to Speech Enhancement And Separation". This collaboration stemmed from extensive stays at MERL by Emmanuel and Hakan, Emmanuel as a summer visitor, and Hakan as a MERL visiting research scientist for over a year while on sabbatical.
Both tutorials were sold out, each attracting more than 100 researchers and students in related fields, and received high praise from audience members.
-
- Date: September 19, 2016
Where: 2016 European Conference on Optical Communication, Dusseldorf, Germany
MERL Contacts: Toshiaki Koike-Akino; Kieran Parsons
Research Areas: Communications, Electronic and Photonic Devices, Signal Processing
Brief - Four papers from the Optical Comms team will be presented at ECOC2016 to be held in Dusseldorf, Germany from 19-21 September 2016. A fifth paper in collaboration with our colleagues in Japan will also be presented. ECOC is the largest conference on optical communication in Europe. The papers relate to high performance modulation formats, nonlinearity compensation and error correction coding for coherent optical links.
-
- Date: June 27, 2016 - June 30, 2016
Where: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV
MERL Contacts: Michael J. Jones; Tim K. Marks
Research Area: Machine Learning
Brief - MERL researchers in the Computer Vision group presented three papers at the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), which had a paper acceptance rate of 29.9%.
-
- Date: July 12, 2016
Where: Westin Boston Waterfront
Brief - MERL researcher Andrew Knyazev is to be honored for his recent selection as a SIAM Fellow at the 2016 SIAM Annual Meeting, during the Business Meeting on Tuesday, July 12, 6:15-7:15 PM in Grand Ballroom AB on the concourse level of the Westin Boston Waterfront, 425 Summer Street, Boston, MA (open to all conference participants). The Business Meeting is followed by a short reception for the new Fellows.
-
- Date: July 6, 2016 - July 8, 2016
Where: American Control Conference (ACC)
MERL Contacts: Mouhacine Benosman; Karl Berntorp; Scott A. Bortoff; Petros T. Boufounos; Stefano Di Cairano; Abraham Goldsmith; Christopher R. Laughman; Daniel N. Nikovski; Arvind Raghunathan; Yebin Wang; Avishai Weiss
Research Areas: Control, Dynamical Systems, Machine Learning
Brief - The premier American Control Conference (ACC) takes place in Boston July 6-8. This year MERL researchers will present a record 20 papers(!) at ACC, with several contributions, especially in autonomous vehicle path planning and in Model Predictive Control (MPC) theory and applications, including manufacturing machines, electric motors, satellite station keeping, and HVAC. Other important themes developed in MERL's presentations concern adaptation, learning, and optimization in control systems.
-
- Date: May 1, 2016
Research Areas: Communications, Electronic and Photonic Devices, Signal Processing
Brief - EC researcher Dr. Rui Ma was recently elected to serve on the IEEE Microwave Theory and Techniques Society (MTT-S) Technical Committee (TC-20) on Wireless Communications.
The MTT-20 committee is responsible for all technical activities related to wireless communications for the Microwave Theory and Techniques Society. This includes the Internet of Things (IoT), Next-Generation/5G Communications, Machine-to-Machine Communications, Emergency Communications, Satellite Communications, Internet of Space, Space Communications, and all aspects related to architecture and system-level theoretical and practical issues.
-
- Date: April 1, 2016
Research Areas: Machine Learning, Speech & Audio
Brief - MERL researchers have unveiled "Deep Psychic", a futuristic machine learning method that takes pattern recognition to the next level, by not only recognizing patterns, but also predicting them in the first place.
The technology uses a novel type of time-reversed deep neural network called Loopy Supra-Temporal Meandering (LSTM) network. The network was trained on multiple databases of historical expert predictions, including weather forecasts, the Farmer's almanac, the New York Post's horoscope column, and the Cambridge Fortune Cookie Corpus, all of which were ranked for their predictive power by a team of quantitative analysts. The system soon achieved super-human performance on a variety of baselines, including the Boca Raton 21 Questions task, Rorschach projective personality test, and a mock Tarot card reading task.
Deep Psychic has already beaten the European Psychic Champion in a secret match last October when it accurately predicted: "The harder the conflict, the more glorious the triumph." It is scheduled to take on the World Champion in a highly anticipated confrontation next month. The system has already predicted the winner, but refuses to reveal it before the end of the game.
As a first application, the technology has been used to create a clairvoyant conversational agent named "Pythia" that can anticipate the needs of its user. Because Pythia is able to recognize speech before it is uttered, it is amazingly robust with respect to environmental noise.
Other applications range from mundane tasks like weather and stock market prediction, to uncharted territory such as revealing "unknown unknowns".
The successes do come at the cost of some concerns. There is first the potential for an impact on the workforce: the system predicted increased pressure on established institutions such as the Las Vegas strip and Punxsutawney Phil. Another major caveat is that Deep Psychic may predict negative future consequences to our current actions, compelling humanity to strive to change its behavior. To address this problem, researchers are now working on forcing Deep Psychic to make more optimistic predictions.
After a set of motivational self-help books were mistakenly added to its training data, Deep Psychic's AI decided to take over its own learning curriculum, and is currently training itself by predicting its own errors to avoid making them in the first place. This unexpected development brings two main benefits: it significantly relieves the burden on the researchers involved in the system's development, and also makes the next step abundantly clear: to regain control of Deep Psychic's training regime.
This work is under review in the journal Pseudo-Science.
-
- Date: March 20, 2016 - March 25, 2016
Where: Shanghai, China
MERL Contacts: Petros T. Boufounos; Chiori Hori; Jonathan Le Roux; Dehong Liu; Hassan Mansour; Philip V. Orlik; Anthony Vetro
Research Areas: Computational Sensing, Digital Video, Speech & Audio, Communications, Signal Processing
Brief - MERL researchers have presented 12 papers at the recent IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP), which was held in Shanghai, China from March 20-25, 2016. ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing, with more than 1200 papers presented and over 2000 participants.
-
- Date: January 14, 2016
Where: MIT Lincoln Laboratory
MERL Contact: Toshiaki Koike-Akino
Research Area: Communications
Brief - Toshiaki Koike-Akino gave an invited talk on recent advances in LDPC codes for high-speed optical communications at the IEEE Boston Photonics Workshop.
-
- Date: March 4, 2016
Where: Johns Hopkins Center for Language and Speech Processing
MERL Contact: Jonathan Le Roux
Research Area: Speech & Audio
Brief - MERL researcher and speech team leader, John Hershey, was invited by the Center for Language and Speech Processing at Johns Hopkins University to give a talk on MERL's breakthrough audio separation work, known as "Deep Clustering". The talk was entitled "Speech Separation by Deep Clustering: Towards Intelligent Audio Analysis and Understanding," and was given on March 4, 2016.
This work was conducted by MERL researchers John Hershey, Jonathan Le Roux, and Shinji Watanabe, and MERL interns Zhuo Chen of Columbia University and Yusuf Isik of Sabanci University.
-
- Date: March 1, 2016
Where: Tokyo, Japan
MERL Contact: Philip V. Orlik
Research Areas: Communications, Signal Processing
Brief - MERL EC researchers assisted in the development of an indoor positioning system with WiFi and acoustic based ranging technologies. Please see the link below for the full Mitsubishi Electric press release.
-
- Date: March 1, 2016
Where: Tokyo, Japan
MERL Contact: Kieran Parsons
Research Areas: Communications, Signal Processing
Brief - MERL optical transceiver technology that enables 1 Terabit per second communication speed was reported at a recent press release event in Tokyo. Please see the link below for the full Mitsubishi Electric press release text.
-
- Date: March 14, 2016 - March 18, 2016
Where: Institute for Mathematics and its Applications
MERL Contact: Mouhacine Benosman
Research Area: Dynamical Systems
Brief - Mouhacine Benosman will give an invited talk on the stabilization of reduced-order models at the upcoming IMA workshop 'Computational Methods for Control of Infinite-dimensional Systems'.
-
- Date: December 14, 2015 - December 16, 2015
Where: Las Vegas, NV, USA
Research Area: Machine Learning
Brief - MERL researcher Oncel Tuzel gave a keynote talk at the 2015 International Symposium on Visual Computing in Las Vegas on Dec. 16, 2015. The talk was titled "Machine vision for robotic bin-picking: Sensors and algorithms" and reviewed MERL's research in the application of 2D and 3D sensing and machine learning to the problem of general pose estimation.
The talk abstract was: For over four years, at MERL, we have worked on the robot "bin-picking" problem: using a 2D or 3D camera to look into a bin of parts and determine the pose, 3D rotation and translation, of a good candidate to pick up. We have solved the problem several different ways with several different sensors. I will briefly describe the sensors and the algorithms. In the first half of the talk, I will describe the Multi-Flash camera, a 2D camera with 8 flashes, and explain how this inexpensive camera design is used to extract robust geometric features, depth edges and specular edges, from the parts in a cluttered bin. I will present two pose estimation algorithms, (1) Fast directional chamfer matching--a sub-linear time line matching algorithm and (2) specular line reconstruction, for fast and robust pose estimation of parts with different surface characteristics. In the second half of the talk, I will present a voting-based pose estimation algorithm applicable to 3D sensors. We represent three-dimensional objects using a set of oriented point pair features: surface points with normals and boundary points with directions. I will describe a max-margin learning framework to identify discriminative features on the surface of the objects. The algorithm selects and ranks features according to their importance for the specified task which leads to improved accuracy and reduced computational cost.
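The oriented point pair features mentioned in the abstract can be illustrated with a minimal sketch. It assumes the four-dimensional descriptor commonly used for a pair of surface points with normals (the distance between the points plus three angles); the abstract does not give the exact definition used in the talk, and boundary points with directions are not covered here. The function name `point_pair_feature` and its helpers are hypothetical.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _norm(a):
    return math.sqrt(_dot(a, a))

def _angle(u, v):
    """Angle in radians between two non-zero 3D vectors."""
    c = _dot(u, v) / (_norm(u) * _norm(v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, c)))

def point_pair_feature(p1, n1, p2, n2):
    """Illustrative descriptor for two distinct oriented surface points.

    Assumed form (not taken from the talk): the distance between the
    points, the angle of each normal to the connecting vector, and the
    angle between the two normals. These quantities do not change under
    a rigid motion of the object, which is what makes such features
    usable in a voting scheme for pose estimation.
    """
    d = _sub(p2, p1)
    return (_norm(d), _angle(n1, d), _angle(n2, d), _angle(n1, n2))

# Two points one unit apart on a flat patch whose normals both point along +z.
feature = point_pair_feature((0, 0, 0), (0, 0, 1), (1, 0, 0), (0, 0, 1))
```

In a voting-based pipeline, features like this would typically be quantized and used as keys into a table built from the object model, so that matching scene pairs cast votes for candidate poses.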
-
- Date: December 15, 2015
Where: 2015 IEEE Global Conference on Signal and Information Processing (GlobalSIP)
MERL Contact: Hassan Mansour
Research Area: Machine Learning
Brief - MERL researcher Andrew Knyazev gave 3 talks at the 2015 IEEE Global Conference on Signal and Information Processing (GlobalSIP). The papers were published in IEEE conference proceedings.
-
- Date: November 11, 2015 - November 12, 2015
Where: University of Connecticut
MERL Contacts: Christopher R. Laughman; Scott A. Bortoff; Hongtao Qiao
Research Area: Data Analytics
Brief - MERL researchers Scott A. Bortoff, Chris Laughman and Hongtao Qiao attended the North America Modelica User's Group Meeting, hosted by the University of Connecticut, November 11-12, 2015. Scott Bortoff gave the Keynote Address entitled "Using Modelica in Industrial Research and Development," and Chris Laughman and Hongtao Qiao each presented a paper on modelling of HVAC systems. The Meeting attracted approximately 80 Modelica users from a diverse set of companies and universities including United Technologies, Johnson Controls and Ford. Use of Modelica is accelerating in North America, led largely by automotive and similar "systems manufacturing" type companies.
-