Professor Morency

Employing video plethysmography for medical treatment and training

Marie Gethins
February 2016

The first facial recognition software was developed in the 1960s for security purposes, but only in recent years have researchers begun to explore its potential medical uses in diagnosis and in supporting treatment. Professor Louis-Philippe Morency, at Carnegie Mellon University’s School of Computer Science in Pittsburgh, Pennsylvania, US, is at the forefront of this evolving technology.

In daily social interaction, we subconsciously respond to non-verbal cues: a raised eyebrow, a smile, a frown, broken eye contact. In a medical setting, these cues can help a mental health professional to gauge patient response. Yet subtle cues can be difficult to pick up and assimilate. Over the past decade, research into facial expression and behaviour has produced databases linking physical movements to indications of mood. Building upon this research, innovators are hoping to create a system that can assist in mental health treatment.

Therapists often use patient responses to validated questionnaires to gauge treatment progress. Adding video plethysmography and computer analysis of the patient’s physical gestures as they respond to questions may provide additional insight or another supportive tool in mental health treatment.

Professor Louis-Philippe Morency’s team have developed a multi-modal algorithm that uses 68 facial points and other physical markers to analyse patient expression during counselling sessions. The idea of analysing patient physical behaviour evolved from a study that examined changes in patient vocal tone.


In a six-month, 60-patient study involving the University of Southern California and Cincinnati Children’s Hospital, Morency was part of a team that investigated speech patterns. Although the study was exploratory, the algorithm revealed significant differences in breathy voice qualities between the at-risk and control cohorts. The researchers concluded that video-based features would help contextualise responses and further refine the results. A second multi-centre study is ongoing.

Morency’s team then developed the MultiSense system. Audio and video captured with an Xbox Kinect, webcam, and microphone are analysed together by an algorithm that provides indicators to the mental health professional during the counselling session and a full analysis of the recording afterwards. Physical expression is compared against an established database of gestures, and the indicators continue to be refined as the database grows. The system also reports the patient’s progress over time by comparing that individual’s physical responses across several sessions.
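To make the idea of session-to-session comparison concrete, here is a minimal sketch in Python. It is not MultiSense's actual method, which the article does not describe. It assumes each session has already been reduced to a list of per-frame scores for one hypothetical indicator (for example, a smile intensity between 0 and 1); the function names and sample values are illustrative placeholders, not study data.

```python
from statistics import mean


def session_summary(frames):
    """Average a per-frame indicator over one session."""
    if not frames:
        raise ValueError("session has no frames")
    return mean(frames)


def progress_over_sessions(sessions):
    """Return (session mean, change from first session) for each session."""
    if not sessions:
        raise ValueError("no sessions to compare")
    means = [session_summary(frames) for frames in sessions]
    baseline = means[0]
    return [(m, m - baseline) for m in means]


# Hypothetical placeholder scores for three sessions of one patient.
sessions = [
    [0.10, 0.20, 0.30],
    [0.20, 0.30, 0.40],
    [0.40, 0.50, 0.60],
]

for number, (avg, delta) in enumerate(progress_over_sessions(sessions), start=1):
    print(f"session {number}: mean={avg:.2f}, change from first={delta:+.2f}")
```

Comparing each session against the same patient's first session, rather than against a population average, mirrors the article's point that progress is tracked for the individual over time.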

In addition to facial expression, the system can track gestures, head position, and other body movements. ‘During the live recording MultiSense does automatic analysis of the user’s nonverbal behaviours and it’s at the end of the interaction that we finalize the analysis by looking at the change of behaviours over time and in different context. By design, we do not give feedback during the interaction so that it does not interfere with the conversation,’ Morency said. ‘The analysis is almost real-time, but the bulk of it is at the end. You push one button and in less than 20 to 30 seconds you’ve got a full 20 to 30 minutes of interaction processed.’

Morency further explained that there are two main uses for this technology: screening and treatment. In screening, the algorithm picks up indicators of distress or anxiety. In a second round of analysis, the system can help to finalise a diagnosis. ‘It gives [mental health professionals] an objective measure for something that is usually quite subjective,’ he said. Perhaps an even stronger potential role is incorporating the system into mental health treatment. Using the MultiSense database of the patient session recordings, the algorithm can provide metrics to evaluate patient progress and the trajectory of the illness.

The system may be particularly suited to mental health disorders, but Morency’s team is constantly discovering new areas to explore. He reported, ‘The most exciting part of this technology is the number of research branches it opens up for non-verbal indicators.’ Two areas that he finds particularly intriguing are remote medical consultations and medical training. Using a webcam and microphone, the system could offer physicians another tool when communicating with patients who may have difficulty physically attending a session. Data from the algorithm may assist in refining remote treatment in the future. Medical training is also being explored. The team designed a pilot study prototype called TeleCoach to give feedback to doctors about their performance. He highlighted, ‘One of the applications is for training in medical schools with virtual patients. We are at the really early stages of this.’

While the initial results have been exciting, Morency stresses that the technology is still in a development phase. ‘I would call all of our previous research exploratory,’ he said. His team has partnered with top U.S. medical institutions, including Harvard Medical School and Yale, to deepen the understanding of nonverbal behaviour indicators. All going well, he hopes that in the next five years the technology may be closer to incorporation into treatment plans and virtual medical training.

Professor Morency notes that, as with all multi-disciplinary innovation, there have been challenges, but he is very optimistic about the future. ‘The challenges are the same as you see in most research across the spectrum. Bringing these two communities together - the computer science experts with medical professionals dealing with mental health. The collaborative infrastructure, that’s really the big challenge for us, but it’s very exciting to see people opening up to the idea and get a dialogue going,’ he said.


1. Interview with Professor Louis-Philippe Morency.
2. Lijun Yin, Xiaozhou Wei, Yi Sun, Jun Wang. ‘A 3D Facial Expression Database for Facial Behavior Research’. In Proceedings of 7th International Conference on Automatic Face and Gesture Recognition (FGR) (2006).
3. S. Scherer, J. Pestian, and L.-P. Morency. ‘Investigating the Speech Characteristics of Suicidal Adolescents’. In Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2013).
4. ‘It’s only a computer: the impact of human-agent interaction in clinical interviews’ (2014).
5. ‘Virtual rapport’ (2006).
6. ‘Latent-dynamic discriminative models for continuous gesture recognition’ (2007).
7. ‘Robot can detect signs of depression in your face’ (2015).
8. ‘Machines that can see depression on a person’s face’ (2015).