We present a new method of formation photoplethysmographic images with high spatial resolution from video recordings of a living body in the reflection geometry. The method (patent pending) is based on lock-in amplification of every pixel of the recorded video frames. A reference function required for synchronous detection of cardiovascular pulse waves is formed from the same frames. The method is featured by ability to visualize dynamic changes in cardiovascular pulse wave during the cardiac (or respiratory) cycle. We demonstrate that the system is capable to detect the minimal irritations of the body such as gentle scratching of the skin by own finger.
©2011 Optical Society of America
As known , photoplethysmography (PPG) is a non-invasive optical method to detect a cardiovascular pulse wave travelling through the body. The basic form of PPG technology requires only two optoelectronic components: a light source to illuminate a part of the body and a photodetector to measure small variations in light intensity after light interaction with the illuminated part. The interaction of light with biological tissue is complex and it includes the processes of light transmission, reflection, absorption, multiple scattering, and fluorescence . It is established nevertheless [1,3–5] that light interaction with biological tissue in vivo results in the intensity modulation which occurs synchronously with each heartbeat. The key factors that can affect time-varying component of light intensity are the blood volume, blood-vessel-wall movement, and the orientation of the red blood cells [3,6]. Generally, this time-varying component provides a signal proportional to changes in skin blood volume but does not produce a quantitative measure .
The PPG technology has been used in monitoring of oxygen saturation (pulse oxymetry), heart and respiration rates, blood pressure, cardiac output, assessment of autonomic functions, and detection of peripheral vascular diseases . The conventional PPG systems have been limited on the single spot and contact measurements. Remote, non-contact PPG imaging systems became an object of intensive investigations only relatively recently [7–10]. These systems usually operate in a reflection mode (when both the illuminating light source and photodetector are situated at the same side of the body) using two-dimensional photosensitive matrix instead of a single photodiode. The reflection geometry is associated with smaller (compare to the transmission geometry) amplitude of the time-varying intensity modulation which is also strongly affected by the relative movement of the camera and body (motion artifacts), which results in a poor signal-to-noise ratio (SNR). To overcome this problem, significant efforts were made in searching of adequate, robust, and efficient processing of recorded videos. Averaging of the pixels value within relatively large discrete Region of Interest (ROI) and Fourier analysis are typical techniques applied to increase SNR. Using both these data-processing techniques, Wieringa et al. demonstrated acquisition of spatially resolved heartbeat-related photoplethysmograms at multiple wavelengths from the video recorded by a monochrome CMOS-camera . Verkruysse et al. showed that similar processing of videos recorded by conventional color camera under ambient illumination of subject allows two-dimensional mapping of both the amplitude and phase of cardiovascular pulse wave . However, spatial resolution of PPG-images reported in previous papers [7,10] is not high enough because of necessity of averaging large number of pixels for increasing SNR. Fourier analysis also does not improve SNR at longer observation times due to natural instability of the pulse rate.
In this paper we present a novel methodology for formation of PPG images from video recordings of living body in the reflection geometry. It is based on synchronous detection of every pixel of the recorded video frames. A reference function, required for synchronous detection of cardiovascular pulse waves in PPG images, was formed from a large area of the same images to improve SNR by averaging data from large number of pixels. This technique allows us to obtain PPG images with increased SNR and high spatial resolution which is limited only by the used photosensitive matrix. The main purpose of this paper is to demonstrate visualization of dynamic changes in cardiovascular pulse wave during the cardiac cycle. To the best of our knowledge, such kind of waves imaging has never been demonstrated.
2.1. Synchronous detection
In previous works devoted to plethysmographic imaging [7–10], information about blood pulsations at subject’s heart-beat frequency was retrieved after either spectral analysis or narrow band-pass filtering of temporally varying pixels values. After this initial processing only the amplitude data was mapped on the skin image. Amplitude calculation is equivalent to the envelope demodulation in communication systems. Synchronous detection, also known as coherent demodulation , while being more complex has several advantages over envelope demodulation. For example, for amplitude modulated signals it provides higher SNR than envelope detection, and, therefore, the amplitude estimate of the oscillations in video image pixels is less susceptible to noise. Synchronous detection can also retrieve the relative phase of blood pulsations, which opens up new possibilities in photoplethysmographic imaging because it would allow visualization of the dynamics of blood pulsations under the skin.
In data processing systems which use envelope demodulation, the amplitude of a periodical signal is estimated by a peak detector. In contrast, during synchronous detection, the periodical input signal (which typically contains also noise) is multiplied with a reference periodic signal (reference function) having constant amplitude and the same frequency as the input signal . Then this product is integrated over a specific time interval defined by the signal bandwidth. Since the input and reference signals are at the same frequency, the product contains a slowly varying component providing high output value after integration. At the same time, random noise which is not synchronized with the reference signal grows much slower during the integration. Synchronous detection can retrieve the parameters of the input signal even when the noise is stronger than the signal. Detailed description and analysis of synchronous detection can be found in the most of textbooks for communication systems (see, for example .
2.2. Experimental set-up
We used a simple experimental setup which configuration is shown in Fig. 1 . Several light-emitting diodes (LEDs) attached to a lens of a digital camera are used for illuminating a part of the subject’s body. Light reflected from the skin is collected into a monochrome CMOS camera for recording the videos which are then processed to estimate the amplitude and phase of light-intensity oscillations at the heart-beat frequency of the subject by using synchronous detection. The camera (model EO-1312 of the Edmund Optics Inc.) with a global shutter and a detector resolution of 1024 × 1280 pixels was focused on the skin of the subject palm using a complementary Canon TV-zoom (18-108) lens. The distance between the camera lens front and the palm was approximately 100 cm. The camera was fixed at an optical table while subject was asked to put his hand on a custom support and refrain from movements during video recording. Duration of recording varied from 30 s to several minutes.
Various LEDs emitting light at different wavelengths (from 530 to 840 nm) were tested for palm illumination in preliminary experiments. We found that the intensity of light reflected from the subject palm is modulated in time at any of the tested wavelength. As expected, the maximal amplitude of the modulation was achieved at the wavelength of 530 nm which is in accordance with PPG spectra measured and calculated by Cui et al. . Consequently, we carried out final experiments by using a light source comprising only 2 LEDs with the central wavelength of 530 nm (bandwidth 20 nm) and output power of 30 mW. These LEDs were situated at the distance of 100 cm from the palm. Injection current of LEDs and exposure time of the camera were chosen so that the camera’s dynamic range was optimally used while avoiding pixel saturation. In the experiments presented in this paper, videos were recorded at the frame rate of 20 frames per second with the exposure time for each frame of 8 ms.
2.3. Image acquisition and processing
During camera standby mode, a preview window allowed aiming and focusing of the camera. After software triggering the camera stored a sequence of 8-bit digitally encoded images in an internal memory. All image frames then were automatically downloaded to a personal computer and saved on a hard disk in an uncompressed bitmap-image format.
Image processing was performed with custom software using Matlab®, Mathworks. According to established theoretical model , physiological processes of heart pulsations and breathing lead to modulation in time of fractional blood volume at the heart-beat and respiration rates, respectively. Modulation of the blood volume results in the absorption modulation of the light penetrated in the skin, which leads to the intensity modulation of light reflected in vivo from a subject’s skin. Therefore, value of pixels with the same spatial coordinates (x, y) in series of recorded frames is also modulated in time with the heart-beat and respiration rates. This fact was confirmed in numerous experiments including those carried out recently with digital cameras in the reflection mode [7,10,12]. In our experiments during video recording, a subject was asked to avoid any movement of his body and to keep normal breathing. Consequently all frames in the recorded series are very similar one to another so that differences between them are hardly being visible by a naked eye. However, after processing these frames in a computer, time-variable pixel values were revealed and enhanced.
2.4. Reference function formation
In the first step we process the recorded frames to generate a reference function RC(t). To this end a ROI was selected, which covers the most of the palm image as shown in Fig. 2a . All pixel values within this ROI were spatially averaged resulting in single mean value per each recorded frame. Time-trace of this mean value during the whole recorded video is shown in Fig. 2b. As one can see, the ratio of mean-pixel-value modulation to the DC level is about of 0.002 which is enough for recognition of cardiac and respiration physiological processes after Fourier analysis.
Typical power spectrum of the mean-pixel value of the selected ROI is shown in Fig. 3 . It was obtained by applying fast Fourier transform to the signal shown in Fig. 2b after detrending its DC level. Two extremes, which are associated with the respiration rate (0.13 Hz) and heart-beat rate (1.0 Hz), are clearly seen in the spectrum. Instabilities of the rates of physiological process during the time of the reflected-images recording result in broadening of the frequencies representative for the cardiac pulsation and breathing. These instabilities originate from both natural variations of the rates and motion artifacts. Precise definition of the cardiac and/or respiration rate is not the goal of the proposed technique. Therefore for generation of reference function associated with the cardiac cycle, we select a narrow frequency band limited by vertical bars C 1 and C 2, while for breathing reference function it is the band limited by the bars B 1 and B 2 as shown in Fig. 3.
After the frequency band corresponded to the heart beats is selected, all other frequencies are truncated, and the inverse Fourier transform is applied exclusively to the truncated spectrum (for frequencies f in the range of C 1 > f > C 2). This mathematical operation reconstructs the reference function RC(t) which represents the heart pulsations. Note that RC(t) has both the real and imaginary parts since it is calculated from spectral components with positive frequencies only. Since the reference function calculated from the inverse Fourier transform has an integer number of periods, its real and imaginary parts (cosine and sine components) are mutually orthogonal. After calculation of inverse Fourier transform it is normalized in such a way that
This reference function is further used for lock-in amplification of the recorded series of images. The normalization allows further evaluation of the mean complex amplitude of the signal synchronized with the heart beats within the chosen frequency range at every pixel of the frame. Both the real and imaginary parts of the calculated reference function RC(t) are shown in Fig. 4 together with raw time-trace of the mean pixel value.
As seen in Fig. 4, the real part of the reference function is in the phase with variations of the mean pixel-value used for its generation. Phase coincidence between the function RC(t) and raw time-trace during the chosen processing time is an essential condition for successful lock-in amplification of the recorded frames. However, uncontrolled change of the cardiac pulse rate during video recording may lead to unbalance between the reference function and raw time-trace thus resulting in incorrect estimation of the amplitude of pixels modulation. In this case another (adaptive) algorithm of the reference-function generation is to be applied.
The adaptive algorithm also uses the raw time-trace of the spatially averaged pixels value for calculation of the reference function. In contrast with previous algorithm we use only 3 – 5 periods of the cardiac pulsations for approximate estimation of the heart-beat rate. This estimation is done by applying fast Fourier transform to the initial part of the raw time-trace and selecting the frequency at which the power spectra reaches the maximum within pre-defined frequency band (typically, 0.7 – 1.5 Hz). The reference function is generated as a harmonic signal with the period length of integer number of samples at the frequency nearest to the heart beat rate. Then every period of the reference function is compared with the raw time-trace of the mean pixel value in real time by calculating their cross-correlation value. If the phases of the reference function and raw time-trace do not coincide, the phase of the reference function is adjusted to achieve their coincidence in the time domain. Thus the real part of the final reference function built with this algorithm consists of a series of single-period cosine segments where the phase of every segment corresponds to the phase of the original raw time-trace of the spatially averaged pixels value. Imaginary part of this reference function is, respectfully, a series of the same phase-adjusted cosine segments shifted by π/2. Formation of the reference function using adaptive algorithm is shown in Fig. 5 .
2.5. Correlation matrix and visualization of blood-volume dynamics
In the second step, the recorded series of frames is multiplied by the reference function in such a way that each pixel value of the first frame is multiplied by the same coefficient RC(t = t 0) where t 0 is the moment when the first frame was captured. Then the second frame is multiplied by the coefficient RC(t 1) where t 1 is the moment of the second frame capturing, and so on. Schematically multiplication of the series of frames by the reference function RC(t) is shown in Fig. 6 . Note that only the real part of the function RC(t) is plotted in Fig. 6 to simplify the drawing but the function RC(t) has both the real and imaginary parts. Therefore after multiplication, we obtain the series of frames in which pixels have the complex values. Thereafter, a correlation matrix SC(x, y) is calculated by summarizing complex elements having the same coordinates (x, y) over all the frames in accordance with
Here I(x, y, t) is value of the pixel with coordinates (x, y) of the image frame captured at the moment of t. The correlation matrix SC(x, y) contains the same number of pixels as any of the initial frames I(x, y, t).
As seen from Eq. (2), the matrix SC(x, y) is approximately equal to the cross-correlation function of the reference RC(t) and the time-varying image of the subject’s palm. Since RC(t) represents the cardiac pulsations, the matrix SC(x, y) is a lock-in amplification of the pixel values varying in time synchronously with the heart beats. Therefore, the modulus of each pixel value of the correlation matrix SC(x, y) is proportional to the modulation amplitude of light intensity reflected from the respective point of the subject’s skin. Since light-intensity modulation is caused by the blood pulsations [3–10,12], the matrix SC(x, y) describes spatial distribution of ac-component of the blood-volume pulsations at the heart-beats frequency. In other words, it is PPG image.
Note that a priori we cannot claim that blood pulsations occur with the same phase everywhere in the whole observable area of the subject’s skin. Owing to its complexity, the matrix SC(x, y) contains information also about relative phase of blood pulsations in different parts of the observable area. Let us assume, that the reference function, normalized in an observation interval with N samples of mean-pixel-value time-trace is
and the pixels values are
Here A(x,y) is the amplitude of the pixel value oscillations at the frequency of f, ψ(x,y) is the relative phase of these oscillations, and B(x,y) is the mean pixel value. The number of cycles of the reference function within the observation interval is integer. This is provided by both algorithms for reference function calculation. Therefore, Eq. (2) yields
Consequently, the real part of SC(x, y) corresponds to the instant deviation of pixels values I(x,y,t) from their mean values B(x,y) at the moment when the phase of the reference function is equal to zero. The imaginary part of SC(x, y) in turn describes the instant deviation of pixels values from their mean values at the moment when the phase of RC(t) is equal to π/2. Therefore, since the deviation of the pixel value is proportional to the local change in blood volume, we can reconstruct dynamic changes of the blood volume pulsations during the cardiac cycle by calculating a new series of frames as
Here where fC is the mean rate of the heart beats.
Values assigned to the pixels of the frames can be positive, negative, or zero. Zero level means either absence of the blood-volume pulsations at the heart beats or pulsations which are phase shifted by 90° in respect to the reference function RC(t). Positive values are pulsations of the blood volume in phase with RC(t), while negative values are in counter phase. The higher the amplitude of the pulsations, the larger value is assigned to the pixel. The animation shows dynamic changes of the amplitude of the fractional blood volume in the skin during the cardiac cycle and spatial distribution of the relative phase of these pulsations.
Five volunteers from the members of our research group were enrolled in this study. Informed consent was obtained from them prior to the start of each study session. Before recording the video, subject was asked to find a comfortable position, to put his hand onto support, to keep it relaxed, and to breathe normally. First video of subject’s palm was recorded during 30 s. Then subject was asked to make a gentle scratch a part of his palm by his finger. Second video was recorded immediately after scratching also during 30 s. The frames of the recorded videos were processed off-line using the algorithm described in the previous section.
Figure 7 shows spatial distribution of the amplitude of blood-volume pulsations at the frequency at the heart-beats frequency (PPG image) given by the modulus of the matrix SC(x,y). The matrices were calculated from the video taken before (a) and after (b) scratching. One can see that small impact caused by gentle scratching results in the increased amplitude of blood volume pulsations. Increasing of blood pulsations in the area of scratching was observed in each subject. However, initial PPG image varies from one subject to another. This difference can be seen in Fig. 7 (c and d) where PPG images of another subject are shown. In Fig. 7 the amplitude of blood-volume pulsations is coded by pseudocolors with the scale shown in the right side of each subfigure.
Non-uniform spatial distribution of the blood-pulsations phase is illustrated in Fig. 8 which shows phase-shifted instant deviation of the blood-volume-pulsations amplitude from its averaged value given by the matrix at ϕ = −30°. Since the matrix has both positive and negative values, the values of pixels in Fig. 8 are coded so that the positive values are marked by green while the negative – by magenta. More bright green or magenta corresponds to large pixel value while dark is zero value. As one can see in Fig. 8a and Fig. 8c, the blood volume at the most of the palm area has positive deviation (green) from its average value while some part (within the white ellipses) has negative deviation (magenta). Presence of both colors in images of Fig. 8 unambiguously shows that the phase of blood pulsations is not uniform. Note that both the position and size of the areas with counter-phased pulsations are different for different subjects (compare Fig. 8a with Fig. 8c). Dynamic changes of blood volume during the cardiac cycle are shown in animations associated with Fig. 8 (see Media 1–4) in which different phase of blood pulsations in different areas are clearly visible. Influence of scratching on the blood pulsation is also clearly seen as more bright spots in both green and magenta (compare Media 1 with Media 2, and Media 3 with Media 4).
The data processing method reported in the paper has great potential in visualization of the blood perfusion. Dynamic changes in cardiovascular pulse wave during the cardiac (or respiratory) cycle, which cannot be seen by naked eye, are clearly visible in the demonstrated experiments (Media 1 – 4 and Fig. 8). The experiment with gentle scratching of the subject’s skin confirms that the proposed technique provides visualization of blood-pulsations amplitude. Similar response of human skin on the gentle scratching was observed when both Laser-Doppler Velocimetry and Laser-Speckle Correlation techniques were applied for blood perfusion mapping . Both previous techniques show local increase of blood perfusion in the area of skin scratching . Our technique is more advantageous because it additionally allows dynamic visualization of blood perfusion during the cardiac cycle.
Information presented in animations is very useful for analysis of the blood perfusion process in live tissues. Therefore, it can provide supportive cardiovascular information to assist diagnosis of various diseases. Comparison of animations in Media 1 and 3 shows high variability of blood perfusion in individual subjects. This feature can be used as a signature for detection of integrated morphological and functional peculiarities of local vascularization. In contrast to the results obtained from individual subjects, the minimal irritation (such as gentle scratching) results in stereotypic vascular reaction (see Media 2 and 4) which can be used to evaluate the reactivity of local compensatory reactions.
As we showed in our experiments, comparative analysis of blood perfusion changes allows studying body reactions on any external or internal impacts. Let us estimate the time resolution achievable in these studies. The correlation matrices SC(x, y) and (which allows mapping of both amplitude and phase of blood pulsations as shown in Figs. 6, 7) were calculated after summarizing the product of the video frames recorded during 30 s and the reference function RC(t) [Eq. (2)]. This video duration corresponds approximately to 30 periods of the cardiac cycle. It means that in the presented figures and animations we show the blood-pulsation amplitude which is averaged over all these 30 heart beats. In other words, the resolvable time interval for studying of body reaction on different impacts is 30 s in this particular case. This time can be diminished but in the expense of SNR of the visualized spatial distributions. In the most of our experiments, 3 – 5 periods of the heart beats were enough to generate the reference function. Correspondingly, correlation matrixes calculated after 6 – 8 cardiac periods (during 8 s) have reasonable SNR for rough estimations of the most important features in spatial distributions of blood pulsations. Most probably, 8 s is about the minimal temporal resolution which can be achieved in the proposed system. However, many physiological reactions and processes are characterized by much longer relaxation time, which allows their study by using the proposed technique.
In the experiments reported in this paper, illumination of the subject’s palm was carried out at the wavelength of 530 nm. However, correlation matrixes were also successfully calculated when LEDs emitting at another wavelength were used for illumination the subject’s skin. Since the proposed technique is self-referenced, there are no special requirements to the wavelength selection except to provide large enough PPG signal. Light penetration into the skin depends on the illumination wavelength increasing when it becomes greater than 600 nm. Therefore, PPG images obtained in the red and near-infrared light will show spatial distribution of the amplitude and phase blood pulsations from deeper vascular bed. Note that formation of PPG images simultaneously at several wavelengths is also possible in the proposed technique. Consequently, the proposed system could also provide contactless imaging of arterial oxygen saturation distribution within tissue in the way similar to that presented in .
As we mentioned in the introduction, motion artifacts always affect the PPGs signal. In our system this influence is diminished owing to applied principle of lock-in amplification. Any relative movement of the camera and subject’s skin, which is not correlated in the frequency domain with either cardiac or breathing physiological process, makes smaller contribution to the values of the matrix pixels than the correlated variations of the light intensity received by the camera from the subject’s skin. Note that the larger distance between the camera and the subject, the smaller influence of the relative movement is. However, significant movements of the subject during video recording result in diminishing of the ac-amplitude of the PPG signal thus diminishing SNR in the final PPG images. Therefore, any relative movements of the camera and subject should be minimized.
Image processing described and presented here was given using the reference function synchronized with the cardiac cycle. But this is no unique option of the system: the reference function can be also generated to follow the breathing process by selecting the frequency band B 1-B 2 in Fig. 3. In this case the correlation matrix shows distribution of the amplitude and phase of blood pulsations related with the breathing process. Our preliminary experiments show that these images are slightly different from those synchronized with the heart beats. However, more detailed research is required for analysis of the difference between the correlation matrixes related with cardiac and breathing physiological processes, which is out of the scope of this paper. It is worth noting that even though the data processing reported in this paper was carried out offline, the algorithm can be modified and built in the camera software to process data online in real time providing visualization of blood pulsation waves in a few seconds after the patient observation.
We developed a new method of high resolution visualization of blood pulsative flow from video recordings of living body in the reflection geometry. The method is based on lock-in amplification of every pixel of the recorded video frames. A reference function required for synchronous detection of cardiovascular pulse waves is formed from the same video frames. Dynamic changes of the blood volume pulsations during the cardiac cycle are visualized by calculating a phase-shifted correlation matrix. We show high variability of images of blood volume pulsations in individual subject, which allows us to suggest our method as a signature detecting integrated morphological and functional peculiarities of local vascularization. In sharp contrast to results obtained from individual subjects, we found that the minimal irritation results in stereotypic vascular reaction which could also be used to evaluate the reactivity of local compensatory reactions. Information presented in animations is very useful for analysis of the blood perfusion process in live tissues. Therefore, it can provide supportive cardiovascular information to assist diagnosis of various diseases related with blood perfusion.
This work was partially funded by the Academy of Finland (project no. 128582). The authors are grateful to Dr. Tapani Lahtinen (Delfin Technologies Ltd., Finland) and Prof. Rashid Giniatullin (the University of Eastern Finland) for the useful discussion of the obtained results.
References and links
1. A. B. Hertzman and C. R. Spealman, “Observations on the finger volume pulse recorded photoelectrically,” Am. J. Physiol. 119, 334–335 (1937).
7. F. P. Wieringa, F. Mastik, and A. F. W. van der Steen, “Contactless multiple wavelength photoplethysmographic imaging: a first step toward “SpO2 camera” technology,” Ann. Biomed. Eng. 33(8), 1034–1041 (2005). [CrossRef] [PubMed]
8. S. Hu, J. Zhieng, V. Chouliaras, and R. Summers, “Feasibility of imaging photoplethysmography,” in Proceedings of IEEE Conference on BioMedical Engineering and Informatics (IEEE, 2008), pp. 72–75.
9. K. Humphreys, T. Ward, and C. Markham, “Noncontact simultaneous dual wavelength photoplethysmography: a further step toward noncontact pulse oximetry,” Rev. Sci. Instrum. 78(4), 044304 (2007). [CrossRef] [PubMed]
11. M. P. Fitz, Fundamentals of Communications Systems (McGraw-Hill, New York, 2007), Chap. 11.
12. M.-Z. Poh, D. J. McDuff, and R. W. Picard, “Non-contact, automated cardiac pulse measurements using video imaging and blind source separation,” Opt. Express 18(10), 10762–10774 (2010). [CrossRef] [PubMed]