
CMOS camera based visible light communication (VLC) using grayscale value distribution and machine learning algorithm

Open Access

Abstract

We demonstrate a visible light communication (VLC) system using a light emitting diode (LED) backlit display panel and a mobile-phone complementary-metal–oxide–semiconductor (CMOS) camera. The panel is primarily used for displaying advertisements. By modulating its backlight, dynamic content (i.e. secondary information) can be transmitted wirelessly to users based on the rolling shutter effect (RSE) of the CMOS camera. Because different content is displayed on the panel, the VLC performance is significantly limited when the noise-ratio (NR) is high. Here, we propose and demonstrate a CMOS RSE pattern demodulation scheme using grayscale value distribution (GVD) and a machine learning algorithm (MLA) to significantly enhance the demodulation.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

The increase in the number and types of connected devices, together with the huge amount of bandwidth required, is pushing existing wireless networks to the limit. Optical wireless communication (OWC) [1–6] could provide a promising solution by supplying extra bandwidth with license-free and electromagnetic-interference-free connectivity for radio-frequency (RF) restricted environments. Mobile-phone based visible light communication (VLC) has gained considerable attention since mobile phones with embedded complementary-metal–oxide–semiconductor (CMOS) cameras are now ubiquitous. Using these cameras as the VLC receiver (Rx) is possible but challenging, since the camera frame rate is usually limited to 30/60 fps. Camera communication at 150 bit/s has been reported [7]; however, the data rate was limited. An image sensor with separate tailor-made pixels for imaging and VLC detection has been proposed [8]; however, it is complicated and costly. Employing the CMOS camera rolling shutter effect (RSE) for high-speed VLC detection has been reported [9–14], in which the VLC data rate can be higher than the frame rate. During light detection in the CMOS camera, the pixel rows are activated in succession. If the LED transmitter (Tx) flashes at a rate higher than the camera frame rate, bright and dark fringes can be observed in the image frame, representing the LED “ON” and “OFF” states. By demodulating this RSE pattern, the VLC data logic can be retrieved. Double equalization, 2D convolutional neural networks (CNN) and composite amplitude-shift keying have been proposed to enhance the RSE VLC performance [15–17].

In this work, we put forward a VLC system using a mobile-phone CMOS camera Rx and a light emitting diode (LED) display panel Tx. The display panel is primarily used for displaying advertisements. By modulating its backlight LED, dynamic contents such as navigation information, maps, restaurant menus, shop brochures, etc. (i.e. secondary information) can be transmitted wirelessly to users. However, as shown in [13], the data rate of RSE based VLC is limited by the pixels-per-bit received by the CMOS image sensor. Besides, as different primary content is displayed, the data rate is reduced if the contrast is too low. In previous work, we defined the noise-ratio (NR) [14] as the figure-of-merit to estimate the grayscale value contrast of the display content with respect to all-white content. Here, we propose and demonstrate a CMOS RSE pattern demodulation scheme based on grayscale value distribution (GVD) and a machine learning algorithm (MLA) to enhance the RSE demodulation.

2. Experiment and machine learning algorithm

The proposed VLC system using an LED display panel and a CMOS camera is shown in Fig. 1(a). An arbitrary waveform generator (AWG, Tektronix AFG3252C) is connected to the display panel (Li-Cheng Corp.) to directly modulate its backlight LED. The backlight LED of the display panel has an output power of 22 W. The AWG has 240 MHz analog bandwidth and a 2 GS/s sampling rate. The Rx is an iPhone 7 Plus with a 1080 × 1920 pixel CMOS camera resolution. The signals are captured at 30 fps using our application program developed for the mobile phone. The program disables the automatic white-balance function and uses ISO = 400. The transmission distance is 1.5 m during the experiment. The flow diagram of the MLA implementation is shown in Fig. 1(b). The VLC packet is read in, and the position of the display panel in the image is located by identifying the packet header. The proposed MLA is based on logistic regression classification [18]. The received RSE pattern is separated into R, G and B color patterns. Owing to the different display contents, the grayscale values of the pixels in each pixel row are not the same; row averaging is therefore employed to obtain an average grayscale value for each pixel row. Then the GVD analyses of the R, G and B color patterns are performed. The grayscale values of R, G and B range from 0 (dark) to 255 (bright). During the GVD process, the color pattern whose distribution has higher counts at higher grayscale values is chosen for the subsequent MLA process. To generate bipolar grayscale values for the MLA, z-score normalization is employed: each averaged grayscale value is subtracted by the mean of all the averaged grayscale values in the VLC packet and then divided by their standard deviation. In the CMOS camera, there is an inter-frame time gap without any light detection. To guarantee that the entire VLC packet is captured in an image frame, each VLC packet is transmitted successively three times. The VLC packet is shown in Fig. 1(c); each packet consists of a 16-bit header and a payload of different bit lengths.
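To make the row-averaging and z-score normalization step concrete, the following Python sketch (our illustration, not the authors' implementation; array and function names are hypothetical) collapses one monochromatic RSE pattern into an averaged grayscale value per pixel row and converts it into the bipolar sequence fed to the MLA.

```python
import numpy as np

def row_average_and_zscore(rse_pattern):
    """Sketch: average the grayscale values over each pixel row of a
    monochromatic RSE pattern (H x W array, values 0-255), then z-score
    normalize over the VLC packet so the MLA receives bipolar values."""
    rows = rse_pattern.mean(axis=1)   # one averaged grayscale value per pixel row
    mu = rows.mean()                  # mean of all averaged grayscale values
    sigma = rows.std()                # their standard deviation
    return (rows - mu) / sigma        # bipolar (zero-mean, unit-variance) sequence

# Hypothetical usage: 'packet_r' is the R color pattern cropped to one VLC packet.
# z_r = row_average_and_zscore(packet_r)
```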


Fig. 1. (a) VLC system using LED display panel and CMOS image sensor. (b) Flow diagram of the MLA and GVD implementation. (c) VLC packet design.


In our proposed MLA scheme, the frame rate is 30 fps, and 30 frames are used in each bit-error-rate (BER) measurement. The first image frame of the 30 frames is chosen for the MLA training process; this image is also used for the GVD process. Here, the BER is obtained from the bit-by-bit comparison between the received data logic and the transmitted data logic. Figures 2(a) and 2(b) illustrate examples of using the traditional thresholding scheme, known as extreme value averaging (EVA) [13], to identify the data logic at NR = 0% and NR = 70%, respectively. The insets show the corresponding advertising display contents received. The blue lines are the decoded grayscale value patterns from the received RSE patterns at NR = 0% and 70%, respectively, while the red lines are the EVA thresholds. In the dotted purple circles in Fig. 2(b), the original transmitted data logics should all be “1” (blue line). However, after thresholding (red line), some grayscale values fall below the threshold and are considered logic “0”, creating errors. When NR = 0%, as shown in Fig. 2(a), the thresholding correctly identifies the logic in this example. Hence, the traditional thresholding scheme does not work well when the NR is very high, and the MLA is proposed.
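For reference, below is a minimal sketch of a fixed-threshold demodulator in the spirit of EVA, together with the bit-by-bit BER comparison. It assumes the averaged grayscale trace has already been downsampled to one sample per bit and simply takes the midpoint of the extreme values as the threshold, whereas the actual EVA scheme in [13] is adaptive.

```python
import numpy as np

def threshold_demodulate(samples):
    """Sketch of a simple fixed-threshold decision: 'samples' holds one averaged
    grayscale value per bit; bright fringes (above threshold) map to logic 1."""
    threshold = (samples.max() + samples.min()) / 2.0   # midpoint of the extremes
    return (samples >= threshold).astype(int)

def bit_error_rate(rx_bits, tx_bits):
    """Bit-by-bit comparison between received and transmitted data logic."""
    tx = np.asarray(tx_bits)
    rx = np.asarray(rx_bits)[:len(tx)]
    return np.count_nonzero(rx != tx) / len(tx)
```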


Fig. 2. Example to illustrate the limitation of the traditional thresholding scheme to identify the data logics of the received RSE patterns at (a) NR = 0% and (b) NR = 70%.


Figures 3(a)–3(c) show the z-score normalized R, G and B grayscale value RSE patterns, respectively. The R color RSE pattern is more even due to the higher red color component in the display content, whereas the G and B RSE patterns show larger signal fluctuations. Each z-score normalized pattern is then converted into a column vector whose length equals the number of pixel rows. The MLA is based on classification using logistic regression, in which the monochromatic z-score normalized vector is multiplied by a weight ω and added to a bias ω0. Considering that the payload data occupies N pixel rows in an image, the length of the column vector is N. The posterior probability Pn can be obtained by using the sigmoid function σ(·) as shown in Eq. (1), where n runs from 1 to N.

$$P_n = \sigma(z_n), \quad \text{and} \quad z_n = \omega x_n + \omega_0 = \mathbf{w}\,\mathbf{x}_n \tag{1}$$
where w = [ω0, ω] and xn = [1, xn]T. For the total of N pixel probabilities, the likelihood function shown in Eq. (2) can be used, where tn is the target value of Pn.
$$p(\mathbf{t}\,|\,\mathbf{w}) = \prod_{n=1}^{N} P_n^{t_n}\,(1 - P_n)^{1 - t_n} \tag{2}$$


Fig. 3. (a) R, (b) B and (c) G average grayscale value RSE patterns.


Then we calculate the cross entropy error function as shown in Eq. (3).

$$E(\mathbf{w}) = -\ln p(\mathbf{t}\,|\,\mathbf{w}) = -\sum_{n=1}^{N}\left[\,t_n \ln P_n + (1 - t_n)\ln(1 - P_n)\,\right] \tag{3}$$
By applying gradient descent [18] to minimize the cross entropy error function, the updated weight w can be obtained as shown in Eq. (4), where τ is the iteration index and η is the learning rate. Finally, the BER can be calculated from the data logic obtained via the probability Pn: if Pn ≥ 0.5, the logic is 1; otherwise, the logic is 0.
$$\mathbf{w}^{\tau+1} = \mathbf{w}^{\tau} - \eta\,\frac{\partial E}{\partial \mathbf{w}}, \quad \text{and} \quad \frac{\partial E}{\partial \mathbf{w}} = \sum_{n=1}^{N}(P_n - t_n)\,\mathbf{x}_n \tag{4}$$
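A compact sketch of the logistic-regression training loop implied by Eqs. (1)–(4) is given below, assuming x is the z-score normalized grayscale column vector of the training frame and t holds the known transmitted logics (0/1); the variable names and the learning-rate value are illustrative, not the authors' settings.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(x, t, eta=0.1, iterations=100):
    """Gradient descent on the cross entropy error function, Eqs. (3)-(4)."""
    X = np.column_stack([np.ones_like(x), x])   # x_n = [1, x_n]^T
    w = np.zeros(2)                             # w = [w0, w]
    for _ in range(iterations):
        P = sigmoid(X @ w)                      # posterior probabilities, Eq. (1)
        grad = X.T @ (P - t)                    # dE/dw = sum (P_n - t_n) x_n, Eq. (4)
        w -= eta * grad                         # weight update, Eq. (4)
    return w

def decide_logic(x, w):
    """Logic 1 when P_n >= 0.5, otherwise logic 0."""
    X = np.column_stack([np.ones_like(x), x])
    return (sigmoid(X @ w) >= 0.5).astype(int)
```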

3. Results and discussions

GVD analysis is performed to select a better monochromatic RSE pattern for the subsequent MLA demodulation. Figures 4(a)–4(c) show the GVDs of the R, G and B RSE patterns, respectively, when the display content NR = 0% with a 34-bit payload. For NR = 0% (completely white display content), the GVDs of the R, G and B RSE patterns are similar, with mean values μ of ∼200. Figures 4(d)–4(f) show the GVDs of the R, G and B RSE patterns, respectively, when NR = 40%; the distributions are again similar, with μ ≈ 200. Figures 4(g)–4(i) show the GVDs of the R, G and B RSE patterns, respectively, when NR = 70%. The means of the G and B RSE patterns are very low, with μ < 100; however, as shown in Fig. 4(g), the R color pattern has a higher μ = 114.7, since the original display content has stronger red and orange color components, so the R color RSE pattern is selected for the subsequent MLA. Figures 4(j)–4(l) show the GVDs of the R, G and B RSE patterns when NR = 90%. Similar to the previous case, the B color RSE pattern has higher counts at higher grayscale values and is therefore selected for the subsequent MLA.
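The channel selection itself can be summarized by the short sketch below; as a simplification of the full distribution analysis, it picks the color pattern whose averaged grayscale values have the highest mean, which reproduces the choices above (R at NR = 70%, B at NR = 90%).

```python
import numpy as np

def select_color_pattern(r_rows, g_rows, b_rows):
    """Sketch of GVD-based selection: choose the monochromatic RSE pattern
    whose averaged grayscale values sit at higher levels (here, highest mean)."""
    channels = {'R': np.mean(r_rows), 'G': np.mean(g_rows), 'B': np.mean(b_rows)}
    return max(channels, key=channels.get)   # e.g. 'R' when mu_R = 114.7 dominates
```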


Fig. 4. GVDs of the (a) R, (b) G, and (c) B RSE patterns when display content NR = 0%. (d) R, (e) G, and (f) B RSE patterns when display content NR = 40%; (g) R, (h) G, and (i) B RSE patterns when display content NR = 70%; (j) R, (k) G, and (l) B RSE patterns when display content NR = 90%.


The proposed GVD with MLA based VLC system is also compared with the previous VLC system based on traditional demodulation, including extinction-ratio (ER) enhancement and the adaptive EVA thresholding scheme [13]. Figures 5(a)–5(d) show the measured BER curves of the demodulated VLC signal using the R color with GVD + MLA, the G color with GVD + MLA, the B color with GVD + MLA, the R + G + B pattern with MLA only, and the traditional demodulation at NRs of 0%, 40%, 70% and 90%, respectively. The R + G + B case refers to the scheme reported in [12], in which the R, G and B color patterns are input simultaneously into the machine learning model. When NR = 0%, as shown in Fig. 5(a), the traditional scheme can only achieve the forward error correction (FEC) requirement at a data rate of 1,020 bit/s (34 bit/frame × 30 fps); however, all the MLA schemes can achieve the FEC requirement at 1,260 bit/s (42 bit/frame × 30 fps). Similar trends can be observed in Fig. 5(b); owing to the similar R, G, B GVDs shown in Figs. 4(d)–4(f), nearly the same BER performance can be achieved by selecting any one of the color patterns. When NR = 70%, as shown in Fig. 5(c), the G color GVD + MLA and B color GVD + MLA have higher BERs due to their small GVDs shown in Figs. 4(h) and 4(i); in this case, the R color GVD + MLA should be selected. A further experiment was performed using a display content dominated by the B color at NR = 90%. As shown in Fig. 5(d), by selecting the B color GVD + MLA, which has a higher GVD as shown in Fig. 4(l), better BER performance could be observed.


Fig. 5. Measured BER curves of the demodulated VLC signal based on different conditions when NR = (a) 0%, (b) 40%, (c) 70%, (d) 90%.


At free space transmission distances of 50 cm, 100 cm and 150 cm, as shown in Fig. 6(a), the proposed GVD with MLA scheme can satisfy the FEC requirement. The received illumination levels at these distances are shown in Fig. 6(b); the received illumination changes with the advertising content and the transmission distance. We can observe that for the VLC system with an NR of 70%, a transmission distance of 150 cm and a data rate of 1,020 bit/s, the proposed GVD and MLA scheme can satisfy the FEC requirement at a very low illumination of 188 lux. Figures 6(c)–6(f) show the number of iterations needed in the training process to minimize the cross entropy error function for the R, G, B and average grayscale values when NR = 70%. Although 100 iterations are performed, fewer than 10 iterations are needed to minimize the error function.


Fig. 6. (a) Measured BER curves of the demodulated VLC signal based on different conditions at different transmission distances. (b) Received illumination (lux) at free space transmission distances of 50 cm, 100 cm and 150 cm. Iterations to minimize the cross entropy error function for (c) R, (d) B, (e) G, (f) average grayscale patterns.


4. Conclusion

We proposed and demonstrated an LED display panel and CMOS camera based VLC system using GVD together with MLA to enhance the RSE demodulation. The received RSE pattern was separated into R, G and B patterns. Then, GVD analyses of the R, G and B color patterns were performed. The color pattern whose distribution has higher counts at higher grayscale values was chosen for the subsequent MLA process. Experimental results showed that the system performance can be significantly enhanced. At an NR of 70%, the proposed GVD plus MLA scheme achieved a data rate of 1,020 bit/s satisfying the FEC requirement at an illumination of 188 lux. Besides, fewer than 10 iterations were needed to minimize the error function during the training process.

Funding

Ministry of Science and Technology, Taiwan (MOST-107-2221-E-009-118-MY3).

Disclosures

The authors declare no conflicts of interest.

References

1. Z. Wang, C. Yu, W. D. Zhong, J. Chen, and W. Chen, “Performance of a novel LED lamp arrangement to reduce SNR fluctuation for multi-user visible light communication systems,” Opt. Express 20(4), 4564–4573 (2012). [CrossRef]  

2. H. L. Minh, D. O’Brien, G. Faulkner, L. Zeng, K. Lee, D. Jung, Y. J. Oh, and E. T. Won, “100-Mb/s NRZ visible light communications using a post-equalized white LED,” IEEE Photonics Technol. Lett. 21(15), 1063–1065 (2009). [CrossRef]  

3. H. H. Lu, Y. P. Lin, P. Y. Wu, C. Y. Chen, M. C. Chen, and T. W. Jhang, “A multiple-input-multiple-output visible light communication system based on VCSELs and spatial light modulators,” Opt. Express 22(3), 3468–3474 (2014). [CrossRef]  

4. B. Janjua, H. M. Oubei, J. R. Durán Retamal, T. K. Ng, C. T. Tsai, H. Y. Wang, Y. C. Chi, H. C. Kuo, G. R. Lin, J. H. He, and B. S. Ooi, “Going beyond 4 Gbps data rate by employing RGB laser diodes for visible light communication,” Opt. Express 23(14), 18746–18753 (2015). [CrossRef]  

5. C. H. Chang, C. Y. Li, H. H. Lu, C. Y. Lin, J. H. Chen, Z. W. Wan, and C. J. Cheng, “A 100-Gb/s multiple-input multiple-output visible laser light communication system,” J. Lightwave Technol. 32(24), 4723–4729 (2014). [CrossRef]

6. Y. F. Liu, Y. C. Chang, C. W. Chow, and C. H. Yeh, “Equalization and pre-distorted schemes for increasing data rate in In-door visible light communication system,” Proc. OFC2011, paper JWA083.

7. P. Luo, M. Zhang, Z. Ghassemlooy, H. L. Minh, H. M. Tsai, X. Tang, L. C. Png, and D. Han, “Experimental demonstration of RGB LED-based optical camera communications,” IEEE Photonics J. 7(5), 1–12 (2015). [CrossRef]  

8. I. Takai, S. Ito, K. Yasutomi, K. Kagawa, M. Andoh, and S. Kawahito, “LED and CMOS image sensor based optical wireless communication system for automotive applications,” IEEE Photonics J. 5(5), 6801418 (2013). [CrossRef]  

9. C. Danakis, M. Afgani, G. Povey, I. Underwood, and H. Haas, “Using a CMOS camera sensor for visible light communication,” Proc. OWC 12, 1244–1248 (2012). [CrossRef]  

10. C. W. Chow, C. Y. Chen, and S. H. Chen, “Visible light communication using mobile-phone camera with data rate higher than frame rate,” Opt. Express 23(20), 26080–26085 (2015). [CrossRef]  

11. C. W. Chow, C. Chen, and S. Chen, “Enhancement of signal performance in LED visible light communications using mobile phone camera,” IEEE Photonics J. 7(5), 1–7 (2015). [CrossRef]  

12. Y. C. Chuang, C. W. Chow, Y. Liu, C. H. Yeh, X. L. Liao, K. H. Lin, and Y. Y. Chen, “Using logistic regression classification for mitigating high noise-ratio advisement light-panel in rolling-shutter based visible light communications,” Opt. Express 27(21), 29924–29929 (2019). [CrossRef]  

13. C. W. Chen, C. W. Chow, Y. Liu, and C. H. Yeh, “Efficient demodulation scheme for rolling-shutter-patterning of CMOS image sensor based visible light communications,” Opt. Express 25(20), 24362–24367 (2017). [CrossRef]  

14. C. W. Chow, R. J. Shiu, Y. C. Liu, C. H. Yeh, X. L. Liao, K. H. Lin, Y. C. Wang, and Y. Y. Chen, “Secure mobile-phone based visible light communications with different noise-ratio light-panel,” IEEE Photonics J. 10(2), 1–6 (2018). [CrossRef]  

15. L. Liu, R. Deng, and L. Chen, “Spatial and time dispersions compensation with double-equalization for optical camera communications,” IEEE Photonics Technol. Lett. 31(21), 1753–1756 (2019). [CrossRef]  

16. L. Liu, R. Deng, and L. Chen, “47-kbit/s RGB-LED-based optical camera communication based on 2D-CNN and XOR-based data loss compensation,” Opt. Express 27(23), 33840–33846 (2019). [CrossRef]  

17. Y. Yang and J. Luo, “Composite amplitude-shift keying for effective LED-Camera VLC,” IEEE Trans. Mobile Comput. doi: 10.1109/TMC.2019.2897101.

18. C. M. Bishop, Pattern Recognition and Machine Learning (Springer, 2006).


