Journal of Semiconductors, Volume 45, Issue 12, 120401 (2024)

Artificial cat eye camera for objects detection against complex backgrounds and varied lighting

Shengqiang Zhang and Zhuoran Wang*
Author Affiliations
  • School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100081, China

    Figure 1. (Color online) (a) Schematics of feline and conventional vision during the daytime and (b) nighttime. (c) Schematics of the feline vision in the daytime with a VP for light adaptation. The yellow plane represents the tangential plane, and the blue plane represents the sagittal plane. (d) Optical simulation of the cross-sectional focal spot according to object distance. The horizontal cross section represents the tangential plane, and (e) the vertical cross section represents the sagittal plane. (f) Ray tracing simulation for various object distances with small circular pupils (CPs) and (g) VPs. The cross-shaped object is located at 150, 200, and 250 mm. (h) Simulation for camouflage breaking with small CPs and VPs. The center of the random texture is located at 200 mm, and the background random texture is located at 400 mm. (i) Photograph of the fabricated hemispherical silicon photodetector array combined with patterned silver reflectors (HPA-AgR). The inset shows an individual photodiode pixel with a circuit diagram. (j) Exploded structure of the device with the detailed thickness of each component. (k) Schematic illustration showing the artificial feline eye–inspired vision system. (l) The tested image with the ground truth (GT) image and noisy GT of letters (i.e., F, O, C, U, and S) obtained with a small CP and VP. (m) Optical simulation results for the dataset with small CP and VP for each label. (n) The calculated accuracy rates for the Fashion-MNIST dataset from image simulations, both with and without noise. Copyright 2024, American Association for the Advancement of Science[10].

    Apart from the improvement in optics, an ultrathin photodiode array with artificial reflectors was constructed to mimic the tapetum lucidum of nocturnal animals, as shown in Fig. 1(i). The array is a curved image sensor made of active silicon photodiodes. To enhance light absorption, especially in low-light environments, a silver reflective layer is added to the backside of the diodes. Fig. 1(j) shows the exploded structure of the device, which contains multiple ultrathin layers encapsulated in polyimide (PI). This ultrathin design provides significant mechanical toughness, allowing the device to conform to a spherical surface and potentially match the curved focal plane. This approach enhances the mechanical strength of ultrathin flexible devices. Fig. 1(k) presents the feline eye–inspired imaging system, which consists of two main components: a hemispherical silicon photodetector array combined with patterned silver reflectors (HPA-AgR) serving as the image sensor, and an optical system that features adjustable apertures (i.e., small CP and VP). Fig. 1(l) displays the results of the target recognition test performed with the artificial bionic system, evaluated against ground-truth images of letters mixed with noise. When letter images serve as targets, the system with a vertical pupil demonstrates superior camouflage-breaking capability against a random noise background compared with the system with a small circular pupil. Fig. 1(m) presents the results of the artificial bionic system with either small circular or vertical pupils in reproducing ground-truth images against mixed noise, where the latter demonstrates a superior ability to break camouflage in random noise backgrounds, leading to significantly higher recognition accuracy under noisy backgrounds (Fig. 1(n)).
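
    The benefit of a back reflector can be illustrated with a simple double-pass absorption estimate. The following is only a minimal sketch based on the Beer-Lambert law; it ignores front-surface reflection and thin-film interference, and the absorption coefficient, layer thickness, and reflectivity are hypothetical values chosen for illustration, not parameters reported for the HPA-AgR device.

```python
import math


def absorbed_fraction(alpha_per_um, thickness_um, back_reflectivity=0.0):
    """Fraction of incident light absorbed in a thin absorbing layer.

    Beer-Lambert estimate with an optional second pass for light that is
    reflected by a mirror on the backside of the layer.
    """
    transmitted = math.exp(-alpha_per_um * thickness_um)  # survives one pass
    first_pass = 1.0 - transmitted
    # Light that survived the first pass is reflected and absorbed on the way back.
    second_pass = transmitted * back_reflectivity * (1.0 - transmitted)
    return first_pass + second_pass


# Hypothetical values for illustration only (not taken from the source).
ALPHA_PER_UM = 0.1      # assumed absorption coefficient
THICKNESS_UM = 1.0      # assumed absorber thickness
REFLECTIVITY = 0.95     # assumed mirror reflectivity

without_mirror = absorbed_fraction(ALPHA_PER_UM, THICKNESS_UM)
with_mirror = absorbed_fraction(ALPHA_PER_UM, THICKNESS_UM, REFLECTIVITY)
print(f"absorbed without reflector: {without_mirror:.3f}")
print(f"absorbed with reflector:    {with_mirror:.3f}")
```

    In this simplified model the gain from the reflector is largest when the layer is thin and weakly absorbing, since most of the light then survives the first pass and would otherwise be lost.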

    In summary, this artificial vision system closely resembles cat eyes in both structure and function, demonstrating a superior ability to filter redundant information and detect camouflaged objects under diverse lighting, and outperforming conventional circular-pupil designs at the hardware level in this regard. The cat-eye-inspired bionic system enhances the signal-to-noise ratio of the input image at the hardware level by using its unique depth-of-field control to break camouflage at specific distances. This sensor-level image preprocessing can therefore reduce the computational burden on the processor, which with conventional sensors would consume extra power on edge detection and feature extraction in order to perform neural network tasks such as classification or recognition. Consequently, this approach has the potential to save computational resources and reduce power consumption in machine vision. However, the primary limitation is the low pixel density and resolution, which are by no means comparable to those of current CMOS sensors. This is a persistent challenge arising from still-immature flexible manufacturing technology, in addition to the limited FoV, which could potentially be compensated for by mechanical eye movement[11, 12].

    Smaller ambush predators have usually evolved vertical pupils (VPs), which provide a spatially asymmetric depth of field (DoF) and greater flexibility in controlling focus. This adaptation makes it easier to detect prey camouflaged in its natural environment[8]. Fig. 1(a) shows that during the day, conventional circular pupils (CPs) can lead to an overload of sharp detail, making targets such as rats less noticeable. In contrast, the vertical pupils of felines keep the background blurry while maintaining clarity on the target. Furthermore, in nocturnal animals such as domestic cats, owls, and nightingales, the tapetum lucidum behind the retina enhances light absorption by reflection, thereby improving visual sensitivity[9], as illustrated in Fig. 1(b). Inspired by these characteristics of feline vision, Kim et al. designed a bionic artificial imaging system incorporating a slit-like elliptical aperture and a hemispherical silicon photodiode array, which is advantageous for filtering out redundant information and detecting camouflaged targets under varying light intensities[10].

    Advanced machine vision provides a direct and fast way to perceive the external environment, enabling rapid progress in state-of-the-art applications such as autonomous driving, environmental monitoring, and human-machine interaction. However, detecting and recognizing objects against complex backgrounds usually requires high-dynamic-range imaging and complex algorithms, which poses tremendous challenges for further reducing the size, weight, and power (SWaP) of sensory systems. If the target object and background can be easily distinguished at the sensor hardware level, a great deal of computational resources and power can be saved[1, 2]. Inspired by biological eyes, which combine simple structures with high environmental adaptability, artificial biomimetic vision systems have been realized in recent years, featuring the wide field of view (FoV) of fish[3], the telescopic vision of eagles[4, 5], the amphibious adaptability of crabs[6], and the high dynamic range of mantis shrimp[7], among others. The camouflage-breaking vision of predators is also providing insights for designing advanced imaging sensor hardware.

    Fig. 1(c) shows the asymmetric DoF created by the vertical pupil of felines and its impact on imaging characteristics. Contraction of the vertical pupil results in a sagittal depth of field (S-DoF) that is longer than the tangential depth of field (T-DoF). Consequently, the target positioned at (ⅱ) is clearly imaged because it falls within both DoF ranges, while the targets at positions (ⅰ) and (ⅲ) lie outside the T-DoF and are therefore blurred. To validate the asymmetric DoF associated with the vertical pupil, an optical system with a working distance of 200 mm was constructed. Figs. 1(d) and 1(e) depict the relationship between focal spot size and object distance in the horizontal and vertical directions, respectively. In the vertical direction, the limited DoF makes it challenging to maintain a clear image across the object distance range of 150 to 200 mm. Optical systems with circular and vertical apertures were then used to image a cross pattern at object distances of 150, 200, and 250 mm. With the small circular pupil, the cross is clear at all distances (Fig. 1(f)); with the vertical pupil, it becomes blurry at object distances of 150 and 250 mm, and only in its horizontal direction (Fig. 1(g)), confirming the asymmetric DoF of the vertical pupil. Fig. 1(h) illustrates the imaging of a central texture target against background noise under varying light intensities and pupil constriction states. As in the biological pupillary light reflex, the pupil constricts in response to increasing light intensity. For a circular pupil, this constriction extends the depth of field equally in all directions, thereby bringing the background into focus and impairing target discrimination. Conversely, a vertically oriented pupil constricts primarily in the horizontal direction, extending only the S-DoF. This configuration ensures that the target consistently remains within the focused DoF while the background stays outside the T-DoF. As a result, the target pattern can be distinguished easily under all light conditions with the vertical pupil, whereas it blends completely into the background when a circular pupil with a limited opening ratio is used.
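
    The origin of the asymmetric DoF can be sketched with a thin-lens, geometric-optics estimate: for a given defocus, the blur width along each axis scales with the aperture width along that axis, so a slit-like pupil blurs out-of-focus objects much more along its long axis than along its narrow axis. The sketch below is only an illustration under these assumptions. The 200 mm focus distance and the 150, 200, and 250 mm object distances follow the text above, while the focal length and the slit dimensions are hypothetical values, not parameters of the reported system.

```python
def image_distance(focal_length_mm, object_distance_mm):
    """Thin-lens image distance from 1/f = 1/s + 1/v."""
    return focal_length_mm * object_distance_mm / (object_distance_mm - focal_length_mm)


def blur_width(aperture_mm, focal_length_mm, focus_distance_mm, object_distance_mm):
    """Geometric blur width on the sensor along one aperture axis.

    The sensor sits where an object at focus_distance_mm is sharp; the cone of
    rays from an object at another distance is cut by the sensor plane.
    """
    v_sensor = image_distance(focal_length_mm, focus_distance_mm)
    v_object = image_distance(focal_length_mm, object_distance_mm)
    return aperture_mm * abs(v_object - v_sensor) / v_object


# Hypothetical optics for illustration only (not taken from the source).
FOCAL_LENGTH_MM = 20.0
SLIT_WIDTH_MM = 1.0    # narrow (horizontal) axis of the vertical pupil
SLIT_HEIGHT_MM = 8.0   # long (vertical) axis of the vertical pupil
FOCUS_DISTANCE_MM = 200.0  # working distance stated in the text

for distance_mm in (150.0, 200.0, 250.0):
    narrow = blur_width(SLIT_WIDTH_MM, FOCAL_LENGTH_MM, FOCUS_DISTANCE_MM, distance_mm)
    long_axis = blur_width(SLIT_HEIGHT_MM, FOCAL_LENGTH_MM, FOCUS_DISTANCE_MM, distance_mm)
    print(f"object at {distance_mm:.0f} mm: blur along narrow axis {narrow:.3f} mm, "
          f"along long axis {long_axis:.3f} mm")
```

    In this model the object at the focus distance is sharp along both axes, while objects in front of or behind it are blurred far more along the long axis of the slit, which is the qualitative behavior that the cross-pattern test probes.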

    Shengqiang Zhang, Zhuoran Wang. Artificial cat eye camera for objects detection against complex backgrounds and varied lighting[J]. Journal of Semiconductors, 2024, 45(12): 120401

    Paper Information

    Category: Research Articles

    Received: Sep. 28, 2024

    Published Online: Jan. 15, 2025

    The Author Email: Wang Zhuoran (ZRWang)

    DOI:10.1088/1674-4926/24090053
