Image-free cross-species pose estimation via an ultra-low sampling rate single-pixel camera

Xin Wu; Cheng Zhou; Binyu Li; Jipeng Huang; Yanli Meng; Lijun Song; Shensheng Han

doi:10.3788/COL202523.091101

Chinese Optics Letters, Volume 23, Issue 9, 091101 (2025)

Xin Wu¹, Cheng Zhou¹,*, Binyu Li², Jipeng Huang¹,**, Yanli Meng¹, Lijun Song³,***, and Shensheng Han⁴

Author Affiliations

¹School of Physics, Northeast Normal University, Changchun 130024, China

²Beijing Institute of Space Mechanics and Electricity, Beijing 100094, China

³Changchun Institute of Technology, Changchun 130012, China

⁴Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Shanghai 201800, China

Figures & Tables(9)

Fig. 1. Overview of plane-array-camera-based and single-pixel camera (SPC)-based pose estimation models. (a) Architecture of the plane-array-camera-based (image-based) pose estimation model. (b) Architectures of SPC-based (image-free) pose estimation models: a model that performs reconstruction before pose estimation, a two-stage image-free model with implicit reconstruction, and our proposed image-free and (implicit) reconstruction-free model.

Fig. 2. Overview of SPCPose. (a) Schematic of the SPC structure. (b) Architecture of our proposed image-free and (implicit) reconstruction-free cross-species pose estimation model.

Fig. 3. Visualization of SPCPose on the public datasets. (a) Visualization of SPCPose on Tri-Mouse. (b) Visualization of SPCPose on Horse10. M1, M2, and M3 denote natural order, reverse order, and random order, respectively. The numbers on each image indicate the sampling rate.

Fig. 4. Visualization of SPCPose on the indoor Human dataset. The numbers on each image indicate the sampling rate.

Fig. 5. Real-world experimental results. (a) Schematic of our customized SPC and the captured scenes. (b) RGB-camera captures of the same object performing different actions in three real-world scenes. (c) Scenes reconstructed with our single-pixel detection system and pose estimation results from an image-based approach. (d) Natural-order single-pixel detection value map representation and the corresponding SPCPose pose estimation results. (e) Inverse-order single-pixel detection value map representation and the corresponding SPCPose pose estimation results. (f) Random-order single-pixel detection value map representation and the corresponding SPCPose pose estimation results.

Fig. 6. Effect of extraction methods on SPCPose’s performance in parsing object poses at 256 sample points. (a) Visualization of SPCPose results with different extraction methods. The numbers on each image indicate the sampling rate. (b) Objective evaluation of SPCPose performance with different extraction methods.
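
The captions name three extraction orders (M1 natural, M2 reverse, M3 random) without showing how they differ. A minimal sketch follows, under the assumption that each order decides which m of the n single-pixel detection values are kept. The helper `select_measurements`, its parameters, and the fixed seed are hypothetical and are not taken from the paper.

```python
import random


def select_measurements(values, m, order="natural", seed=0):
    """Keep m of the n detection values (hypothetical helper, not the authors' code)."""
    n = len(values)
    if not 0 < m <= n:
        raise ValueError("m must satisfy 0 < m <= n")
    if order == "natural":
        # Assumed M1: the first m values in acquisition order.
        idx = list(range(m))
    elif order == "reverse":
        # Assumed M2: the last m values, taken back to front.
        idx = list(range(n - 1, n - 1 - m, -1))
    elif order == "random":
        # Assumed M3: m distinct indices drawn with a fixed seed for repeatability.
        idx = random.Random(seed).sample(range(n), m)
    else:
        raise ValueError(f"unknown order: {order}")
    return [values[i] for i in idx]


if __name__ == "__main__":
    detections = list(range(10))  # stand-in for real detector readings
    for name in ("natural", "reverse", "random"):
        print(name, select_measurements(detections, 3, name))
```

The three branches differ only in the index set, so the downstream network would see the same number of values whichever order is used.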

Download full size

View in Article

Table 1. Performance Comparison of Image-Based and Image-Free Pose Estimation Regarding AP, PCK, AUC, EPE, Params, and GFLOPs on Tri-Mouse^[6,51]^a

Method	Sampling rate	Backbone	AP@50-95	PCK@0.05	AUC	EPE	Params (M)	GFLOPs
Heatmap^[52]	1.000	HRNet-W32	99.9	99.8	91.4	1.91	28.5	10.2
Heatmap^[52]	1.000	HRNet-W48	100.0	99.8	92.2	1.67	63.6	21.0
RLE^[48]	1.000	ResNet50	98.4	95.8	86.4	3.43	23.7	5.4
RLE^[48]	1.000	ResNet101	99.3	97.4	87.7	3.01	42.7	10.2
RLE^[48]	1.000	ResNet152	98.9	96.4	87.5	3.08	58.3	15.1
SPCPose-M1	5.333 × 10⁻²	ViT-S	99.9	95.9	91.4	1.90	24.3	5.9
SPCPose-M1	1.333 × 10⁻²	ViT-S	96.5	99.9	88.5	3.56	24.3	5.9
SPCPose-M1	3.333 × 10⁻³	ViT-S	53.0	54.8	54.5	25.39	24.3	5.9
SPCPose-M1	8.333 × 10⁻⁴	ViT-S	78.4	79.7	77.6	14.59	24.3	5.9
SPCPose-M2	1.333 × 10⁻²	ViT-S	100.0	100.0	93.9	1.13	24.3	5.9
SPCPose-M2	3.333 × 10⁻³	ViT-S	100.0	100.0	92.8	1.49	24.3	5.9
SPCPose-M2	8.333 × 10⁻⁴	ViT-S	100.0	100.0	93.4	1.30	24.3	5.9
SPCPose-M3	1.333 × 10⁻²	ViT-S	100.0	99.7	91.5	1.87	24.3	5.9
SPCPose-M3	3.333 × 10⁻³	ViT-S	100.0	99.8	91.3	1.93	24.3	5.9
SPCPose-M3	8.333 × 10⁻⁴	ViT-S	100.0	100.0	93.4	1.28	24.3	5.9
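
As a reading aid for the sampling-rate column: the sampling rate of a single-pixel camera is conventionally the number of measurements divided by the number of scene pixels. The sketch below is illustrative only. The 640 × 480 pixel count and the measurement counts 16384, 4096, 1024, and 256 are assumptions, chosen because they reproduce the four Tri-Mouse rates listed above. They are not values stated in this section, which mentions only a 256-sample setting (Fig. 6 caption).

```python
def sampling_rate(num_measurements, num_pixels):
    """Ratio of single-pixel measurements to scene pixels."""
    if num_pixels <= 0:
        raise ValueError("num_pixels must be positive")
    if num_measurements < 0:
        raise ValueError("num_measurements must be non-negative")
    return num_measurements / num_pixels


# Assumption: a 640 x 480 scene; this resolution is inferred, not given in the source.
ASSUMED_PIXELS = 640 * 480

if __name__ == "__main__":
    # Assumed measurement counts, each a factor of 4 apart.
    for m in (16384, 4096, 1024, 256):
        print(f"{m:>5} measurements -> {sampling_rate(m, ASSUMED_PIXELS):.3e}")
```

Under these assumptions each step down the column keeps a quarter of the measurements of the row above it.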

Table 2. Performance Comparison of Image-Based and Image-Free Pose Estimation Regarding AP, PCK, AUC, EPE, Params, and GFLOPs on Horse10^[51]^a

Method	Sampling rate	Backbone	AP@50-95	PCK@0.05	AUC	EPE	Params (M)	GFLOPs
Heatmap^[52]	1.000	HRNet-W32	98.3	99.9	93.6	1.14	28.5	10.2
Heatmap^[52]	1.000	HRNet-W48	98.3	99.9	93.7	1.10	63.6	21.0
RLE^[48]	1.000	ResNet50	95.1	99.6	93.8	1.44	23.7	5.4
RLE^[48]	1.000	ResNet101	96.1	99.7	93.9	1.36	42.7	10.2
RLE^[48]	1.000	ResNet152	96.5	99.7	94.0	1.30	58.3	15.1
SPCPose-M1	3.512 × 10⁻¹	ViT-S	87.6	94.0	88.1	2.96	24.3	5.9
SPCPose-M1	8.779 × 10⁻²	ViT-S	76.0	86.6	83.0	4.85	24.3	5.9
SPCPose-M1	2.195 × 10⁻²	ViT-S	67.4	81.5	80.2	5.83	24.3	5.9
SPCPose-M1	5.487 × 10⁻³	ViT-S	72.6	83.2	81.0	5.65	24.3	5.9
SPCPose-M2	8.779 × 10⁻²	ViT-S	92.7	97.7	91.3	1.87	24.3	5.9
SPCPose-M2	2.195 × 10⁻²	ViT-S	89.4	94.9	88.5	2.83	24.3	5.9
SPCPose-M2	5.487 × 10⁻³	ViT-S	83.3	90.3	85.0	4.10	24.3	5.9
SPCPose-M3	8.779 × 10⁻²	ViT-S	94.7	98.1	93.4	1.52	24.3	5.9
SPCPose-M3	2.195 × 10⁻²	ViT-S	91.1	96.5	90.3	2.20	24.3	5.9
SPCPose-M3	5.487 × 10⁻³	ViT-S	89.6	95.8	89.6	2.45	24.3	5.9

Table 3. Performance of Image-Free Pose Estimation on Our Self-Created Human Dataset with Different Extraction Methods including SPCPose-M1, SPCPose-M2, and SPCPose-M3, Evaluated in Terms of AP, PCK, AUC, EPE, Params, and GFLOPs

Method	Sampling rate	Backbone	AP@50-95	PCK@0.05	AUC	EPE	Params (M)	GFLOPs
SPCPose-M1	4.006 × 10⁻²	ViT-S	58.0	72.9	56.1	15.97	24.3	5.9
SPCPose-M1	1.002 × 10⁻²	ViT-S	37.2	55.7	43.0	24.04	24.3	5.9
SPCPose-M1	2.504 × 10⁻³	ViT-S	27.3	47.7	37.0	27.92	24.3	5.9
SPCPose-M1	6.260 × 10⁻⁴	ViT-S	47.3	64.3	50.1	19.05	24.3	5.9
SPCPose-M2	1.002 × 10⁻²	ViT-S	68.4	78.5	62.7	12.96	24.3	5.9
SPCPose-M2	2.504 × 10⁻³	ViT-S	63.5	73.7	59.0	14.06	24.3	5.9
SPCPose-M2	6.260 × 10⁻⁴	ViT-S	41.1	48.7	45.4	22.18	24.3	5.9
SPCPose-M3	1.002 × 10⁻²	ViT-S	70.4	80.1	63.2	12.52	24.3	5.9
SPCPose-M3	2.504 × 10⁻³	ViT-S	69.2	79.5	62.6	12.50	24.3	5.9
SPCPose-M3	6.260 × 10⁻⁴	ViT-S	56.3	69.7	54.5	15.87	24.3	5.9

Xin Wu, Cheng Zhou, Binyu Li, Jipeng Huang, Yanli Meng, Lijun Song, Shensheng Han, "Image-free cross-species pose estimation via an ultra-low sampling rate single-pixel camera," Chin. Opt. Lett. 23, 091101 (2025)

Paper Information

Category: Imaging Systems and Image Processing

Received: Apr. 1, 2025

Accepted: May 9, 2025

Published Online: Aug. 22, 2025

Corresponding author emails: Cheng Zhou (zhoucheng91210@163.com), Jipeng Huang (huangjp848@nenu.edu.cn), Lijun Song (ccdxslj@126.com)

DOI:10.3788/COL202523.091101

CSTR:32184.14.COL202523.091101
