References

This page provides a comprehensive list of references used for CASEset, including academic papers, datasets, and technical resources focused on context-aware gaze estimation and computer vision.

[1] Tran, M., & Milkowski, L. (2024). CASE: Context Aware Screen-Based Estimation of Gaze. Eighth IEEE International Conference on Robotic Computing (IRC), Tokyo, Japan, 112-113.

https://ieeexplore.ieee.org/document/10818040

[2] Bao, Y., Cheng, Y., Liu, Y., & Lu, F. (2022). Adaptive Feature Fusion Network for Gaze Tracking in Mobile Tablets. 2022 26th International Conference on Pattern Recognition (ICPR), 1473-1479.

https://doi.org/10.1109/ICPR56361.2022.9956543

[3] Cheng, Y., Wang, H., Bao, Y., & Lu, F. (2022). Appearance-Based Gaze Estimation With Deep Learning: A Review and Benchmark. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(11), 8428-8448.

https://doi.org/10.1109/TPAMI.2021.3111128

[4] Chen, Z., & Shi, B. E. (2023). Towards High Performance Low Complexity Calibration in Appearance Based Gaze Estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(3), 3817-3829.

https://doi.org/10.1109/TPAMI.2022.3182940

[5] Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., et al. (2021). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. International Conference on Learning Representations (ICLR).

Available on arXiv.org.

[6] Wang, W., Xie, E., Li, X., Fan, D. P., Song, K., Liang, D., et al. (2022). PVT v2: Improved Baselines with Pyramid Vision Transformer. Computational Visual Media, 8(3), 415-424.

https://doi.org/10.1007/s41095-022-0274-8

[7] Li, J., Li, D., Savarese, S., & Hoi, S. (2023). BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models. International Conference on Machine Learning (ICML).

Available on arXiv.org.

[8] Huang, S., Dong, L., Wang, W., Hao, Y., Singhal, S., Ma, S., et al. (2023). Language Is Not All You Need: Aligning Perception with Language Models. Advances in Neural Information Processing Systems (NeurIPS), 36.

Available on arXiv.org.

[9] Biten, A. F., Gómez, L., Rusiñol, M., & Karatzas, D. (2022). Scene Text Visual Question Answering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(7), 4073-4086.

https://doi.org/10.1109/TPAMI.2021.3055735

[10] Zhang, X., Park, S., Beeler, T., Bradley, D., Tang, S., & Hilliges, O. (2020). ETH-XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head Poses and Gaze Directions. European Conference on Computer Vision (ECCV), 365-381.

https://doi.org/10.1007/978-3-030-58558-7_22

[11] Fischer, T., Chang, H. J., & Demiris, Y. (2022). RT-BENE: A Dataset and Baselines for Real-Time Blink Estimation in Natural Environments. IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 1134-1143.

https://doi.org/10.1109/WACV51458.2022.00120

[12] Kothari, R., Yang, Z., Kanan, C., Bailey, R., Pelz, J. B., & Diaz, G. J. (2020). Gaze-in-Wild: A Dataset for Studying Eye and Head Coordination in Everyday Activities. Scientific Reports, 10(1), 2539.

https://doi.org/10.1038/s41597-020-0443-0

[13] Mehta, S., & Rastegari, M. (2022). MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer. International Conference on Learning Representations (ICLR).

Available on arXiv.org.

[14] Cai, H., Gan, C., Wang, T., Zhang, Z., & Han, S. (2020). Once-for-All: Train One Network and Specialize it for Efficient Deployment. International Conference on Learning Representations (ICLR).

Available on arXiv.org.

[15] Lin, J., Tang, J., Tang, H., Yang, S., Dang, X., & Han, S. (2023). AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration. arXiv preprint arXiv:2306.00978.

https://arxiv.org/abs/2306.00978

[16] Huang, M. X., Li, J., Ngai, G., Leong, H. V., & Bulling, A. (2022). Moment-to-Moment Detection of Internal Thought from Eye Vergence Behaviour. ACM Transactions on Computer-Human Interaction, 29(4), 1-49.

[17] Brousseau, B., Rose, J., & Eizenman, M. (2020). Accurate Model-Based Point of Gaze Estimation on Mobile Devices. Vision Research, 175, 1-9.

https://doi.org/10.1016/j.visres.2020.06.008

[18] Pathirana, P. N., Senarath, S., Meedeniya, D., & Jayarathna, S. (2022). Eye Gaze Estimation: A Survey on Deep Learning-Based Approaches. Expert Systems with Applications, 199, 116894.

https://doi.org/10.1016/j.eswa.2022.116894

[19] Park, S., De Mello, S., Molchanov, P., Iqbal, U., Hilliges, O., & Kautz, J. (2019). Few-Shot Adaptive Gaze Estimation. IEEE/CVF International Conference on Computer Vision (ICCV), 9367-9376.

https://doi.org/10.1109/ICCV.2019.00946

[20] Krafka, K., Khosla, A., Kellnhofer, P., Kannan, H., Bhandarkar, S., Matusik, W., & Torralba, A. (2016). Eye Tracking for Everyone. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2176-2184.

https://doi.org/10.1109/CVPR.2016.239
