Wu Jinhui, Jiang Yuan. Universal Approximation and Approximation Advantages of Quaternion-Valued Neural Networks[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440410

Universal Approximation and Approximation Advantages of Quaternion-Valued Neural Networks

Funds: This work was supported by the National Natural Science Foundation of China (62176117) and the Program for Outstanding PhD Candidates of Nanjing University (202401A13).
  • Author Bio:

    Wu Jinhui: born in 1998. PhD candidate. His main research interests include neural network theory and machine learning.

    Jiang Yuan: born in 1976. PhD, professor, PhD supervisor. Her main research interests include artificial intelligence, machine learning, and intelligent medical applications.

  • Received Date: May 30, 2024
  • Revised Date: October 07, 2024
  • Accepted Date: October 15, 2024
  • Available Online: October 15, 2024
Abstract: Quaternion-valued neural networks extend real-valued neural networks to the algebra of quaternions. In some tasks, such as singular point compensation in polarimetric-interferometric synthetic aperture radar, spoken language understanding, and robot control, quaternion-valued neural networks achieve higher accuracy or faster convergence than real-valued ones. Their performance is widely supported by empirical studies, but their theoretical properties have received little attention, especially the question of why quaternion-valued neural networks can be more efficient than real-valued ones. In this paper, we investigate the theoretical properties of quaternion-valued neural networks, and their advantages over real-valued neural networks, from the perspective of approximation. First, we prove the universal approximation property of quaternion-valued neural networks with a non-split ReLU (rectified linear unit)-type activation function. Second, we demonstrate their approximation advantages over real-valued neural networks. For split ReLU-type activation functions, we show that one-hidden-layer real-valued neural networks need about 4 times as many parameters to attain the same maximum number of convex linear regions as one-hidden-layer quaternion-valued neural networks. For the non-split ReLU-type activation function, we prove an approximation separation between one-hidden-layer quaternion-valued and real-valued neural networks: a quaternion-valued neural network can express a real-valued neural network with the same number of hidden neurons and the same parameter norm, whereas a real-valued neural network cannot approximate a quaternion-valued neural network unless its number of hidden neurons or its parameters are exponentially large. Finally, simulation experiments support our theoretical findings.
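The one-hidden-layer quaternion-valued network with a split ReLU activation discussed in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's notation: the Hamilton product is standard, but the weight layout (`W1`, `b1`, `W2`, `b2`) and the architecture details are assumptions made for demonstration.

```python
import numpy as np

def hamilton_product(p, q):
    # Hamilton product of quaternions p = a + bi + cj + dk and
    # q = w + xi + yj + zk, each stored as a length-4 real array.
    a, b, c, d = p
    w, x, y, z = q
    return np.array([
        a*w - b*x - c*y - d*z,
        a*x + b*w + c*z - d*y,
        a*y - b*z + c*w + d*x,
        a*z + b*y - c*x + d*w,
    ])

def split_relu(q):
    # Split ReLU: the real ReLU applied to each of the 4 components.
    return np.maximum(q, 0.0)

def qnn_one_hidden(x, W1, b1, W2, b2):
    # One-hidden-layer quaternion-valued network with split ReLU.
    # x: (n, 4) quaternion inputs; W1: (m, n, 4); b1: (m, 4);
    # W2: (m, 4); b2: (4,). Returns a single quaternion output.
    hidden = []
    for j in range(W1.shape[0]):
        s = b1[j].copy()
        for i in range(x.shape[0]):
            s = s + hamilton_product(W1[j, i], x[i])
        hidden.append(split_relu(s))
    out = b2.copy()
    for j, h in enumerate(hidden):
        out = out + hamilton_product(W2[j], h)
    return out
```

Each quaternion weight carries 4 real parameters, yet a single Hamilton product mixes all 4 components of its input at once; this coupling is the intuition behind the roughly 4-fold parameter saving stated above.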
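The notion of counting convex linear regions, central to the 4-times-parameters claim above, can be probed numerically. The sketch below is an illustration rather than the paper's proof technique: it counts the distinct ReLU activation patterns of a one-hidden-layer real-valued network by dense sampling. Each activation pattern corresponds to one convex linear region, and for hyperplanes in general position the count matches Zaslavsky's formula, the sum of C(m, i) for i from 0 to d.

```python
import numpy as np
from math import comb

def count_activation_patterns(W, b, n_samples=200_000, seed=0):
    # Count distinct ReLU activation patterns of x -> ReLU(W x + b)
    # by sampling inputs from a large box; each sign pattern of the
    # pre-activations identifies one convex linear region.
    rng = np.random.default_rng(seed)
    x = rng.uniform(-10.0, 10.0, size=(n_samples, W.shape[1]))
    patterns = (x @ W.T + b) > 0
    return len({tuple(row) for row in patterns})

# 3 hyperplanes in general position in R^2: x = 0, y = 0, x + y = 1.
W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([0.0, 0.0, -1.0])
# Zaslavsky's formula: C(3,0) + C(3,1) + C(3,2) = 7 regions.
print(count_activation_patterns(W, b))  # 7
```

A split ReLU applied to m quaternion neurons acts as 4m real ReLUs, so the induced hyperplane arrangement is 4 times larger for the same neuron count; this is the flavor of counting that underlies the comparison above.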
