

    Universal Approximation and Approximation Advantages of Quaternion-Valued Neural Networks


      Abstract: Quaternion-valued neural networks extend real-valued neural networks to the algebra of quaternions, and they achieve higher accuracy or faster convergence than real-valued neural networks in tasks such as singular point compensation in polarimetric synthetic aperture radar, spoken language understanding, and robot control. The performance of quaternion-valued neural networks is widely supported by empirical studies, but their theoretical properties, in particular why they can be more efficient than real-valued neural networks, have received little attention. In this paper, we investigate the theoretical properties of quaternion-valued neural networks and their advantages over real-valued neural networks from the perspective of approximation ability. First, we prove a universal approximation theorem for quaternion-valued neural networks with a non-split ReLU (rectified linear unit)-type activation function. Second, we demonstrate the approximation advantages of quaternion-valued neural networks over real-valued neural networks. For split ReLU-type activation functions, we show that a one-hidden-layer real-valued neural network needs about 4 times as many parameters to attain the same maximum number of convex linear regions as a one-hidden-layer quaternion-valued neural network. For the non-split ReLU-type activation function, we prove an approximation separation between one-hidden-layer quaternion-valued and one-hidden-layer real-valued neural networks: a quaternion-valued neural network can express a real-valued neural network with the same number of hidden neurons and the same parameter norm, whereas a real-valued neural network cannot approximate a quaternion-valued neural network unless it has exponentially many hidden neurons or exponentially large parameters. Finally, simulation experiments support our theoretical findings.
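
As a concrete illustration (not taken from the paper), the following numpy sketch implements a one-hidden-layer quaternion network with the split ReLU activation; the function names, layer sizes, and the component convention (w, x, y, z) are our own choices. It also shows the bookkeeping behind the roughly 4-fold parameter comparison: each quaternion weight acts on an input, via the Hamilton product, as a structured 4×4 real matrix while contributing only 4 free parameters, whereas an unconstrained real-valued map between the same 4-dimensional spaces would use 16.

```python
import numpy as np

def hamilton_product(q, p):
    """Hamilton product of quaternions given as arrays of components (w, x, y, z)."""
    a, b, c, d = q
    w, x, y, z = p
    return np.stack([
        a * w - b * x - c * y - d * z,   # real part
        a * x + b * w + c * z - d * y,   # i part
        a * y - b * z + c * w + d * x,   # j part
        a * z + b * y - c * x + d * w,   # k part
    ])

def split_relu(q):
    """Split ReLU: apply the real ReLU to each of the 4 components separately."""
    return np.maximum(q, 0.0)

def qnn_one_hidden_layer(x, W, b, v):
    """One-hidden-layer quaternion network with split ReLU (illustrative sizes).

    x: (4, n)     input vector of n quaternion coordinates
    W: (4, m, n)  m-by-n quaternion weight matrix of the hidden layer
    b: (4, m)     quaternion biases of the hidden layer
    v: (4, m)     quaternion output weights
    """
    m, n = W.shape[1], x.shape[1]
    # Pre-activation: quaternion matrix-vector product plus bias.
    z = np.zeros((4, m))
    for i in range(m):
        for j in range(n):
            z[:, i] += hamilton_product(W[:, i, j], x[:, j])
    z += b
    h = split_relu(z)
    # Output: quaternion linear combination of the hidden activations.
    y = np.zeros(4)
    for i in range(m):
        y += hamilton_product(v[:, i], h[:, i])
    return y

# Usage example with random parameters.
rng = np.random.default_rng(0)
n, m = 3, 5
x = rng.standard_normal((4, n))
W = rng.standard_normal((4, m, n))
b = rng.standard_normal((4, m))
v = rng.standard_normal((4, m))
print(qnn_one_hidden_layer(x, W, b, v))   # one quaternion output, shape (4,)
```

The non-split ReLU-type activation treated in the universal approximation theorem acts on a quaternion as a whole rather than on each component separately; its exact form follows the paper and is not reproduced in this sketch.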

       
