Abstract:
In everyday communication, sign language expression is often ambiguous, and the semantics of substandard gestures are easily confused. At the same time, it is difficult to obtain sufficient features for training a sign language recognition model from limited samples, and an overly complex model easily overfits, leading to low recognition accuracy. To address this problem, we propose a representation learning method that expands the fault-tolerant features of substandard sign language recognition under limited samples. Based on human skeleton information and the spatiotemporal correlation of sign language, the method constructs an autoencoder to extract standard features from a small number of original samples in a sign language corpus; a large number of substandard samples are then generated from the standard features by a generative adversarial network, and fault-tolerant features are extracted from them by the autoencoder to build a new feature set for subsequent sign language recognition tasks. Experimental results show that, under the limited-sample condition, the semantics of the generated samples are clear and the features of different semantics in the new feature set are easily separated. A sign language recognition model trained on the fault-tolerant feature set built by this method on the CSL dataset achieves 97.5% recognition accuracy, indicating broad application prospects.
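The pipeline summarized above (autoencoder feature extraction, GAN-style generation of substandard samples, and re-encoding into a fault-tolerant feature set) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the dimensions, the linear autoencoder, and the noise-based stand-in for the learned generator are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a skeleton sequence flattened to D_IN values,
# compressed to a D_LAT-dimensional standard feature (illustrative only).
D_IN, D_LAT = 50, 8

# --- Linear autoencoder, trained by plain gradient descent ---
W_enc = rng.normal(0, 0.1, (D_IN, D_LAT))
W_dec = rng.normal(0, 0.1, (D_LAT, D_IN))

def encode(x):
    return x @ W_enc

def decode(z):
    return z @ W_dec

# Tiny synthetic "standard" corpus, mimicking the limited-sample setting.
X = rng.normal(0, 1, (20, D_IN))

lr = 0.01
for _ in range(500):
    Z = encode(X)
    err = decode(Z) - X                       # gradient of 0.5*||X_hat - X||^2
    g_dec = Z.T @ err / len(X)
    g_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc

# --- Generator stand-in: perturb a standard latent code to imitate
# substandard gestures (the paper trains a GAN to do this). ---
def generate_substandard(z, n=50, scale=0.3):
    noise = rng.normal(0, scale, (n, z.shape[0]))
    return decode(z + noise)                  # many substandard samples

# --- Fault-tolerant feature set: re-encode the generated samples. ---
z_std = encode(X[0])
substandard_samples = generate_substandard(z_std)
fault_tolerant_feats = encode(substandard_samples)
print(fault_tolerant_feats.shape)             # (50, 8)
```

The re-encoded features cluster around the standard latent code, so a downstream classifier trained on them can tolerate gesture variations that the small original corpus never contained.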