Song Kehui, Zhang Ying, Zhang Jiangwei, Yuan Xiaojie. A Generative Model for Synthesizing Structured Datasets Based on GAN[J]. Journal of Computer Research and Development, 2019, 56(9): 1832-1842. DOI: 10.7544/issn1000-1239.2019.20180353

A Generative Model for Synthesizing Structured Datasets Based on GAN

Funds: This work was supported by the National Natural Science Foundation of China (61772289, U1836109).
  • Published Date: August 31, 2019
  • Synthesizing high-quality datasets has been a long-standing challenge in both the machine learning and database communities. One application of high-quality dataset synthesis is improving model training, especially for deep learning models. A robust training process requires a large annotated dataset. One way to acquire a large annotated training set is manual annotation by domain experts, which is expensive and error-prone. As an alternative, automatically synthesizing a high-quality dataset similar to the original is much more practical. Driven by the rapid development of computer vision, some effort has been devoted to synthesizing image datasets. However, those models cannot be applied directly to structured data (tables of numeric and categorical attributes), and little effort has been devoted to such tables. We therefore propose TableGAN, the first generative model from the GAN family for such tables, which uses an adversarial learning mechanism to improve the performance of the generative model. TableGAN modifies the internal structure of the traditional GAN, including the optimization function, to target numeric and categorical tables, so that it can synthesize more high-quality training samples and improve the effectiveness of the trained models. Extensive experiments on real datasets show significant performance improvements for models trained on the enlarged training datasets, verifying the effectiveness of TableGAN.
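
For context, the "traditional GAN" optimization function that the abstract says TableGAN modifies is usually written as the minimax objective below. This is the standard formulation from the GAN literature, not the modified TableGAN objective, which the abstract does not state. The symbols G (generator), D (discriminator), p_data (real data distribution), and p_z (noise prior) are the conventional ones and are assumed here rather than taken from this page.

```latex
\min_{G}\max_{D} V(D,G)
  = \mathbb{E}_{x\sim p_{\mathrm{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z\sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

In a tabular setting, x would presumably be an encoded row of the real table and G(z) a synthesized row. How TableGAN encodes numeric and categorical attributes and how it alters this objective is described in the full paper, not in this abstract.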