    Deng Xubin, Zhu Yangyong. ReDE: A Regular Expression-Based Method for Extracting Biological Data[J]. Journal of Computer Research and Development, 2005, 42(12): 2184-2191.

    ReDE: A Regular Expression-Based Method for Extracting Biological Data

    • Extracting data from heterogeneous biological data sources to build a query and analysis platform for biologists is an active research topic. In general, the data extraction process involves many interdependent pieces of metadata. Exploiting the dependencies among them, so that one piece of metadata is generated from another, can reduce the metadata maintenance overhead. However, many data extraction methods overlook these dependencies and require considerable effort to construct and maintain a large amount of metadata. This paper proposes a regular expression (RE) based method, named ReDE, that avoids this drawback: by building a parse tree for the RE groups, it derives an RE-based algorithm for generating a relational database schema and a general algorithm for extracting and assembling data. The novelty is that the RE is the only metadata required, and it is relatively easy to manage and maintain. The method can serve as the basis for a tool that aids biological database design and for a highly automated data extraction tool, and it has been applied to extract data for China's first online integrated biological data warehouse.
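The idea in the abstract can be illustrated with a minimal Python sketch. It is not the ReDE algorithm itself, whose details the abstract does not give. It assumes a hypothetical flat-file record format, a hypothetical pattern with named groups, and a simplified rule in which every group that contains subgroups becomes a table whose columns are its direct subgroups. The helper names (group_tree, schema_from_tree, extract) are invented for illustration, and the group scanner assumes the pattern has no parentheses inside character classes.

```python
import re

# Hypothetical record format and pattern; not taken from the paper.
PATTERN = (r"(?P<entry>ID\s+(?P<id>\w+);\s+OS\s+(?P<organism>[^;]+);"
           r"\s+LEN\s+(?P<length>\d+))")
TEXT = "ID P12345; OS Homo sapiens; LEN 120\nID Q67890; OS Mus musculus; LEN 98\n"


def group_tree(pattern):
    """Map each named group to its nearest enclosing named group (or None)."""
    parents = {}
    stack = []  # one entry per open parenthesis: group name, or None if unnamed
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the escaped character
            continue
        if ch == "(":
            m = re.match(r"\(\?P<(\w+)>", pattern[i:])
            name = m.group(1) if m else None
            if name:
                parents[name] = next((n for n in reversed(stack) if n), None)
            stack.append(name)
        elif ch == ")":
            stack.pop()
        i += 1
    return parents


def schema_from_tree(parents):
    """Simplified rule: a group with subgroups is a table; subgroups are columns."""
    tables = {}
    for name, parent in parents.items():
        if parent is not None:
            tables.setdefault(parent, []).append(name)
    return tables


def extract(pattern, text, tables):
    """Fill one row per match into each table derived from the pattern."""
    rows = {table: [] for table in tables}
    for m in re.finditer(pattern, text):
        for table, cols in tables.items():
            rows[table].append({c: m.group(c) for c in cols})
    return rows


tree = group_tree(PATTERN)
tables = schema_from_tree(tree)
rows = extract(PATTERN, TEXT, tables)
```

In this sketch the pattern is the only thing a maintainer edits: the table layout and the extraction step are both derived from it, which is the dependency between metadata that the abstract describes.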