ISSN 1000-1239 CN 11-1777/TP

Table of Content

15 October 2007, Volume 44 Issue 10
Research Progress of Application-Driven High Productivity Computing System
Hong Xuehai, Zhan Jianfeng, Fan Jianping, and Zhang Zhihong
2007, 44(10):  1633-1639. 
Asbtract ( 440 )   HTML ( 1)   PDF (326KB) ( 451 )  
Related Articles | Metrics
The research and application of high productivity computing system is key field of large-scale computer system. Facing various application fields, how to design high productivity computing system, how to use large-scale computer system high efficiently and how to make large-scale computer system meet the need of the various applications, all have become the most interest topic in the HPC (high performance computing). Recent years witness the rapid progress and ambitious research plans in the field of HPCS (high productivity computing system). Focusing on the demands of large-scale parallel applications, the big challenges are analyzed and the main efforts and plans are surveyed in this paper. The latest progresses in computer architecture, programming environment, system management and reliability design are further discussed surrounding this topic. Finally, the future development of application-driven high productivity computing systems is outlined.
A Brief View on Requirements and Development of High Performance Computing Application
Zhao Yi, Zhu Peng, Chi Xuebin, Niu Tie, and Cao Zongyan
2007, 44(10):  1640-1646. 
Asbtract ( 757 )   HTML ( 3)   PDF (319KB) ( 1509 )  
Related Articles | Metrics
With the development of HPC (high performance computing) technologies, HPC has been used to solve the problems of scientific research and industry production in more and more fields. HPC applications have developed rapidly in many scientific fields and boosted significant technology innovations. Since 2004, the Supercomputing Center of CAS (the Chinese Academy of Sciences) has investigated the HPC application requirements of the Eleventh Five-Year Plan in CAS several times. The total HPC requirements of CAS have been calculated based on the collected detail requirements in each application field. The results provide references for building the HPC environment and developing the HPC applications of the Eleventh Five-Year Plan in CAS. In this paper the state-of-arts in the research and development of HPC applications is briefly introduced, the progress of building HPC environment and developing HPC applications in CAS is described, and the HPC application requirements of the Eleventh Five-Year Plan in CAS are also analyzed. Finally, the prospect of HPC application development is presented.
Parallel Computation for Particle Simulations Based on Object-Oriented Design
Cao Xiaolin, Zhang Aiqing, and Mo Zeyao
2007, 44(10):  1647-1651. 
Asbtract ( 417 )   HTML ( 0)   PDF (280KB) ( 479 )  
Related Articles | Metrics
Particle simulations including classical molecular dynamics (MD) and particle-in-cell (PIC) have some common characteristics, which include dynamic movement of particles, spatial locality of particle computation, etc. In consideration of these characteristics, firstly, particle-patch-data object is presented. The object is a group of particles in a single patch. The patch is a rectangular block including many cells. Secondly, a corresponding parallel algorithm is designed. The algorithm deals with dynamical load balancing, particles migration and data exchange among these objects. Finally, the algorithm is implemented on JASMIN infrastructure. Then, parallel classical molecular dynamics program and parallel PIC program are developed. Numerical experimental results of PIC program show that the parallel efficiency with 80% is demonstrated by simulating a complex physical model including 3 million cells and 20 million particles on 64 processors of one MPP.
Optimization Design of Turbine Engine Foundation on Grid
Cui Zhendong, and Wang Xicheng
2007, 44(10):  1652-1660. 
Asbtract ( 360 )   HTML ( 0)   PDF (519KB) ( 366 )  
Related Articles | Metrics
Optimization design plays a very important role in the engineering applications, which often involves huge computational effort and requires powerful computing environment. However, the distributed, dynamic and heterogeneous characteristics make it more difficult to make good use of the abundant idle resources. In order to integrate the massive idle resources to super-powerful environment by resources sharing and cooperative working on Internet, a four-layer high-performance grid platform is constructed for solving the complex engineering optimization problems. The dynamic analysis program developed is sealed to the nodes of the grid as black-box for computing dynamic response. An effective optimization method using Kriging model is proposed to do dynamic optimization design of the turbine engine foundation on the grid platform. The optimization problem is to find the design variables such that both the maximum dynamic displacement and structural weight are minimum and certain side constraints are satisfied. The cross sections of the beams and columns are considered as design variables. Kriging model is used to build the approximate mapping relationship between the forced vibration amplitude and design variables, reducing expensive dynamic reanalysis. Two engineering examples are carried out successfully on the platform using the proposed method, and the computing efficiency is compared with different number grid nodes. In comparison with the result of sequential linear programming, the optimization results show that the method has good accuracy. The speed-up is also analyzed when the nodes with different number are used showing that the method has very high efficiency for grid computing, and the grid platform constructed is suitable for the engineering optimization design.
A Parallel Algorithm for Fresnel Tomography
Zhang Jianzhong, Yang Guohui, Lin Wen, and Cai Jun
2007, 44(10):  1661-1666. 
Asbtract ( 425 )   HTML ( 0)   PDF (373KB) ( 394 )  
Related Articles | Metrics
In contrast with ray-based traveltime tomography, Fresnel tomography accounts for the band-limited nature of seismic waves and gives the higher resolution tomograms. Because Fresnel tomography demands much computer memory and much running time, a parallel algorithm for it is proposed. The tomographic inversion is transformed to resolving respectively a series of single equation in light of backprojection principle, each equation corresponding to a Fresnel zone. The forward and inverse computation concerning a Fresnel zone is allocated to one process and is independent of other processes. Then the storage and calculation of the large-scale matrix in the tomography are avoided. No message delivers between the slave processes, and only a little of data delivers between a master process and the slave ones. By using the portable message passing interface standard (MPI) for the communication, the computing code of the algorithm is implemented on Linux system, which allows to distribute the work on several PCs connected via standard Ethernet in an in-house network, and greatly expands the applicability of Fresnel tomography. The tests on the synthetic and observed seismic travel time data show that this parallel algorithm has a good performance on Linux PCs.
A Study of Parallel DVM-DAC Algorithm in Multiscale Physics
Lin Jiao, Wang Shanying, and Wang Chongyu
2007, 44(10):  1667-1672. 
Asbtract ( 402 )   HTML ( 0)   PDF (368KB) ( 412 )  
Related Articles | Metrics
The phenomenon of multiscale and the corresponding theoretical model is an important scientific problem in the complex system. It is difficult for the traditional theory to describe the nature of spanning scale and multilevel problem. Being an efficient model for large scale system, the DVM-DAC algorithm has the O(n) linear scaling character originated from using the divide-and-conquer theory. However, the solving of eigenvalue equations brings serious computing bottleneck. In this paper, a parallel DVM-DAC method is presented to solve this problem. The performed results demonstrate that the parallel DVM-DAC algorithm which has succeeded in processing the carbon nanotube system with 10\+4 atoms, and has good scalability. It can be predicted that the parallel DVM-DAC method is a important tool for the study of multiscale physics.
Transparent Shader-Based Direct3D9 Application Parallelization
Liu Zhen, Shi Jiaoying, Xiong Hua, and Peng Haoyu
2007, 44(10):  1673-1681. 
Asbtract ( 532 )   HTML ( 0)   PDF (524KB) ( 392 )  
Related Articles | Metrics
Vertex shader and pixel shader are the newest programmable units of graphics processing unit. According to their architecture and Direct3D9 application execution flow on single machine, transparent shader-based Direct3D9 application running on graphics cluster parallelization strategy is proposed. Graphics cluster can be divided into rendering resource distributing node and rendering resource rendering node. Among them, distributing node is responsible for converting Direct3D9 application to six kinds of rendering resource, including command stream, vertex shader, pixel shader, vertex stream, index stream and texture stream. Rendering node is responsible for reconstructing Direct3D9 rendering command based on the description information and resource data of rendering resource. All of the rendering nodes reserve the entire rendering resource and distribute rendering task through computing the bounding box of multi-stream based scene data in the screen space. Experimental results show that this strategy can realize transparent shader-based Direct3D9 application parallelization. In contrast to single machine rendering, parallel graphics rendering based on graphics cluster can not only promote rendering performance but also achieve higher rendering speedup.
A Network Parallel Computing Scheme for Alternative Splicing of Biology Genome
Xu Guoshi, Lu Fakai, Xu Zhuoqun, Yu Huashan, and Ding Wenkui
2007, 44(10):  1682-1687. 
Asbtract ( 334 )   HTML ( 0)   PDF (304KB) ( 495 )  
Related Articles | Metrics
Alternative splicing is a major mechanism for adjusting gene expression and generating protein diversity, which has important biological significance. With the rapid increase of biological data, the single computer can hardly meet the requirement for massive computing power of alternative splicing research works. In such context, a network parallel computing scheme for alternative splicing problems is presented. With careful consideration of challenges, a service-oriented network resource virtualization mechanism is designed, which provides uniform selection, access and monitoring interfaces to network resources. Furthermore, a suite of API provides user-oriented application layer support, which hides the details of accessing network resources, and support quick and efficient application development.
A Parallel SOR Algorithm for Linear Systems on SMP
Hu Changjun, Wei Shuo, Zhang Jilin, and Wang Jue
2007, 44(10):  1688-1693. 
Asbtract ( 502 )   HTML ( 1)   PDF (366KB) ( 483 )  
Related Articles | Metrics
Reservoir simulation is an important area in high performance computing. SOR (successive over relaxation) iterative solution technique is used to solve the pressure equations in simulation. Parallel implementation of iterative solution technique is important for decreasing simulation execution time and improving simulation precision. Current published approaches are mostly restricted to implement parallel algorithms based on data partition in one iteration step. However, these approaches do not consider partitioning and merging iteration space, which makes algorithm have a very low efficiency. Considering the characteristics of SOR and SMP (symmetric multi-processors) system, using OpenMP as an implementation tool of parallel program, a parallel SOR algorithm is proposed. The parallel algorithm makes data in different area execute a number of iteration steps in parallel by rearranging points updating sequence. A new strategy of data partition and merging iteration steps during solving pressure equations in reservoir simulation is discussed. The new algorithm enables the locality of a program to be improved by iteration reordering, and the synchronization times to be decreased by merging iteration steps. Shape of each mesh block in different iteration step is presented. Compared with the current parallel implementation of SOR, this algorithm can decrease cost of synchronization and improve data locality. The experimental results on speedup and efficiency are also presented.
Research and Implementation of Parallel Program Profiler Based on System-Sampling
Chen Yongran, Dou Wenhua, Qian Yue, and Qi Xingyun
2007, 44(10):  1694-1701. 
Asbtract ( 409 )   HTML ( 0)   PDF (536KB) ( 379 )  
Related Articles | Metrics
Analyzing the performance of program is the basis of understanding the behaviors of programs. It is important to find the performance bottlenecks and learn the utilization of hardware and software resource. In the performance evaluation of parallel computing system, the performance characters of the application cannot be analyzed completely because of the restriction of the time and the space. An effective method is analyzing the performance characters of part of the application codes, and regarding them as the performance characters of the whole applications. In this paper, a method of analyzing the performance of the programs based on sampling is put forward. Furthermore, a profiler named SamplePro based on sample theory is developed. Compared with other methods under the condition of the same sample error, this technique can effectively reduce the needed instruction numbers, which means only 1%~3% instructions are used to achieve less than 3% analyzing error.
A Parallel Data Processing Middleware Based on Clusters
Wang Nianbin, Song Yibo, Yao Nianmin, and Liu Daxin
2007, 44(10):  1702-1708. 
Asbtract ( 345 )   HTML ( 0)   PDF (385KB) ( 459 )  
Related Articles | Metrics
Urgent performance requirements in large scale database applications have led to the use of parallelism for database processing. In this situation, providing a method to manipulate data paralleling can greatly improve the efficiency of data process. The HPDPM system is a middleware system applied in shared nothing cluster architecture to allow the database system to take advantage of the performance of parallel shared nothing systems. Presenting a new method to use parallel data processing middleware instead of parallel database system provides the ability for high performance computing. A framework is given for realizing parallel data manipulation. The primary modules of the middleware are described. Key techniques, used to improve system performance, including data placement and semantic caching, are discussed in detail. Then, the work principles and work steps of the middleware are presented. Implementation and experiments of this study show that this approach can improve system performance efficiently, enhance system availability and maintainability, and gain high performance price ratio. At present, the middleware system which uses data placement strategies and semantic caching have been applied to some large national engineering projects whose capacity of data is a little more than 1000 gigabytes.
Implementation and Evaluation of MPI Checkpointing System over Lustre File System
Xie Min, Lu Yutong, Zhou Enqiang, Cao Hongjia, and Yang Xuejun
2007, 44(10):  1709-1716. 
Asbtract ( 450 )   HTML ( 0)   PDF (371KB) ( 456 )  
Related Articles | Metrics
As one of the most important fault-tolerant techniques, coordinated checkpoint based rollback-recovery has been adopted in large scale parallel computer systems. Coordinating protocol and checkpoint image storage are two major factors that affect the overhead of parallel checkpointing systems. A novel application-transparent parallel checkpointing system implemented in MPICH2 is proposed. Compared with the existing techniques, the advantages of this system are summarized as follows: 1) Utilize the feature of near-neighbor communication in applications and virtual connection method to reduce the number of internal messages exchanged in coordinating stage, and hence to reduce the latency of protocol processing; 2) Store checkpoint images using Lustre file system to simplify the checkpoint files management; and 3) Implement parallel I/O in image storage stage to improve the system performance. Experiments suggest that the approach proposed results in low runtime overhead and enhances system scalability.
Workflow-Based User Environment for High Performance Computing
Tu Bibo, Hong Xuehai, Zhan Jianfeng, and Fan Jianping
2007, 44(10):  1717-1723. 
Asbtract ( 443 )   HTML ( 1)   PDF (386KB) ( 378 )  
Related Articles | Metrics
In high performance computing environment in many fields, to finish a task requires not only computing service, but also data service and all kinds of device services, which results in different relation dependences and is very complex. However, job management systems such as PBS, LSF, and Condor and so on, only support time dependence among jobs, and are short of fault-tolerance and complete flow control. Workflow-based user environment for high performance computing is successfully applied in petroleum geophysics, which not only benefits construction and control of task flow, but also enhances all kinds of relation dependences and flow semantics, especially for the workflow engine based on event model with checkpoint function. It provides good reliability. Thus, workflow-based user environment for high performance computing can be flexibly suitable for other different user environments.
Efficient Parallel Blocked Algorithms for Generalized Hermitian Eigenproblem
Zhao Yonghua, Chi Xuebin, and Cheng Qiang
2007, 44(10):  1724-1732. 
Asbtract ( 374 )   HTML ( 0)   PDF (563KB) ( 318 )  
Related Articles | Metrics
The performance of a generalized eigenproblem solver relies on many factors, which include selected parallel algorithms and matrix mapping strategy. A new parallelization is presented, which combines the Cholesky into the transformation from generalized to standard form. By reducing the communication cost and extending the parallelism, the new algorithm can obviously improve the performance and scalability of the original algorithm. Moreover, an efficient parallel algorithm is proposed to compute a triangular AX=B with multiple right hand sides. From the tests using the parallel software PSEPS, the speed of the parallel algorithm is about two times that of the classical parallel algorithms, and it has better performance and scalability than the classical parallel algorithms.
Optimization of B-NIDS for Multicore
Sun Xiaojuan, Sun Ninghui, and Chen Mingyu,
2007, 44(10):  1733-1740. 
Asbtract ( 391 )   HTML ( 0)   PDF (488KB) ( 324 )  
Related Articles | Metrics
With the rapid increase of network bandwidth and the growing variety of Internet applications, the backbone network intrusion detection systems (B-NIDS) meet the great requirements of delivering higher performance and enhancing effectiveness according to different features of network streams. Computing is entering a new phase in which CPU improvements are driven by the addition of multiple cores on a single chip, rather than higher frequencies. Parallel processing on these systems is in a primitive stage, and the parallelization of a sequential B-NIDS requires the explicit use and knowledge of underlying thread architecture. In this paper the bottleneck of the thread synchronization using fine-grained lock operations is discovered, and the new synchronization mechanism with no contention for shared structures is proposed based on the characteristics of data flow. Then a pipelining programming model of multithreading system with three contexts is issued, and the differential service for streams is implemented with the multiple weighed queues. In performance evaluation, the optimized system shows much better performance in three aspects of resource utilization, throughput, and response time on 8 core server. The improved system with the proposed synchronization mechanism shows good scalability. The processing capability on tested server can exceed over 1Gbps traffic flow. Also the multiple weighed queues for service quality introduce little latency, and a kind of probe-based sampling test shows that the response times of prioritized streams are shorter than those of non-prioritized.
Dependent Task Scheduling in Grid Based on T-RAG Optimization Selection
Chen Tingwei, Zhang Bin, and Hao Xianwen
2007, 44(10):  1741-1750. 
Asbtract ( 506 )   HTML ( 1)   PDF (554KB) ( 528 )  
Related Articles | Metrics
Efficient task scheduling is critical for grid application to achieve high performance. In grid computing, an application is decomposed into a set of dependent tasks. In the grid environment where resources have different capability and resources are interconnected over the world, the dependence among tasks affects the scheduling strategy greatly. The general task scheduling problem includes the problem of assigning the tasks of an application to suitable resource and the problem of ordering task executions on each resource. In this paper a task-resource assignment graph (T-RAG) is used to represent a potential resource assignment plan. And a model of task scheduling based on optimization selection of T-RAG is proposed, which maps the dependent task scheduling problem into a graph optimization problem. As a result of this optimization selection, the optimal graph is obtained and such optimal graph is the optimal scheduling plan which determines the resource assignment plan and the execution order of tasks. Finally, the task scheduling algorithm based on the proposed scheduling model is implemented. Compared with the ILHA algorithm in the simulation environment, the proposed algorithm shows better performance in the situation of a large body of data transported among tasks and significant differences in resources capability and network bandwidth.
A Detection Algorithm of Link Non-Correlated Multi-Paths in Wireless Mesh Networks
Ding Xuyang, Fan Mingyu, and Luo Huiqiong
2007, 44(10):  1751-1756. 
Asbtract ( 415 )   HTML ( 0)   PDF (501KB) ( 463 )  
Related Articles | Metrics
Wireless mesh networks are emerging as a key technology for next generation wireless networking. Because of their advantages over other wireless networks, wireless mesh networks are undergoing rapid progress and inspiring numerous applications. In order to provide better QoS for wireless mesh networks, it is very important to use multiple non-correlated paths to improve the performance of networks. However, the currently routing protocols of wireless mesh networks don't support the search of link non-correlated multi-paths. On the basis of analyzing the flaws of the DSR routing protocol, a novel detection algorithm EDSR (enhanced DSR) is proposed to solve this problem. The EDSR uses the routing buffers of nodes to search and find link non-correlated multi-paths among the source and destination nodes. And it enhances the throughput of networks by using these multiple link non-correlated paths. Compared with the DSR routing protocol, the multi-paths search processes of the EDSR will result in descending of the performance of networks at the beginning, but the performance of networks will be improved by the EDSR after the search processes. The simulation results show that the EDSR can acquire the link non-correlated multi-paths with lower cost and improve the performance of networks effectively.
A Text Digital Watermarking Technology Based on Graph Theory
Liu Dong, Sun Ming, and Zhou Mingtian
2007, 44(10):  1757-1764. 
Asbtract ( 603 )   HTML ( 0)   PDF (458KB) ( 529 )  
Related Articles | Metrics
Because of text's properties, it's so difficult to find perfect algorithms of embedding watermark into text documents, when compared with other media. Nowadays, text watermarking technologies is far behind other multimedia watermarking technologies such as images, audios, etc. A novel text digital watermarking technology based on graph theory is presented in this paper. Through changing its topology structures of a character or string, different figures are schemed out, which represent the same semanteme. Then those figures are mapped on graphs of graph theory, and the graphs or their properties should be correctly encoded in order to denote different watermarking information. The mathematic models of embedding and detecting watermarking are also described, and the experimental methods of robustness and visual influence, as well as the relative results are given. Finally, general rules of removing attack on text watermarking system is proposed, and the attack-resisting capability and methods of this text digital watermarking is analyzed in detail. According to the experimentation and analysis, the text watermarking technology has the following advantages: great capacity of watermarking, strong robustness, little visual influence and quite good attack-resisting capability. With this technology, it is more convenient to embed watermark into characters for ideograph languages such as Chinese and Korean, while it is suitable for embedding watermark into strings for alphabetic languages such as English and French.
Description-Logic-Based Temporal ER Model with Attribute Dependencies
Jiang Yuncheng, Tang Yong, Wang Ju, and Ji Gaofeng
2007, 44(10):  1765-1773. 
Asbtract ( 396 )   HTML ( 0)   PDF (648KB) ( 458 )  
Related Articles | Metrics
ER model with attribute dependencies may be translated into description logic ALCQI(D) knowledge bases, and the reasoning on ER model with attribute dependencies may be reduced to model reasoning on ALCQI(D) knowledge bases. The current research progresses and the existing problems of description logic for data bases, especially the relationship between the description logic and the temporal ER model, are analyzed. The formal definition of the temporal ER model with attribute dependencies εR\-\{VTAD\} is presented based on the work of Artale. Aiming at the characteristics and requirement of the temporal ER model with attribute dependencies εR\-\{VTAD\}, a kind of new description logic, i.e., the temporal description logic ALCQI(D)\-\{US\}, is presented. The temporal description logic ALCQI(D)\-\{US\} is the temporal extension of description logic ALCQI(D) through temporal logic, and the syntax and semantics of ALCQI(D)\-\{US\} are given. The ALCQI(D)\-\{US\}-based temporal ER model with attribute dependencies is presented, how to translate temporal ER model with attribute dependencies εR\-\{VTAD\} into temporal description logic ALCQI(D)\-\{US\} knowledge bases is studied, and the reasoning problem of satisfiability, redundancy, subsumption relationship, and implication relationship of temporal ER model with attribute dependencies εR\-\{VTAD\} may reason automatically through reasoning mechanism of temporal description logic ALCQI(D)\-\{US\}, and the correctness of these reasoning problems is proved.
pgi-distance: An Efficient Method Supporting Parallel KNN-join Process
He Honghui, Wang Lizhen, and Zhou Lihua
2007, 44(10):  1774-1781. 
Asbtract ( 723 )   HTML ( 1)   PDF (555KB) ( 672 )  
Related Articles | Metrics
The KNN-join has extensive applications where high-dimensional data is involved, because it is powerful and can provide more meaningful results than range-join. With its set-a-time nature, the KNN-join is able to facilitate many data mining tasks such as classification, outlier detection and clustering. MuX and Goreder are two algorithms specially designed for KNN-join. Since the MuX makes use of R-tree to reduce CPU cost, it suffers like all of the algorithms based on R-tree. In order to optimize both I/O and CPU cost, the Goreder applies a grid on data space. With the difference of data distribution, however, it is difficult to determine the cell's appropriate granularity. In the consequence, the Goreder tends to perform poorly. In this paper, a novel parallel method—pgi-distance (parallel grid index-distance)—is proposed, which combines the advantages of MuX and Goreder. On the one hand, the pgi-distance utilizes a two-tier structure to optimize I/O and CPU time separately. On the other hand, the index based on distance makes the pgi-distance adapted well to data dimensionality and data distribution. Especially the index in the pgi-distance is B\++-tree, which is supported by all commercial DBMS. So, this method becomes more applicable than that based on R-tree. Extensive experiments on both synthetic datasets and real datasets are conducted, and the results illustrate that pgi-distance is efficient.
Model Analysis of Supply Chain System Based on Color Stochastic Petri Net
Tang Da and Li Ye
2007, 44(10):  1782-1789. 
Asbtract ( 485 )   HTML ( 0)   PDF (515KB) ( 334 )  
Related Articles | Metrics
The supply chain of a virtual enterprise is dynamic alliance formed from different types of enterprises in order to attain some objective. It is task oriented, has uncertainty and goes after high efficiency. So that it is required that analyzed model of the supply chain system of virtual enterprise should have more flexible structure to fit frequent changes in generic situation. A model of the unified supply chain system based on color and stochastic Petri net is presented. Any supply chain system is described abstractly as a color stochastic Petri net that has the same structure in the model, and a certain supply chain is determined by the initial mark and the route time delay function in the model. Making use of the process simulation of production and transportation, operational status of the supply chain system is analyzed with given structure and speed of supply goods. The model is decomposed further, and the performance of the supply chain model is calculated by using isomorphic Markov chain. Complexity of calculation is avoided that is due to status explosion. An example through out the paper illustrates how the supply chain system is affected by its structure and speed according to analysis qualitatively and quantitatively.
Improvement and Implementation of a Polynomial Time Approximation Scheme for Euclidean Traveling Salesman Problem
Zhao Weizhong, Feng Haodi, and Zhu Daming
2007, 44(10):  1790-1795. 
Asbtract ( 568 )   HTML ( 0)   PDF (320KB) ( 464 )  
Related Articles | Metrics
In the traveling salesman problem(“TSP” for short), given n nodes and for each pair {i,j} of distinct nodes, a distance d\-\{i,j\}, it desires a closed path that visits each node exactly once and incurs the least cost, which is the sum of the distances along the path. Traveling salesman problem is a NP-hard combinatorial optimization problem, and one of hot topics in computer algorithm field. The classic problem has proved a rich testing ground for most important algorithmic ideas during the past few decades, and influenced the emergence of fields such as polyhedral theory and complexity theory. However, exact optimization for TSP is NP-hard. So is approximating the optimum within any constant factor. TSP instances arising in practice are usually quite special, so the hardness result may not necessarily apply to them. In Euclidean TSP, the nodes lie in R\+\{2\}and distance is defined using the l\-\{2\} norm. However, even Euclidean TSP is NP-hard. In 1996, S. Arora gave the first polynomial time approximation scheme (“PTAS”, for short) for this problem. An improvement on this scheme is proposed by reducing the constant in the so-called “patching lemma” from 6 to 3. At the same time, an implementation of the improved version is given.
A Multi-Commodity Flow Linear Programming with Polynomial Constraints for Black and White Traveling Salesman Problem
Jiang He, Zhang Xianchao, Che Haoyang, and Chen Guoliang
2007, 44(10):  1796-1800. 
Asbtract ( 591 )   HTML ( 0)   PDF (292KB) ( 494 )  
Related Articles | Metrics
The black and white salesman problem (BWTSP) is a new NP-hard problem, which can be divided into the undirected BWTSP and the directed BWTSP according to the symmetry of edges in graph. As a basis of complete algorithm design for NP-hard problems, the number of constraints in linear programming (LP) remarkably influences the complexity of algorithm design. The existing Ghiani LP for the undirected BWTSP contains an exponential number of constraints. For a special case of the direct BWTSP whose L=+∞, its LP with polynomial constraints can be obtained by being transformed into RATSP. However, there exists no LP for the general directed BWTSP. A new LP with polynomial constraints for the directed BWTSP is proposed. Firstly, the directed BWTSP is reduced to ATSP whenever Q=L=+∞. Then, based on the Finke-Claus-Gunn LP with n(n+4) constraints of ATSP, the n\+2+2|W| cardinality constraints can be acquired by defining residual and used cardinality commodity flow. Finally, the new n\+2+n+2|B| weight constraints are obtained by similar ways. In this way, the new LP with 3n\+2+7n constraints only can be eventually acquired. Since the undirected BWTSP and the directed BWTSP with L=+∞ are both special cases of the directed BWTSP, the results can be applicable to them too.
An Operational Semantics for UML Activity Diagrams
Wang Cong and Wang Zhixue
2007, 44(10):  1801-1807. 
Asbtract ( 477 )   HTML ( 0)   PDF (438KB) ( 501 )  
Related Articles | Metrics
More and more large systems are taking UML (unified model language) as requirements description language for system analysis and design, especially in those safety-critical systems. One of the most important dynamic behaviors specifying the mechanism of UML is the UML activity diagram, which is widely used for modeling of business. The UML activity diagram lacks strictly defined formal dynamic semantics. It is difficult using the UML activity diagram to make formal analysis, verification and assertation. An operational semantics for UML activity diagram is proposed according to UML1.5 specification documents. First, the formal syntax of UML activity diagram is provided. Then, the configuration and the transitions are defined in detail. Finally, the deduct rules of UML activity diagram are described based on the LTS. The main contributions are: Firstly, in order to get more precise and clearer description of formal semantics, the hierarchy of activity state is abandoned and instead the concept of state bag is introduced to define the configuration. Secondly, the LTS semantics and the detailed deduction rules of UML activity diagram are defined. The global state transition diagram can be computed from the LTS semantics conveniently. Therefore, the formal semantics is able to deal with most of the features of UML activity diagram and can be used in formal verification.