ISSN 1000-1239 CN 11-1777/TP

Table of Content

01 April 2019, Volume 56 Issue 4
A Survey on Architecture Research of Novel Non-Volatile Memory Based on Dynamical Trade-Off
Zhang Mingzhe, Zhang Fa, Liu Zhiyong
2019, 56(4):  677-691.  doi:10.7544/issn1000-1239.2019.20170985
Asbtract ( 545 )   HTML ( 8)   PDF (3273KB) ( 315 )  
Related Articles | Metrics
As a promising alternative candidate for DRAM, non-volatile memory (NVM) technique gains increasing interests from both industry and academia. Currently, the main problems that limit the wide utilization of NVM include considerable long latency for write operation, high energy consumption for write operation and limited write endurance. To solve these problems, the traditional solutions are based on computer architecture methods, such as adding extra level or scheduling scheme. Unfortunately, these solutions often suffer from unavoidable high soft/hardware overheads and can hardly optimize the architecture for more than one target at the same time. In recent years, as the improvement of research on non-volatile materials, several dynamical trade-offs lies in the materials are introduced, which also provides new opportunity for computer architecture research. Based on these trade-offs, several novel NVM architectures have been proposed. Compared with the traditional solutions, these proposed architectures have a series of advantages, such as low hardware overhead and the ability of optimizing for multi-targets. In this survey, we firstly introduce the existing problems of NVM and the traditional solutions. Then, we present three important dynamical trade-offs of NVM. After that, we introduce the newly proposed architectures based on these trade-offs. Finally, we make the conclusion for this kind of research work and point out some potential opportunities.
Paxos-like Consensus Algorithms: A Review
Wang Jiang, Zhang Mingxing, Wu Yongwei, Chen Kang, Zheng Weimin
2019, 56(4):  692-707.  doi:10.7544/issn1000-1239.2019.20170973
Asbtract ( 696 )   HTML ( 16)   PDF (3735KB) ( 467 )  
Related Articles | Metrics
With the rapid growth of data volume and Web services, the cluster size is getting bigger and bigger in datacenters. The probability of service interruption grows dramatically due to machine and network failures. How to achieve a fault-tolerant distributed system becomes very important. State machine replication is one of the most general methods for building a fault-tolerant system, and distributed consensus problem is one of the most basic and core issues in replicated state machine systems. Paxos and a series of Paxos-like consensus algorithms can effectively solve this problem. In recent years, more and more systems use consensus-related techniques to ensure their reliability and availability, and studies on distributed consensus algorithms are also emerging in an endless stream. These consensus algorithms can be divided into two categories, leader-based consensus algorithms and leaderless consensus algorithms. With the development of network technologies such as remote direct memory access(RDMA) and hardware technologies such as field-programmable gate array(FPGA), some consensus algorithms combining with new network technologies and hardware technologies have appeared, which are used to improve the performance of distributed systems. In this paper, we introduce Paxos series algorithms from the perspective of the development of distributed consensus algorithms, discuss the advantages and disadvantages of the algorithms in different scenarios, and further give a future outlook on the research and application directions.
A Dynamic and Static Combined Register Mapping Method in Binary Translation
Wang Jun, Pang Jianmin, Fu Liguo, Yue Feng, Shan Zheng, Zhang Jiahao
2019, 56(4):  708-718.  doi:10.7544/issn1000-1239.2019.20170905
Asbtract ( 372 )   HTML ( 2)   PDF (3281KB) ( 207 )  
Related Articles | Metrics
To reduce the redundant memory access caused by unnecessary registers overflow in binary translation, as the registers mapping in binary translation ignores the difference of register requirements among basic blocks and loop blocks, an efficient dynamic and static combined registers mapping optimization algorithm based on priority is proposed, introduces the idea of allocating global register statically and allocating local register dynamically. Firstly, global register is mapped statically to reduce the global register overflow cost and maintenance overhead, according to statistical features of different registers used on the source platform and the life cycle of variable. Then, the number of registers requested by intermediate instruction can be obtained, based on the intermediate representation. Therefore, the priority of registers allocation is determined. Lastly, dynamically allocate the registers in order to reduce the number of registers overflow, to reduce the expansion rate of the generated local code and memory access times. Thus, the performance of the target program is improved. The test results of NBENCH, representative recursive programs and SPEC2006 show that, the algorithm effectively reduces the memory access of local code, and improves the program performance with an average increase of 8.56%, 8.14%, and 8.01%, respectively.
A Simple and Efficient Cache Coherence Protocol Based on Self-Updating
He Ximing, Ma Sheng, Huang Libo, Chen Wei, Wang Zhiying
2019, 56(4):  719-729.  doi:10.7544/issn1000-1239.2019.20170898
Asbtract ( 433 )   HTML ( 6)   PDF (3017KB) ( 208 )  
Related Articles | Metrics
As the number of cores in a chip multiprocessor increases, cache coherence protocols have become a performance bottleneck of the share-memory system. The overhead and complexity of current cache coherence protocols seriously restrict the development of the share-memory system. Specifically, directory protocols need high storage overhead to keep track of sharer list and snooping protocols consume significant network bandwidth to broadcast messages. Some coherence protocols, such as MESI (modified exclusive shared or invalid) protocol, are extremely complex and have numerous transient states and data race. This paper implements a simple and efficient cache coherence protocol named VISU (valid/invalid states based on self-updating) for data-race-free programs. VISU is based on a self-updating mechanism and only includes two stable states (valid and invalid). Furthermore, the VISU protocol eliminates the directory and indirection transactions and reduces significant overheads. First, we propose self-updating shared blocks at synchronization points for correction with the data-race-free guarantee of parallel programming. Second, taking advantage of techniques that dynamically classify private data (only accessed by one processor) and shared data, we propose write-back for private data and write-through for shared data. For private data, a simple write-back policy can reduce the unnecessary on-chip network traffic. In L1 cache, a write-through policy for shared data which can keep the newest shared data in LLC, would obviate almost all coherence states. Our approach implements a truly cost-less two-state coherence protocol. The VISU protocol does not require directory or indirect transfer and is easier to verify while at the same time obtains similar even better performance of MESI directory protocol.
Dynamic Binary Instrumentation Based on QEMU
Zou Wei, Gao Feng, Yan Yunqiang
2019, 56(4):  730-741.  doi:10.7544/issn1000-1239.2019.20180166
Asbtract ( 1176 )   HTML ( 22)   PDF (5571KB) ( 381 )  
Related Articles | Metrics
Software instrumentation is a basic technology of software dynamic analysis, such as program optimization, debugging, testing, fault location and so on. The dynamic binary instrumenta-tion technology, because of its non-invasive, which does not need to modify the source code to compile, and does not need to reassemble the binary program, will not cause the expansion of the object code, and is widely used in software dynamic analysis, especially in resource constrained, low power consumption, high real-time embedded field, so dynamic binary instrumentation is the very key technology. However, the existing binary instrumentation tool can only be applied to user mode software, and the embedded whole system software also needs a corresponding binary instrumentation tool. In order to solve this problem, this paper based on the dynamic binary translation open source instruction set simulator QEMU(quick emulator), breaks through run time statistics collection on the basic blocks, and eliminates interrupt’s adverse effects of control flow analysis in the embedded the system software, and achieves the implementation of instrumentation on the intermediate code level to the embedded system software code, full completion of the embedded system software running control flow tracking, and the development of log information processing tool. Experiments show that the method proposed in this paper can accomplish call graph, function profile, coverage, control flow analysis and so on, which can solve the problem of dynamic binary analysis of embedded system software.
An Efficient Feedback Static Binary Translator for Solving Indirect Branch
Wang Jun, Pang Jianmin, Fu Liguo, Yue Feng, Zhang Jiahao
2019, 56(4):  742-754.  doi:10.7544/issn1000-1239.2019.20170412
Asbtract ( 348 )   HTML ( 4)   PDF (3531KB) ( 233 )  
Related Articles | Metrics
In order to solve the problem of indirect branch efficiently in static binary translation, a feedback static binary translation method is proposed, with two-level address mapping table to realize the fast mapping of indirect branch target address. This method can solve the problem of less code optimization and more redundant code in existing linear traversal translation. Firstly, the two-level address mapping table is used to address the code location quickly, using array address to store the target platform code block address in the order of the source platform base block start address and using array index to save the index position of the basic block start address in array address. Then, the monitoring feedback mechanism is added to the target executable program to carry on the code discovery, and the uncertain indirect branch target address would be returned so that the source code can be divided to new basic blocks and re-translated. The feedback static binary translation framework FD-QEMU is implemented based on QEMU(quick emulator), an open source binary translator. As the experimental results on SPEC2006 and NBENCH show, compared with QEMU, the speedup ratio of FD-SQEMU (feedback static QEMU) is 3.97 and 6.94 times on average; compared with SQEMU, a static translator with all instructions’ address mapping originally proposed by our group, the average acceleration ratio of FD-SQEMU is 1.18 times, and the maximum speedup ratio is 1.36 times, which verifies the effectiveness of the framework and method proposed in this paper.
Research of SSD Array Architecture Based on Workload Awareness
Zhang Qiang, Liang Jie, Xu Yinlong, Li Yongkun
2019, 56(4):  755-766.  doi:10.7544/issn1000-1239.2019.20170832
Asbtract ( 396 )   HTML ( 9)   PDF (8347KB) ( 239 )  
Related Articles | Metrics
The fixed data layout of traditional array system and the locality of workloads cause the partial disks of the array system to become hot disks, which affects the reliability and the overall concurrency performance of the array system. This paper proposes a new RAID architecture for SSD array systems, HA-RAID, which leverages hot/cold data separation and sliding window techniques. The main idea is that HA-RAID divides the disk array into hot disks and ordinary disks, which stores hot data on hot disks and cold data on ordinary disks, and it changes the role of each disk dynamically by moving a fixed-length sliding window. So, each disk has the opportunity to become a hot disk and stores hot data which achieves the purpose of storing hot data evenly on each disk. Experiments under real-world workloads on a RAID-0 array system composed of eight commercial SSDs show that HA-RAID can achieve an even distribution of hot data across all disks and reduce the percentage of hot disks appearing in the array to almost zero. This implies that HA-RAID achieves load balance and wear balance at the device level. In terms of performance, HA-RAID reduces the average response time by 12.01%~41.06% which achieves the I/O performance enhancement, compared with traditional RAID-0 array.
An Efficient Failure Reconstruction Based on In-Network Computing for Erasure-Coded Storage Systems
Tang Yingjie, Wang Fang, Xie Yanwen
2019, 56(4):  767-778.  doi:10.7544/issn1000-1239.2019.20170834
Asbtract ( 415 )   HTML ( 1)   PDF (3184KB) ( 262 )  
Related Articles | Metrics
Nowadays, the scale of distributed storage systems is getting increasingly larger. No matter whether the storage devices are disks or solid-state drives, the system is always faced with the risk of data loss. Traditional storage systems maintain three copies of each data block to ensure high reliability. Today, a number of distributed storage systems are increasingly shifting to the use of erasure codes because they can offer higher reliability and lower storage overhead. The erasure codes, however, have an obvious shortcoming in the reconstruction of an unavailable block, because they need to read multiple disks, which results in a large amount of network traffic and disk operations and ultimately high recovery overhead. In this paper, INP (in-network pipeline), an effective failure reconstruction scheme based on in-network computing that utilizes SDN (software defined networking) technology is presented in order to reduce the overhead of recovery without sacrificing any other performance. We use the global topology information for network from SDN controller to establish the tree of reconstruction, and transmit data according to it. The switches do part of the calculation that can reduce the network traffic, therefore to eliminate the bottleneck of the network, and to enhance the recovery performance. We evaluate the recovery efficiency of INP in different network bandwidths. Compared with the common erasure code system, it greatly reduces the network traffic and in a certain bandwidth, the degraded read time is the same as that of normal reading.
Virtual Machine Resources Allocation Methods Based on History Data
Wang Haitao, Li Zhanhuai, Zhang Xiao, Bu Hailong, Kong Lanxin, Zhao Xiaonan
2019, 56(4):  779-789.  doi:10.7544/issn1000-1239.2019.20170831
Asbtract ( 465 )   HTML ( 5)   PDF (1711KB) ( 242 )  
Related Articles | Metrics
Virtualization technology is widely used in cloud datacenters to realize on-demand resources allocation so as to lower operating costs. Moreover, the technology can also improve the flexibility and scalability of datacenters. Despite various merits, these features of virtualization technology also introduce an issue about how to allocate the virtual machines to make the best of physical resources while reducing the resource collision rate in the meantime. To this end, this paper proposes two resource allocation methods for virtual machines based on statistical analysis of history data. Combined with commonly-used placement strategies, these two methods are more effective compared with some state-of-art virtual machine resource allocation methods. In addition, existing independent indicators are incomplete to reflect the overall effectiveness of allocation methods. In order to solve the issue, this paper also proposes an integrated effectiveness indicator which combines different indicators from three separate aspects including the number of consumed physical machines, resource utilization and resource collision of physical machines to evaluate the effectiveness of allocation schemes. In the end, through tests of realistic cloud computing overhead, we prove that the proposed allocation methods of virtual machines are superior to common methods, and the integrated effectiveness indicator can reasonably evaluate the overall effectiveness of virtual machine allocation schemes.
An Overlap Store Optimization for Large-Scale Parallel Earth Science Application
Chen Jingkun Du Yunfei
2019, 56(4):  790-797.  doi:10.7544/issn1000-1239.2019.20170906
Asbtract ( 280 )   HTML ( 2)   PDF (2634KB) ( 144 )  
Related Articles | Metrics
Weather forecast, atmosphere or ocean simulations have much output data during the iterative computation for the intermediate status or check point. However, an unreasonable output design limits the performance of the earth science application in large-scale parallel computation. In this paper, we propose an overlap store optimization to solve this problem. The key issue of this overlap store optimization is setting some I/O processes to hide the I/O cost. This optimization has three main advantages: first, we hide the I/O operation through the overlap of output and computing; second, we limit the cost of gather operation, break though the bottleneck of gather communication bandwidth and memory size; third, the I/O process is flexible to use different high-performance parallel I/O API. We use this method to optimize WRF, ROMS_AGRIF and GRAPES in Tianhe II super computer, and test their performance after the optimization. The result of the tests shows that we obtain about 30% to 900% improvement in the peak. We also discuss the best proportion of computer process and I/O process when the total number of processes is fixed. The optimized version is very easy to used, and the only cost is the scientists need to setup two more variables in the namelist.
An Efficient Processing In Memory Framework Based on Skyrmion Material
Liu Bicheng, Gu Haifeng, Chen Mingsong, Gu Shouzhen, Chen Wenjie
2019, 56(4):  798-809.  doi:10.7544/issn1000-1239.2019.20180157
Asbtract ( 628 )   HTML ( 2)   PDF (3997KB) ( 347 )  
Related Articles | Metrics
As a new computing paradigm, processing in memory (PIM) allows the parallel computation in both processors and memories, which drastically reduce the movements between computation units and storage units. Therefore, PIM can be considered as an efficient technology to somewhat address the shortcomings of the von neumann architecture. Compared with traditional random access memories, racetrack memory has many merits including high density, non-volatility, and low static power. Therefore, it can be used for efficient PIM computing. To address the shortages of domain-wall based PIM, this paper proposes a novel PIM framework based on the Skyrmion material. In this framework, we use Skyrmion-based racetrack memories to construct storage units, and use Skyrmion-based logic gates to compose both adders and multipliers for the computation units. Since our framework does not need CMOS (complementary metal oxide semiconductor) circuits to assist the underlying computation unit construction, the design complexity is significantly reduced. Meanwhile, based on our proposed optimization methods for read and write operations at the circuit layer and address mapping mode of the memory at the system level, the performance of our framework is drastically improved. Experimental results show that compared with domain-wall based PIM framework, our approach can achieve 48.1% time improvement and 42.9% energy savings on average.
Energy Modeling and Plan Evaluation for Queries in Relational Databases
Guo Binglei, Yu Jiong, Yang Dexian, Liao Bin
2019, 56(4):  810-824.  doi:10.7544/issn1000-1239.2019.20180138
Asbtract ( 411 )   HTML ( 6)   PDF (5989KB) ( 226 )  
Related Articles | Metrics
In relational database systems, the original policy model of the query optimizer ignores energy consumption and only concentrates on improving performance when selecting execution plans for queries. As a consequence, this kind of plan selection strategy will limit the energy-saving penitential of future database systems. Firstly, an energy model for query plans is proposed based on the resource consumption characteristics of queries (i.e., CPU instructions, disk block reads, and memory block reads). The energy model can predict energy cost for plans before query execution and hence laid a foundation for the optimizer to select energy-efficient plans in the decision-making phase. Secondly, to enable the optimizer to regulate the weight of power and performance in the total cost of each query plan, a query-plan evaluation model is proposed. According to a specific requirement of users, the evaluation model can change the optimization goal (performance, power, and energy) of each query and select the best execution plan for a certain query. Experimental results show that the average prediction accuracy of the energy model is 95.68%, the power savings range from 8.95% to 29.25% for the optimization goal of power, and the energy savings range from 3.62% to 11.34% for the optimization goal of energy.
Research and Optimization of Fast Convolution Algorithm Winograd on Intel Platform
Wu Zheng, An Hong, Jin Xu, Chi Mengxian, Lü Guofeng, Wen Ke, Zhou Xin
2019, 56(4):  825-835.  doi:10.7544/issn1000-1239.2019.20170932
Asbtract ( 846 )   HTML ( 6)   PDF (3492KB) ( 483 )  
Related Articles | Metrics
With the rapid development of deep learning, it’s applied extensively for many fields, such as speech processing, image recognition, natural language understanding and so on, bringing great changes for scientific research and daily life. Intel which follows the trend of deep learning launched the second generation of Xeon Phi processor Intel KNL(knights landing), and released the third generation Intel KNM (knights mill), which brings new impetus and vitality for the prosperous development of deep learning. This paper mainly contributes to promoting perfect Intel MKL (math kernel library) DNN (deep neural network), and develops deep learning on Intel platform, according to research and optimization for the fast convolution algorithm Winograd. Combined with characteristics of Intel latest deep learning platform, such as AVX512, high-speed memory MCDRAM, various memory/SNC modes, two-dimensional grid-type cores structure and so on, this work aims to design and optimize the implementation of Winograd algorithm by analyzing memory allocation, data dependency, etc. Finally, on one hand, the typical CNN (convolutional neural network) model VGG19 is used to test and compare performance with Intel MKL convolution, achieving more than doubled acceleration of performance. On the other hand, the common different types of convolutions are used to test and compare performance with Intel MKL DNN and NVIDIA cuDNN, verifying applicability and objective use value about Winograd. The purpose of the paper is to provide important guiding significance for development of Intel platform in the field of deep learning.
Partition Order Product Space: Partition Based Granular Computing Model
Xu Yi, Yao Yiyu
2019, 56(4):  836-843.  doi:10.7544/issn1000-1239.2019.20180325
Asbtract ( 310 )   HTML ( 0)   PDF (1070KB) ( 129 )  
Related Articles | Metrics
Granular computing solves complex problem based on granular structure. The existing studies on the granulation methods in granular structures mainly focus on multilevel granulation methods and multiview granulation methods respectively, without combining multilevel granulation methods and multiview granulation methods. Granular structure based on multilevel granulation methods is composed of a linearly ordered family of levels, which only provides one view with multiple levels. Granular structure based on multiview granulation methods provides multiple views, but each view only consists of one level. In order to understand and describe problem in a more comprehensive way, and then solve the problem more effectively and reasonably, given a universe, we take partition as the granulation method. Combining multilevel granulation methods with multiview granulation methods, we propose partition order product space. Firstly, using a partition on the universe to define a level. Secondly, using a nested sequence of partitions to define a hierarchy, which represents a view with linearly ordered relation. Finally, given a number of views determining a number of linearly ordered relations, based on the product of multiple linearly ordered relations, we propose partition order product space, which gives a granular computing model based on partition. Example demonstrates the superiority of partition order product space in real application.
A Construction Method of Triadic Concepts
Wang Xia, Jiang Shan, Li Junyu, Wu Weizh
2019, 56(4):  844-853.  doi:10.7544/issn1000-1239.2019.20180315
Asbtract ( 326 )   HTML ( 0)   PDF (1607KB) ( 135 )  
Related Articles | Metrics
Triadic concept analysis is a new approach for data analysis and information processing. One of the important problems of triadic concept analysis is the construction of triadic concepts. Firstly, properties of a triadic concept in which one of its extent, intent and modus is an empty set are studied, and the judgment method of this kind of triadic concept is also obtained. Secondly, another kind of special triadic concept, called object-conditional triadic concept, is researched in detail. An operation is defined on the set of conditional triadic concepts, and a construction method of triadic concepts is then proposed based on object-conditional triadic concepts using the operation. It is shown that every triadic concept can be generated by some object-conditional triadic concepts if its extent and modus are both non-empty sets. Thirdly, a triadic context is decomposed into a series of dyadic contexts according to its conditions. The relationship between object-conditional triadic concepts of the triadic context and object dyadic concepts of those decomposed dyadic contexts is then studied. Furthermore, steps are given to generate all triadic concepts from object-conditional triadic concepts. Finally, the detailed process of constructing the triadic concepts is demonstrated by an example, and the triadic diagram is given to describe all generated triadic concepts more clearly.
Weighted Lattice Based Recurrent Neural Networks for Sentence Semantic Representation Modeling
Zhang Xiangwen, Lu Ziyao, Yang Jing, Lin Qian, Lu Yu, Wang Hongji, Su Jinsong
2019, 56(4):  854-865.  doi:10.7544/issn1000-1239.2019.20170917
Asbtract ( 571 )   HTML ( 4)   PDF (2028KB) ( 259 )  
Related Articles | Metrics
Currently, recurrent neural networks (RNNs) have been widely used in semantic representation modeling of text sequences in natural language processing. For those languages without natural word delimiters (e.g., Chinese), RNNs generally take the segmented word sequence as input. However, sub-optimal segmentation granularity and segmentation errors may affect sentence semantic modeling negatively, as well as subsequent natural language processing tasks. To address these issues, the proposed weighted word lattice based RNNs take the weighted word lattice as input and produce current state at each time step by integrating arbitrarily many input vectors and the corresponding previous hidden states. Weighted word lattice expresses a compressed data structure that contains exponential word segmentation results. To a certain extent, the weighted word lattice reflects the consistency of different word segmentation results. Specifically, lattice weights are further exploited as a supervised regularizer to refine weights modeling of the semantic composition operation in this model, leading to better sentence semantic representation learning. Compared with traditional RNNs, the proposed model not only alleviates the negative impact of segmentation errors but also is more expressive and flexible to sentence representation learning. Experimental results on sentiment classification and question classification tasks demonstrate the superiority of the proposed model.
Uroad:An Efficient Method for Large-Scale Many to Many Ride Sharing Matching
Cao Bin, Hong Feng, Wang Kai, Xu Jinting, Zhao Liwei, Fan Jing
2019, 56(4):  866-883.  doi:10.7544/issn1000-1239.2019.20180035
Asbtract ( 1276 )   HTML ( 35)   PDF (3449KB) ( 372 )  
Related Articles | Metrics
Due to the congested road and the increasing cost of private car, more and more people are willing to choose carpooling for travel. Although there are a lot of algorithms for ride sharing research, there is no algorithm to consider this problem from a global view. Plan all the matching routes from the global view and make all drivers’ detour distances minimize, which not only can reduce air pollution, but also can ease the traffic pressure. This paper presents an efficient large-scale matching method for many to many ride sharing algorithm, called Uroad, to make up for the shortcomings of existing algorithms. Uroad allows the rider to request a ride service which includes the periods of departure time when he gets ready to start off and the maximum cost of the ride-sharing he is willing to pay for this service. At the same time, Uroad allows the driver to set the departure time to indicate when he will set out and the arrival time that he must reach to his destination before. The same to the other Carpooling algorithms, Uroad calculates the fare based on the distance of rider’s trip and the detour caused by the rider. According to the requirements of riders and drivers, Uroad supports the global optimal matching among multiple riders and drivers, matches a driver who can meet the requirements to each rider as far as possible, and minimizes the total detour of all the drivers to reach the goal of environmental protection and reducing the traffic pressure. Uroad uses a series of time pruning techniques and Euclidean distance pruning techniques to reduce the calculation of the shortest path which can make the overall algorithm more quick and efficient. The experiments show that it is less than 2 minutes for Uroad to find the optimal combination of ride-sharing for 1 000 riders in 100 000 drivers, which is 40% shorter than the direct calculation of the shortest path. Compared with the random selection of drivers, the total detours of all drivers can be reduced by about 60% with the global optimization strategy.
Online Service Reputation Measurement Method Based on Kendall tau Distance
Zheng Susu, Fu Xiaodong, Yue Kun, Liu Li, Liu Lijun, Feng Yong
2019, 56(4):  884-894.  doi:10.7544/issn1000-1239.2019.20180034
Asbtract ( 398 )   HTML ( 4)   PDF (2569KB) ( 192 )  
Related Articles | Metrics
Due to the inconsistent user preferences and the inconsistent rating criteria, the ratings given by different users to one service are actually incomparable, and the reputation mechanism based on assumption of the consistent rating criteria cannot guarantee the comparability among different service reputations, which will result in unobjective outcome when the reputations are used to choose services. To improve the objectivity of online services reputation measurement under the circumstance referred above, this paper presents a method of online service reputation measurement based on Kendall tau distance. Firstly, a distance metric is defined to measure the consistency between the two rating vectors. Secondly, the measurement of online service reputation is modeled as an optimization problem to find a reputation vector that minimizes the Kendall tau distance between the reputation vector and the user-service rating matrix. Finally, simulated annealing algorithm is used to solve the optimization problem and the reputation vector is served as a service reputation. The rationality and effectiveness of the method have been verified by experimental study. The experiments show that the method can meet the preferences of most users, so that users can make right services choice decision, and ensure the efficiency while improving the manipulation resistance ability of the reputation measurement method.
Optimized Mutation Testing Techniques for WS-BPEL Programs
Sun Chang’ai, Wang Zhen, Pan Lin
2019, 56(4):  895-905.  doi:10.7544/issn1000-1239.2019.20180037
Asbtract ( 492 )   HTML ( 1)   PDF (1750KB) ( 216 )  
Related Articles | Metrics
Business process execution language for Web service (WS-BPEL) is an executable XML-based and process-oriented service composition language. Due to unique features of Web services, such as dynamics, loose coupling, and open deployment and execution environment, it is an important issue how to assure the quality of WS-BPEL programs. Although mutation testing has a strong fault detection capability, it fails to be widely practiced due to the large number of mutants, the long execution period, and the high computation cost. In order to improve the practicability of mutation testing, we investigate how to decrease the cost of mutation testing for WS-BPEL programs, and propose two kinds of optimization techniques from the perspectives of second-order mutation and prioritization of operators. We also develop an integrated tool named μBPEL to support the mutant generation, optimization, and execution of the proposed optimization techniques. Finally, an empirical study has been conducted where six representative WS-BPEL programs are used to validate and evaluate the effectiveness of the proposed optimized mutation testing techniques. Experimental results show that the proposed optimization techniques for WS-BPEL programs are able to reduce the number of mutants without significantly jeopardizing their fault detection effectiveness and thus improve the efficiency of mutation testing.