2017 Vol. 54 No. 1
2017, 54(1): 1-19.
DOI: 10.7544/issn1000-1239.2017.20151076
Abstract:
Proteins are large molecules consisting of a linear sequence of amino acids. In the natural environment, a protein spontaneously folds into a specific tertiary structure to perform its biological function. The main factors that drive proteins to fold are interactions between residues, including hydrophobic interactions, van der Waals forces and electrostatic interactions. These interactions usually lead to residue-residue contacts, so the prediction of residue-residue contacts should greatly facilitate the understanding of protein structures and functions. A great variety of techniques have been proposed for residue-residue contact prediction, including machine learning, statistical models, and linear programming. It should be pointed out that most of these techniques are based on the biological insight of co-evolution, i.e., during the evolutionary history of proteins, a residue's mutation usually leads its contacting partner to mutate accordingly. In this review, we summarize the state-of-the-art algorithms in this field with emphasis on the construction of statistical models based on biological insights. We also present the evaluation of these algorithms using CASP (critical assessment of techniques for protein structure prediction) targets as well as popular benchmark datasets, and describe the trends in the field of protein contact prediction.
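As a hedged illustration of a co-evolution-based statistical score (a classic baseline, not necessarily one of the specific models covered in the review), the mutual information between two columns of a multiple sequence alignment quantifies how strongly mutations at positions i and j co-vary:

```latex
% Mutual information between alignment columns i and j,
% where f_i(a) is the frequency of amino acid a in column i
% and f_{ij}(a,b) is the joint frequency of the pair (a,b).
MI(i,j) = \sum_{a,b} f_{ij}(a,b)\,\ln\frac{f_{ij}(a,b)}{f_i(a)\,f_j(b)}
```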
2017, 54(1): 20-33.
DOI: 10.7544/issn1000-1239.2017.20160299
Abstract:
The current Internet architecture based on TCP/IP is facing many unprecedented challenges, including scalability, security, mobility and controllability. New clean-slate architecture designs are expected to address these challenges and provide better evolvability. As such, they have been attracting great attention in recent years. Before deployment in production networks, future Internet architectures, protocols and algorithms should be comprehensively validated, evaluated and optimized in large-scale and realistic testbeds. Testbeds for network innovation should closely approximate real networks and provide more flexibility. These requirements make it critical to study the architectures and key technologies of testbeds. Although there has been long-standing interest in network testbeds, a comprehensive survey is still missing. In this paper, we first analyze the requirements of network innovation testbeds based on four common problems of network experiments, namely cost, feasibility, credibility and controllability. We then summarize the technical challenges of testbed design and development from the perspectives of network virtualization and programmability, federated management and control of resources, as well as monitoring and measurement of infrastructures and experiments. In particular, we summarize the state-of-the-art technologies and architectures that aim at addressing the aforementioned challenges. Finally, we present some representative testbeds, and discuss future trends and open questions.
2017, 54(1): 34-49.
DOI: 10.7544/issn1000-1239.2017.20150729
Abstract:
The availability of multi-source data in the big data era can potentially lead to a revolution in ride sharing, which has been widely studied in academia as a means of reducing the number of cars, congestion, and pollution by sharing empty seats, but has lacked popularity in practice due to the inflexibility of off-line booking, limited resources, and so on. In the big data era, dynamic ride sharing, powered by mobile computation, location-based services, and social networks, has emerged and gained popularity recently by providing real-time and flexible ride sharing services through real-time travel planning systems. These systems raise research opportunities as well as challenges, including how to process real-time location data and traffic data, how to match ride requests and cars in real time, and how to provide fair, secure, and low-priced services to attract more participants. This paper defines the problem of dynamic ride sharing formally and discusses its variants and recent developments. The filter-and-refine framework for the real-time challenge of matching requests and cars is then discussed. In particular, in the filter step, we introduce pre-computation, spatio-temporal indexing, grouping, and parallelization. In the refine step, we introduce lazy calculation strategies, new index tree structures, and evolutionary computation. We also discuss techniques related to social factors such as pricing strategies, credit systems, human-computer interaction, and security in the big data era. Finally, this paper ends with a panoramic summary and a discussion of possible future research directions.
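To make the filter-and-refine idea concrete, the sketch below (an illustrative assumption, not the actual system from the paper) filters candidate cars with a coarse spatial grid and then refines them with a simple detour-cost check; the grid cell size, distance proxy, and detour threshold are hypothetical parameters.

```python
from collections import defaultdict

CELL = 0.01          # hypothetical grid cell size (degrees)
MAX_DETOUR = 2.0     # hypothetical detour threshold (km)

def cell(lat, lon):
    return (int(lat / CELL), int(lon / CELL))

def build_index(cars):
    """Filter step: bucket cars into a coarse spatial grid."""
    index = defaultdict(list)
    for car in cars:                      # car = dict(id=..., lat=..., lon=...)
        index[cell(car["lat"], car["lon"])].append(car)
    return index

def manhattan_km(a, b):
    # crude distance proxy for the sketch; a real system would use road distance
    return 111.0 * (abs(a[0] - b[0]) + abs(a[1] - b[1]))

def match(request, index):
    """Refine step: among nearby cars, keep those whose detour is acceptable."""
    r, c = cell(request["lat"], request["lon"])
    candidates = []
    for dr in (-1, 0, 1):                 # scan the 3x3 neighbourhood of cells
        for dc in (-1, 0, 1):
            candidates.extend(index.get((r + dr, c + dc), []))
    pickup = (request["lat"], request["lon"])
    ranked = [(manhattan_km((car["lat"], car["lon"]), pickup), car)
              for car in candidates]
    return [car for detour, car in sorted(ranked, key=lambda t: t[0])
            if detour <= MAX_DETOUR]
```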
2017, 54(1): 50-62.
DOI: 10.7544/issn1000-1239.2017.20150843
Abstract:
Taxonomy matching, i.e., an operation of taxonomy merging across different knowledge bases that aims to align common elements between taxonomies, has been extensively studied in recent years due to its wide applications in knowledge base population and proliferation. However, with the continuous development of network big data, taxonomies are becoming larger and more complex, and cover different domains. Therefore, devising an effective and general matching strategy that covers cross-domain or large-scale taxonomies is still a considerable challenge. In this paper, we present a composite-structure-based matching method, named BiMWM, which exploits the composite structure information of classes in a taxonomy, including not only the micro-structure but also the macro-structure. BiMWM models the taxonomy matching problem as an optimization problem on a bipartite graph. It works in two stages: it first creates a weighted bipartite graph to model the candidate matched class pairs between two taxonomies, and then performs a maximum weight matching algorithm to generate an optimal matching for the two taxonomies in a global manner. BiMWM runs in polynomial time. Experimental results show that our method outperforms the state-of-the-art baseline methods and shows good adaptability across different domains and scales of taxonomies.
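The global matching stage can be illustrated with a standard maximum-weight bipartite matching. The sketch below uses SciPy's assignment solver on an assumed class-pair similarity matrix and a hypothetical acceptance threshold; it is only a schematic stand-in for BiMWM's own weighting and graph construction.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_taxonomies(similarity, threshold=0.5):
    """similarity[i, j]: assumed composite-structure similarity between
    class i of taxonomy A and class j of taxonomy B (values in [0, 1])."""
    rows, cols = linear_sum_assignment(similarity, maximize=True)
    # keep only pairs whose weight is high enough to count as a match
    return [(i, j) for i, j in zip(rows, cols) if similarity[i, j] >= threshold]

# usage: a toy 3x4 similarity matrix
sim = np.array([[0.9, 0.1, 0.2, 0.0],
                [0.2, 0.8, 0.3, 0.1],
                [0.0, 0.2, 0.1, 0.7]])
print(match_taxonomies(sim))   # -> [(0, 0), (1, 1), (2, 3)]
```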
2017, 54(1): 63-70.
DOI: 10.7544/issn1000-1239.2017.20150796
Abstract:
Partial label learning is a recently proposed weakly-supervised machine learning framework. Since it loosens the requirements on the training data set, i.e., a learning model can be obtained even when each training example is only associated with a candidate set containing the ground-truth label, the partial label learning framework can be used to handle many real-world tasks more conveniently. The ambiguity in the training data inevitably makes the partial label learning problem more difficult to address than the traditional classification problem, and only a few algorithms for small-scale training sets are available at present. Based on the ECOC technique and the variational Gaussian process model, this paper presents a fast kernel-based partial label learning algorithm which can deal with large-scale problems effectively. The basic strategy is to first convert the original training data set into several standard two-class data sets using the ECOC technique, and then to build a binary classifier with lower computational complexity on each two-class data set using the variational Gaussian process model. The experimental results show that the proposed algorithm achieves almost the same accuracy as the existing state-of-the-art kernel-based partial label learning algorithms while using less computing time. More specifically, the proposed algorithm can handle problems with millions of samples within 40 minutes on a personal computer.
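One plausible reading of the ECOC decomposition step is sketched below: for each column of a random coding matrix, a training example is kept as positive (negative) only if its whole candidate label set falls in the positive (negative) half, and is skipped for that column otherwise. The coding matrix and the filtering rule here are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

def ecoc_decompose(candidate_sets, n_labels, n_columns, rng=None):
    """Turn a partial-label dataset into binary datasets, one per ECOC column.

    candidate_sets: list of sets, candidate labels of each training example.
    Returns the coding matrix and a list of (indices, binary_targets) pairs.
    """
    rng = np.random.default_rng(rng)
    # random {-1, +1} coding matrix: rows = labels, columns = dichotomies
    coding = rng.choice([-1, 1], size=(n_labels, n_columns))
    tasks = []
    for c in range(n_columns):
        pos = {l for l in range(n_labels) if coding[l, c] == 1}
        idx, y = [], []
        for i, cand in enumerate(candidate_sets):
            if cand <= pos:                # all candidates on the positive side
                idx.append(i); y.append(+1)
            elif cand.isdisjoint(pos):     # all candidates on the negative side
                idx.append(i); y.append(-1)
            # otherwise the example is ambiguous for this column and is skipped
        if idx:
            tasks.append((np.array(idx), np.array(y)))
    return coding, tasks
```

A binary classifier (here, the variational Gaussian process model mentioned in the abstract) would then be trained on each returned (indices, targets) pair, and test predictions decoded against the coding matrix.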
Abstract:
Object recognition over massive network media data has become a research hotspot. To address this problem, an object recognition strategy is proposed that handles massive network media data flows with a heterogeneous multimodal structure while utilizing temporal coherence. Firstly, based on the co-existence of video and audio in network media data, a heterogeneous multimodal structure is constructed that incorporates a convolutional neural network (CNN) and a restricted Boltzmann machine (RBM): the audio information is processed by the restricted Boltzmann machine and the video information is processed by the convolutional neural network. This heterogeneous multimodal structure exploits the merits of different deep neural networks. After that, a shared characteristic representation is generated using canonical correlation analysis (CCA). Then the temporal coherence of video frames is utilized to further improve recognition accuracy. Three kinds of experiments are adopted to validate the effectiveness of the proposed strategy. The first type of experiment compares the proposed strategy with single-modal algorithms. The second type of experiment illustrates the results on a composite database. Finally, videos from real websites are used to compare the proposed strategy with other algorithms. These experiments demonstrate the effectiveness of the proposed heterogeneous multimodal strategy.
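The shared-representation step can be illustrated with scikit-learn's CCA: given CNN features for video frames and RBM features for the aligned audio, CCA projects both modalities into a common subspace. The feature dimensions, number of components, and concatenation-based fusion below are assumptions, not the paper's exact settings.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

# assumed per-sample features: 512-d CNN video features, 128-d RBM audio features
rng = np.random.default_rng(0)
X_video = rng.standard_normal((1000, 512))
X_audio = rng.standard_normal((1000, 128))

cca = CCA(n_components=32)                 # 32-d shared space (assumed)
cca.fit(X_video, X_audio)
Zv, Za = cca.transform(X_video, X_audio)   # projections into the shared space

# a simple fused representation for the downstream recognizer
Z_shared = np.concatenate([Zv, Za], axis=1)
print(Z_shared.shape)                      # (1000, 64)
```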
2017, 54(1): 80-93.
DOI: 10.7544/issn1000-1239.2017.20150492
Abstract:
As large-scale data collection and processing have been widely studied in recent years, several released big data processing platforms are increasingly playing important roles in the operations of many Internet businesses. Open source ecosystems, the engine of big data innovation, have been evolving so rapidly that a number of them have been successfully adopted as components of mainstream data processing platforms. In reality, however, open source software is still far from perfect when dealing with real large-scale data. On the basis of the industrial practice at Xiaomi Inc., this paper proposes an improved platform for collecting and processing large-scale data in the face of varied business requirements. We focus on problems concerning the functionality, consistency and availability of the software in the data collecting, storing and processing procedures. In addition, we propose a series of optimizations aimed at load balancing, failover, data compression and multi-dimensional scheduling to significantly improve the efficiency of the current system. All the designs and optimizations described in this paper have been implemented and deployed to support various Internet services provided by Xiaomi Inc.
2017, 54(1): 94-110.
DOI: 10.7544/issn1000-1239.2017.20150804
Abstract:
Mining regular patterns is an emerging area. To the best of our knowledge, no method has been proposed to mine maximal regular patterns over data streams. In this paper, the problem of mining maximal regular patterns based on the landmark window over a data stream is studied for the first time. In order to resolve the issue that the naïve algorithm for this problem does not support incremental computation, the DSMRM-BLW (data stream maximal regular patterns mining based on boundary landmark window) algorithm is proposed. It takes the first window as the boundary landmark window and handles it with the naïve algorithm. For all other windows, it obtains the maximal regular patterns incrementally from those of the adjacent previous window, overcoming the drawback of the naïve algorithm. Extensive experiments reveal that the DSMRM-BLW algorithm is effective in mining maximal regular patterns based on the landmark window over data streams, and outperforms the naïve algorithm in execution time and space consumption.
2017, 54(1): 111-122.
DOI: 10.7544/issn1000-1239.2017.20150784
Abstract:
With the advancement of mobile computing technology and the widespread use of GPS-enabled mobile devices, research on semantic trajectories has attracted a lot of attention in recent years, and semantic trajectory pattern mining is one of its most important issues. Most existing methods discover the similar behavior of moving objects through the analysis of sequences of stops. However, these methods do not consider the duration of staying at a stop, which affects the accuracy of distinguishing different behavior patterns. In order to solve this problem, this paper proposes a novel approach for discovering common behavior using staying duration on semantic trajectories (DSTra), which can easily differentiate trajectory patterns. DSTra can be used to detect groups that have similar lifestyles, habits or behavior patterns. The semantic trajectory patterns of each moving object are mined first. Then, a time-weight-based pattern similarity measurement is designed. After that, a hierarchical clustering method with a pruning strategy is proposed, where each cluster represents common behavior patterns of moving objects. Finally, experiments on both real-world and synthetic datasets demonstrate the effectiveness, precision and efficiency of DSTra.
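A hedged sketch of the clustering stage: with an assumed time-weighted similarity between the pattern sets of two moving objects, average-linkage hierarchical clustering groups objects with common behavior. The duration-weighted Jaccard-style similarity and the cut threshold here are illustrative assumptions, not DSTra's exact measure or pruning strategy.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def time_weighted_similarity(p1, p2):
    """p1, p2: dicts mapping a semantic pattern (e.g. a tuple of stop labels)
    to its total staying duration for one moving object."""
    common = set(p1) & set(p2)
    if not common:
        return 0.0
    num = sum(min(p1[p], p2[p]) for p in common)
    den = sum(p1.values()) + sum(p2.values()) - num
    return num / den

def cluster_objects(patterns, threshold=0.6):
    """patterns: list of per-object pattern->duration dicts.
    Returns a cluster label per moving object."""
    n = len(patterns)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = 1.0 - time_weighted_similarity(patterns[i], patterns[j])
            dist[i, j] = dist[j, i] = d
    Z = linkage(squareform(dist), method="average")
    return fcluster(Z, t=threshold, criterion="distance")
```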
2017, 54(1): 123-133.
DOI: 10.7544/issn1000-1239.2017.20160102
Abstract:
With the development of the Internet and cloud computing, more and more applications are migrating from local hosts to the cloud. In the cloud computing environment, these applications are ultimately deployed to run in data centers on shared computer infrastructure. Owing to the complexity and variability of the applications running in a data center, some fixed-function hardware components in traditional computer architectures, such as the last-level cache, memory controller and I/O controller, cannot meet the requirements of deploying these applications together in one data center. To adapt to these dynamic requirements, programmable hardware is needed at the computer architecture level, so that the functions of computer hardware can adapt to application requirements. A programmable data plane design for computer architecture is presented, which brings programmability to hardware components by integrating programmable processors into state-of-the-art hardware components and letting these processors handle hardware requests through firmware code. The functions of hardware components can then be extended by updating the firmware running on the processors. An FPGA prototype has been implemented. Evaluation results show that the programmable data plane design brings flexible programmability to hardware at reasonable resource cost, without introducing much overhead to the original system performance. This makes it possible for computer hardware to adapt to the dynamic requirements of applications running in data centers.
2017, 54(1): 134-141.
DOI: 10.7544/issn1000-1239.2017.20150674
Abstract:
IEEE 802.15.3c is the international unified standard for high-rate wireless personal area networks (high-rate WPANs) supporting high-data-rate applications such as high-definition streaming content downloads and home theater, which requires completing a 512-point FFT in only 222.2 ns at a sampling rate of 2.592 GHz. To satisfy this demand, some FFT processors adopt parallel PEs and a reformulated radix-2^4 FFT algorithm, which reduces the required number of butterfly stages. When parallel PEs are employed, only a memory system that supports parallel access to the operating data by these PEs, together with normal-order I/O, can exploit the full advantage of the parallel PEs. According to the access pattern of four reformulated radix-2^4 FFT PEs, this paper designs an address transformation method supporting four reformulated radix-2^4 PEs. The method also supports normal-order I/O, which removes the difficulty caused by bit-reversal reordering of input or output data and yields a high-throughput design. The implementation of a single address transformation unit is simple, requiring only three two-input XOR gates and one three-input XOR gate. Under the same synthesis conditions, this method saves 47% area and 24% power compared with the previous method, while supporting continuous-flow and in-place operation.
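XOR-based address transformations of this kind typically fold groups of address bits together with XOR so that the PEs accessing data in parallel never hit the same memory bank in one cycle. The sketch below shows the general idea for a 512-point FFT spread over 4 banks; the bit grouping, strides, and offset scheme are assumptions for illustration, not the paper's gate-level mapping.

```python
def bank_and_offset(addr, n_banks=4):
    """Generic XOR-folded bank mapping for a parallel FFT memory (illustrative).

    The 2-bit bank index is the XOR of successive 2-bit fields of the 9-bit
    address, which spreads both unit-stride and large-stride accesses across
    the 4 banks; the offset is the row inside the selected bank.
    """
    bank_bits = n_banks.bit_length() - 1          # 2 bits for 4 banks
    bank, a = 0, addr
    while a:
        bank ^= a & (n_banks - 1)                 # XOR in the next 2-bit field
        a >>= bank_bits
    return bank, addr // n_banks

# usage: hypothetical groups of 4 addresses accessed in parallel by 4 PEs
for base in (0, 1, 64):
    group = [base + k * 128 for k in range(4)]    # stride-128 access pattern
    print(group, [bank_and_offset(a)[0] for a in group])  # 4 distinct banks
```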
2017, 54(1): 142-153.
DOI: 10.7544/issn1000-1239.2017.20150644
Abstract:
This paper proposes a novel clustered page-level flash translation layer (CPFTL) algorithm based on a classification strategy. Firstly, CPFTL divides RAM into a hot cached mapping table (H-CMT), a cold cached mapping table (C-CMT) and a sequential cached mapping table (S-CMT), which are responsible for buffering the map entries of requests with high temporal locality, low temporal locality and high spatial locality, respectively. Secondly, in order to benefit from the spatial locality of sequential requests, CPFTL prefetches multiple sequential map entries into the S-CMT, and thus improves the response time of sequential requests. Finally, in order to reduce the read and write overhead of translation pages, CPFTL clusters the map entries belonging to the same translation page in the C-CMT together and manages these clusters with an LRU (least recently used) strategy. When the C-CMT is full, CPFTL chooses an appropriate cluster to evict to flash according to the number of map entries in each cluster and the LRU order. CPFTL has been extensively evaluated under various realistic workloads. Compared with state-of-the-art FTL schemes such as the classic DFTL and the latest SDFTL, our benchmark results show that CPFTL improves the cache hit ratio and reduces translation-page operation counts, response time and erase counts.
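A minimal sketch of the clustered C-CMT idea, assuming plain Python dictionaries: cached map entries are grouped by their translation-page number, clusters are kept in LRU order, and eviction removes a whole cluster at once so that one translation-page update covers many entries. Names, the entries-per-page constant, and the eviction rule (pure LRU, without the cluster-size criterion from the abstract) are illustrative, not CPFTL's exact policy.

```python
from collections import OrderedDict

ENTRIES_PER_TRANS_PAGE = 512      # assumed map entries per translation page

class ClusteredCMT:
    """Cold cached mapping table with translation-page clustering (sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity                  # max cached map entries
        self.size = 0
        self.clusters = OrderedDict()             # trans_page -> {lpn: ppn}

    def lookup(self, lpn):
        tp = lpn // ENTRIES_PER_TRANS_PAGE
        cluster = self.clusters.get(tp)
        if cluster is None or lpn not in cluster:
            return None                           # miss: caller loads from flash
        self.clusters.move_to_end(tp)             # refresh LRU position
        return cluster[lpn]

    def insert(self, lpn, ppn):
        tp = lpn // ENTRIES_PER_TRANS_PAGE
        cluster = self.clusters.setdefault(tp, {})
        if lpn not in cluster:
            self.size += 1
        cluster[lpn] = ppn
        self.clusters.move_to_end(tp)
        while self.size > self.capacity:
            self._evict()

    def _evict(self):
        # evict the least recently used cluster: a single translation-page
        # write-back would flush all of its dirty entries together
        tp, cluster = self.clusters.popitem(last=False)
        self.size -= len(cluster)
```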
2017, 54(1): 154-162.
DOI: 10.7544/issn1000-1239.2017.20150903
Abstract:
Racetrack memory (RM) is a competitive emerging non-volatile memory technology for future memory designs. It achieves ultra-high storage density by integrating multiple bits into a tape-like nanowire (called a racetrack) and provides fast access speed. In order to access the required bits in RM, a unique shift operation is introduced. However, it has been observed that the shift operation requires higher current than read and write operations and causes a significant amount of energy dissipation, which degrades reliability and performance or even destroys RM cells. There is still no analytical thermal model to estimate the run-time temperature of RM. More importantly, corresponding architecture-level management schemes are needed to avoid thermal emergencies that violate the peak-temperature constraint. In this work, we first propose a thermal model to explore the relationship between temperature and design parameters. In addition, to improve thermal reliability, we propose a quota-based shift management scheme that keeps the intensity of shift operations under a specific threshold. Experiments show that the temperature increase is limited to 20℃ with only 3.5% performance degradation.
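A hedged sketch of a quota-based shift throttle: each racetrack group gets a shift quota per control epoch, and requests beyond the quota are stalled to the next epoch, bounding the shift intensity that drives heating. The epoch length and quota values are assumptions, and the paper's actual scheme may differ in how the quota is set and enforced.

```python
class ShiftQuotaController:
    """Per-epoch shift quota for one racetrack group (illustrative sketch)."""

    def __init__(self, quota_per_epoch=1000, epoch_cycles=100_000):
        self.quota = quota_per_epoch      # max shifts allowed per epoch (assumed)
        self.epoch = epoch_cycles         # epoch length in cycles (assumed)
        self.used = 0
        self.epoch_start = 0

    def allow_shift(self, now_cycle, n_shifts):
        # roll over to a new epoch once its window has elapsed
        if now_cycle - self.epoch_start >= self.epoch:
            self.epoch_start = now_cycle
            self.used = 0
        if self.used + n_shifts <= self.quota:
            self.used += n_shifts
            return True                   # proceed with the access
        return False                      # stall: retry in the next epoch
```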
2017, 54(1): 163-171.
DOI: 10.7544/issn1000-1239.2017.20150937
Abstract:
The globalization trend of IC design and manufacture raises serious concerns about hardware security, since a hardware Trojan can be inserted in each phase of design and manufacture. From the defender's perspective, a hardware Trojan in the host circuit may stay inactive most of the time but will result in disastrous consequences once activated, such as information leakage, false output, or system crash. As far as an attacker is concerned, one of the important design criteria is to prevent the Trojan circuit from being accidentally activated. It is believed that inactive nets with lower switching probabilities in the circuit are more likely to be selected as the trigger signals of Trojan circuits, so finding these inactive nets is one of the existing countermeasures. However, current techniques only focus on finding nets that are inactive in the test mode of the circuit. This paper proposes a method that can also find nets that are inactive in the function mode of the circuit under test. From an attacker's point of view, nets that are inactive in both test mode and function mode are the best candidates for Trojan triggers, since they give the lowest probability of accidental activation of the Trojan circuit in both modes. Hence, for a defender, focusing on these nets can improve the efficiency of Trojan detection significantly.
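A common way to rank candidate trigger nets is to propagate signal probabilities through the netlist (assuming independent inputs) and take the toggle probability 2p(1-p) of each net. The gate rules below are the standard ones and serve only to illustrate how low-activity nets can be found; they are not the specific estimation used in the paper.

```python
def signal_probability(gate, input_probs):
    """Probability that a gate output is 1, assuming independent inputs."""
    if gate == "AND":
        p = 1.0
        for q in input_probs:
            p *= q
        return p
    if gate == "OR":
        p = 1.0
        for q in input_probs:
            p *= (1.0 - q)
        return 1.0 - p
    if gate == "NOT":
        return 1.0 - input_probs[0]
    if gate == "XOR":                      # two-input XOR
        a, b = input_probs
        return a * (1 - b) + b * (1 - a)
    raise ValueError(gate)

def switching_probability(p_one):
    """Toggle probability of a net whose signal probability is p_one."""
    return 2.0 * p_one * (1.0 - p_one)

# example: the output of a deep AND tree is almost always 0 and rarely toggles,
# which is exactly the kind of inactive net an attacker would pick as a trigger
p = signal_probability("AND", [0.5] * 8)       # = 1/256
print(switching_probability(p))                # ~ 0.0078
```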
2017, 54(1): 172-183.
DOI: 10.7544/issn1000-1239.2017.20150900
Abstract:
As an important issue in cloud storage security, data integrity checking has attracted a lot of attention from academia and industry. In order to verify data integrity in the cloud, researchers have proposed many public audit schemes for data integrity. However, most of the existing schemes are inefficient and waste considerable computing resources because they adopt fixed parameters for auditing all files. In other words, they do not consider the issue of coordinately auditing large-scale collections of files. In order to improve the audit efficiency of the system, we propose self-adaptive provable data possession (SA-PDP), which uses a self-adaptive algorithm to adjust the audit tasks for different files and manages the tasks by audit queues. Through a quantitative analysis of the audit requirements of files, it dynamically adjusts the audit plans, which guarantees dynamic matching between the audit requirements of files and the execution strength of audit plans. In order to enhance the flexibility of updating audit plans, SA-PDP designs two different audit plan update algorithms with different initiators. The active update algorithm ensures that the audit system has a high coverage rate, while the lazy update algorithm enables the audit system to meet the audit requirements of files in a timely manner. Experimental results show that SA-PDP reduces the total audit time by more than 50% compared with the traditional method, and effectively increases the number of audited files in the audit system. Compared with the traditional audit method, SA-PDP improves the standard-reaching rate of audit plans by more than 30%.
2017, 54(1): 184-191.
DOI: 10.7544/issn1000-1239.2017.20150890
Abstract:
Both effective resource utilization and meeting the deadlines of high-criticality tasks are objectives of mixed-criticality systems. Existing work generally assumes that the system switches from low- to high-criticality mode immediately at the moment any high-criticality task has executed for its low-criticality worst-case execution time without completing. However, in real embedded applications, the criticality increase is event-triggered, by changes in the outer circumstances, control switching, and so on. Consequently, the instant at which the criticality level is raised can occur before, during, or after a task's execution. In this paper we focus on how the instant of the event-triggered criticality increase influences the scheduling of high-criticality, fixed-priority periodic tasks, based on response time analysis. A sufficient condition for scheduling high-criticality tasks is derived. We then discuss when and how to exchange priorities between two high-criticality tasks so that both meet their deadlines, and propose a priority exchanging algorithm. Evaluations illustrate the benefits of the schedulability condition and the priority exchanging algorithm.
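For reference, the standard fixed-priority response-time recurrence that this kind of analysis builds on is shown below; the paper's mixed-criticality extension that accounts for the event-triggered switch instant is not reproduced here.

```latex
% Worst-case response time of task \tau_i under fixed priorities:
% iterate until R_i^{k+1} = R_i^k, with hp(i) the set of higher-priority tasks,
% C_j the worst-case execution times and T_j the periods.
R_i^{k+1} = C_i + \sum_{j \in hp(i)} \left\lceil \frac{R_i^{k}}{T_j} \right\rceil C_j
```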
2017, 54(1): 192-201.
DOI: 10.7544/issn1000-1239.2017.20150139
Abstract:
Software-defined networking (SDN) separates the control plane and the data plane. This separation achieves flexible control by deploying fine-grained rules in the flow tables of switches, while potentially improving the utilization of network bandwidth. With the development of SDN, more and more campus and enterprise networks are beginning to deploy SDN-based networks. During this process, SDN has encountered some problems which don't exist in traditional IP networks. For example, some protocols used in existing IP networks face great challenges in SDN-based networks, such as TCP, the most fundamental protocol in TCP/IP networks. First, we make a penetrating analysis of the working mechanism of SDN, and give three examples to illustrate that it is quite possible to generate a large volume of Packet-In messages even in proactive mode. Then experiments are carried out, and the results show that end-to-end TCP connections experience large delays caused by SDN-specific operations such as the re-organization of rules in TCAM and fast Packet-In message generation. In the worst case, the delay caused by rule reordering reaches up to 10 seconds when the TCAM contains 4000 flow entries in our experiments. Based on the experimental results, we highlight two major problems in applying the traditional TCP protocol in SDN networks: one is that it is hard to establish a connection, and the other is transmission inefficiency. Through the analysis of the experimental results, we propose possible directions for solving the TCP inefficiency issue in SDN.
2017, 54(1): 202-211.
DOI: 10.7544/issn1000-1239.2017.20150595
Abstract:
The distributed TDMA approach for vehicular ad hoc networks (VANETs) does not take advantage of idle slots, failing to utilize radio resources effectively, and it is not free from packet dropping due to poor channel conditions. Cooperative communication, on the other hand, has drawn significant attention from both academia and industry in recent years, since it can effectively mitigate wireless channel impairments by utilizing the broadcast nature of the wireless channel. In this paper, a cooperative scheme for medium access control (MAC), referred to as cooperative distributed TDMA (Co-DTDMA), is presented for VANETs. In Co-DTDMA, neighboring nodes utilize unreserved slots to cooperatively retransmit packets that have failed to reach the destination node owing to a poor channel condition. Different from traditional cooperative approaches, all Co-DTDMA operations, such as synchronization among nodes, reserving a time slot, cooperation decision and cooperative transmission, are done in a fully distributed manner, which makes the scheme suitable for VANETs. In addition, cooperative transmission is conducted in unreserved slots, without interrupting normal transmission. Both theoretical analysis and experimental results demonstrate that the proposed scheme significantly increases the probability of successful packet transmission and decreases packet transmission delay under various network parameters.
2017, 54(1): 212-220.
DOI: 10.7544/issn1000-1239.2017.20150785
Abstract:
Virtual network embedding (VNE) is a critical fundamental technology for achieving multi-tenancy of resources in cloud environments. It aims to embed virtual networks into an appropriate underlying physical substrate network under the premise of satisfying the resource demands of the virtual networks. Most existing VNE algorithms aim at maximizing physical resource utilization, but pay less attention to the fairness problem in virtual network request acceptance. This paper puts forward a pre-configured virtual network mapping algorithm to improve mapping fairness, in which the mapping process is divided into two steps: a topology pre-configuration phase and an embedding phase. In the pre-configuration phase, larger virtual network topologies are transformed into equivalent but smaller ones with fewer nodes and links. This mechanism reduces the differences between virtual network requests so as to improve acceptance fairness. In the mapping phase, we establish a formal VNE model and use a discrete particle swarm optimization algorithm to solve it. In order to improve physical network resource utilization, a particle position enhancement mechanism is introduced that leverages node-repeatable mapping to save bandwidth resources. Simulation results show that the proposed algorithm is superior to existing similar algorithms in physical network resource utilization, revenue/cost ratio and acceptance fairness.
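A minimal discrete-PSO sketch for the node-mapping part of such a model: a particle position assigns each virtual node to a physical node, and each dimension moves toward the personal or global best with fixed probabilities standing in for the velocity terms. The fitness function, probabilities, and absence of link mapping or capacity checks are all assumptions for illustration, not the paper's formulation.

```python
import random

def discrete_pso(fitness, n_vnodes, pnodes, n_particles=20, iters=100,
                 c1=0.3, c2=0.4):
    """Discrete PSO for virtual-node placement (illustrative only).

    fitness(position) -> higher is better; position maps vnode index -> pnode.
    c1/c2 are the per-dimension probabilities of copying the personal/global best.
    """
    swarm = [[random.choice(pnodes) for _ in range(n_vnodes)]
             for _ in range(n_particles)]
    pbest = [p[:] for p in swarm]
    gbest = max(pbest, key=fitness)[:]
    for _ in range(iters):
        for k, pos in enumerate(swarm):
            for d in range(n_vnodes):
                r = random.random()
                if r < c1:
                    pos[d] = pbest[k][d]            # pull toward personal best
                elif r < c1 + c2:
                    pos[d] = gbest[d]               # pull toward global best
                else:
                    pos[d] = random.choice(pnodes)  # keep exploring
            if fitness(pos) > fitness(pbest[k]):
                pbest[k] = pos[:]
        gbest = max(pbest + [gbest], key=fitness)[:]
    return gbest

# toy usage: prefer placing virtual nodes on physical nodes with more free CPU
free_cpu = {"A": 8, "B": 4, "C": 2}
best = discrete_pso(lambda pos: sum(free_cpu[p] for p in pos),
                    n_vnodes=3, pnodes=list(free_cpu))
print(best)
```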
2017, 54(1): 221-231.
DOI: 10.7544/issn1000-1239.2017.20150452
Abstract:
K-barrier coverage is one of the hotspots in directional sensor networks. However, few directional barrier construction schemes so far consider both movement and rotation. This paper proposes a distributed directional strong barrier construction based on neighbor actuation (DBCNA) to create directional barrier coverage with minimum actuation energy consumption, i.e., the total of mobility and motility energy consumption. The preceding node alone determines the target location of the next node. If there is a node in the sensing region of the preceding node, the node with the largest X coordinate is selected as the next node. If not, the point at a sensing-radius distance from the preceding node in the horizontal direction is selected as the target location of the next node in the barrier, whose target working direction is determined by its original working direction: if the original sensing direction is in [α/2, π] (α is the sensing angle), the target working direction of the node is β=α/2; on the contrary, if the original sensing direction is in [π, 2π-α/2], the target working direction of the node is β=2π-α/2. This paper is also the first to adopt the maximum energy consumption of a single node and the mean square error of energy consumption, besides total energy consumption, to evaluate performance. Simulation results show that this method saves 50% of nodes and reduces mean energy consumption by 40%-50% compared with other methods. This research has important theoretical and practical significance.
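The working-direction rule stated above can be written directly as a small function, where α is the sensing angle and θ the node's original working direction in [0, 2π). This is just a transcription of the rule in the abstract; angles outside the two stated intervals are left unchanged as an assumption.

```python
import math

def target_direction(theta, alpha):
    """Target working direction of a barrier node under the rule described above.

    theta: original working direction in [0, 2*pi); alpha: sensing angle.
    """
    if alpha / 2 <= theta <= math.pi:
        return alpha / 2                  # rotate to beta = alpha/2
    if math.pi <= theta <= 2 * math.pi - alpha / 2:
        return 2 * math.pi - alpha / 2    # rotate to beta = 2*pi - alpha/2
    return theta                          # already aligned with the barrier

# example: a node with a 60-degree sensing angle originally facing straight up
print(math.degrees(target_direction(math.radians(90), math.radians(60))))  # 30.0
```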