
2010  Vol. 47  No. 10

Abstract:
Distributed system security is an important research field for multi-domain cooperation scenarios and has developed rapidly in recent years. In most practical cooperation processes, neither developers nor administrators will abandon legacy entitlement-management and access-control systems completely; instead, they expect to strike a balance between the openness of authorization and the cost of rebuilding. Against this background, authorization interoperation has become a representative research direction. From multiple perspectives, this paper reviews and analyzes the progress and evolution of interoperation theory and technology. For example, from the perspective of inter-domain cooperative architecture, interoperation can be divided into a loosely-coupled pattern and a federated pattern; from the perspective of security detection mechanism, it can be divided into mediator-based and mediator-free scenarios; from the perspective of modeling approach, it can be divided into management-oriented advance modeling and request-driven real-time modeling; from the perspective of assistive technology, it relates to trust-based, risk-based and semantics-based assistance, etc.; and from the perspective of policy integration level, it can be separated into authorization-management-oriented integration and resource-aggregation-oriented integration. For every typical scheme, the survey elaborates the basic theory and applicable scenarios, and analyzes technical features and limitations through comparison. Finally, a conclusion is drawn that summarizes the mainstream characteristics of this area and forecasts potential future research trends.
Abstract:
In map-based publish/subscribe systems, most typical event matching approaches start from the published events and then search for the matched subscriptions. Since there are always some redundant attributes across different events, the same attribute may appear in more than one event. When the number of published events is large, the same attributes in different events are matched repeatedly against the constraints in the subscriptions, i.e., there are redundant matches during the event matching process. To alleviate this redundant matching problem, a new event matching algorithm based on judging redundant attributes across events is presented in this paper. By judging the redundancy of attributes, merging the event sets to eliminate matching redundancy, and maintaining the subscription constraints in a multi-level index structure, the new algorithm improves matching efficiency in time complexity as well as maintainability in space complexity. A series of comparative experiments is conducted on this basis. The results clearly show that the proposed event matching algorithm is more efficient than similar approaches when the numbers of events and subscriptions are large, reaching hundreds or thousands.
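For illustration only, a minimal Python sketch of the redundancy-merging idea (the data layout and constraint operators are assumptions, not the authors' implementation): each distinct attribute-value pair is evaluated once against an attribute-indexed table of subscription constraints, and every event containing that pair reuses the result.

# Illustrative sketch: evaluate each distinct attribute-value pair only once
# against subscriptions indexed by attribute, then reuse the results per event.
from collections import defaultdict

def build_index(subscriptions):
    """subscriptions: {sub_id: {attr: (op, value)}} -> attr -> [(sub_id, op, value)]"""
    index = defaultdict(list)
    for sub_id, constraints in subscriptions.items():
        for attr, (op, value) in constraints.items():
            index[attr].append((sub_id, op, value))
    return index

def satisfies(event_value, op, value):
    return {"=": event_value == value,
            "<": event_value < value,
            ">": event_value > value}[op]

def match(events, subscriptions):
    """events: {event_id: {attr: value}}; returns {event_id: set(sub_id)}."""
    index = build_index(subscriptions)
    pair_hits = {}  # (attr, value) -> set of (sub_id, attr) constraints it satisfies
    for ev in events.values():
        for attr, val in ev.items():
            if (attr, val) not in pair_hits:   # redundant pairs are evaluated once
                pair_hits[(attr, val)] = {
                    (sid, attr) for sid, op, v in index[attr] if satisfies(val, op, v)}
    results = {}
    for eid, ev in events.items():
        hits = set().union(*(pair_hits[(a, v)] for a, v in ev.items())) if ev else set()
        # A subscription matches when all of its constraints are satisfied by the event.
        results[eid] = {sid for sid in subscriptions
                        if all((sid, a) in hits for a in subscriptions[sid])}
    return results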
Abstract:
Bandwidth smoothing can effectively improve network bandwidth utilization by reducing the burstiness of VBR traffic. The main idea is to transmit the video data of large frames to the client buffer before their due time. PCRTT can produce a suboptimal transmission schedule with low computational complexity; however, it is difficult to obtain good-performance parameters, including the segmentation and the rate computation. In this paper, we first analyze the key differences between the classic smoothing algorithms, each of which achieves some optimal metric. Then we propose an adaptive segmentation algorithm for PCRTT, called t-PCRTT, based on the fluctuation trend of the playback traffic around the average rate line. As a result, the algorithm can significantly decrease the required client buffer size with the same number of segments because it tightly tracks the playback data curve. Using an analytical function, we verify that the required buffer size of t-PCRTT is less than half that of PCRTT. Simulation results on MPEG-4 and H.264 video traces show that t-PCRTT outperforms PCRTT with the same number of segments, and that t-PCRTT almost achieves the optimal number of rate changes and supports VCR operations well, based on a VCR window, under various buffer sizes.
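A rough Python sketch of plain PCRTT-style scheduling may help fix the terms used above; it uses fixed equal-length segments and a synthetic frame trace, and does not reproduce t-PCRTT's adaptive segmentation around the average-rate line.

# Hedged sketch of PCRTT-style scheduling: one constant rate per fixed segment,
# chosen so the client buffer never underflows; frame sizes below are toy values.
import numpy as np

def pcrtt_like_schedule(frame_sizes, num_segments):
    playback = np.cumsum(np.asarray(frame_sizes, dtype=float))  # data consumed per slot
    n = len(playback)
    send = np.empty(n)
    bounds = np.linspace(0, n, num_segments + 1).astype(int)
    sent = 0.0
    for s, e in zip(bounds[:-1], bounds[1:]):
        slots = np.arange(1, e - s + 1)
        # Smallest constant rate keeping cumulative delivery >= cumulative playback.
        rate = np.max((playback[s:e] - sent) / slots)
        send[s:e] = rate
        sent += rate * (e - s)
    delivered = np.cumsum(send)
    buffer_needed = float(np.max(delivered - playback))          # peak client buffer
    return send, buffer_needed

sizes = [30, 10, 80, 20, 15, 60, 25, 10]   # toy VBR frame sizes (kB)
rates, buf = pcrtt_like_schedule(sizes, num_segments=2)
print(np.round(rates, 1), buf)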
Abstract:
Location service is the foundation of pervasive computing, intelligent transportation, and WSN applications, so wireless positioning is a focus of current research. Current location techniques include ToA, TDoA and RSSI. Considering that non-line-of-sight propagation introduces significant bias into measurements and thus very large errors into current location estimation techniques, a two-step position tracking scheme is proposed to mitigate the position tracking error. In this scheme, an SVR model relating node position to radio parameters is first established by supervised learning. Then the position estimate from the SVR model is smoothed by a game-theoretic method. Because the radio parameter is not used as a distance measure but as a feature to train the SVR model, the side effect of non-line-of-sight propagation is mitigated. By modeling the noise as the adversary of the position estimator, a position smoothing method based on game theory is introduced. Because of its ability to handle uncertain and unmodeled noise, game theory can in theory attain higher position tracking accuracy than the Kalman filter, which can only deal with Gaussian noise. Simulations show that the proposed scheme yields more accurate location estimation, especially in harsh environments.
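The first step can be illustrated with a hedged scikit-learn sketch; the fingerprint data, kernel and parameters below are invented, and the game-theoretic smoothing step is only indicated by a comment.

# Minimal sketch of SVR position regression from radio features (first step only).
import numpy as np
from sklearn.svm import SVR
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(0)
rssi = rng.uniform(-90, -30, size=(200, 4))     # synthetic RSSI from 4 anchors (dBm)
positions = rng.uniform(0, 50, size=(200, 2))   # known (x, y) training positions

# RSSI vectors are treated as features, not converted to distances, which is how
# the scheme sidesteps NLOS-biased range estimates.
model = MultiOutputRegressor(SVR(kernel="rbf", C=10.0, epsilon=0.5))
model.fit(rssi, positions)

estimate = model.predict(rssi[:1])              # raw SVR position estimate
# A smoothing step (game-theoretic in the paper) would be applied to `estimate` here.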
Abstract:
The defender needs to predict, detect and understand attacks, and to make good decisions about defense strategies. Because the goals of attackers and defenders are opposed and their strategies are interdependent, the selection of the optimal defense strategy is a complex issue. In this paper, the optimal defense strategy selection problem is defined and formalized. A new attack-defense stochastic game model is proposed to describe the offensive and defensive conflict between attackers and defenders in network security and to address the optimal defense strategy selection problem. The model is a dynamic multi-player, multi-state model obtained by extending the normal-form attack-defense game with a Markov decision process. By viewing the attacker's privilege state on each node as an element of the attack-defense stochastic game, we can model the dynamic transition of attack and defense states and compute the probabilities of attacker and defender behaviors. This paper analyzes the cost factors related to attack and defense and provides a cost-benefit analysis method that helps the defender evaluate and select defense strategies. An algorithm for defense strategy selection based on these models is proposed. A representative network example is provided to illustrate the models and demonstrate their efficacy in predicting attack behaviors and deciding optimal defense strategies.
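As a hedged illustration of the game-theoretic ingredient only, the following Python/scipy sketch solves a single-state zero-sum matrix game for the defender's optimal mixed strategy; the loss matrix is hypothetical, and the paper's full model additionally iterates such games over privilege states with Markov transitions.

# Illustrative single-state slice: defender's optimal mixed strategy by linear programming.
import numpy as np
from scipy.optimize import linprog

# loss[i, j]: hypothetical expected loss when defender plays i and attacker plays j.
loss = np.array([[8.0, 2.0, 5.0],
                 [3.0, 7.0, 4.0],
                 [6.0, 3.0, 1.0]])
m, n = loss.shape

# Variables: defender mixed strategy x (m values) and game value v.
# Minimize v subject to  sum_i x_i * loss[i, j] <= v  for every attacker action j.
c = np.r_[np.zeros(m), 1.0]
A_ub = np.c_[loss.T, -np.ones(n)]
b_ub = np.zeros(n)
A_eq = np.r_[np.ones(m), 0.0].reshape(1, -1)
b_eq = [1.0]
bounds = [(0, None)] * m + [(None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
defense_strategy, game_value = res.x[:m], res.x[m]
print("optimal defense mix:", np.round(defense_strategy, 3),
      "expected loss:", round(game_value, 3))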
Abstract:
A hierarchical online risk assessment model is proposed in this paper, which can assess in real time the risks caused by an ongoing intrusion scenario at the service level, host level and network level. D-S evidence theory is used to fuse multiple variables of an alert thread that reflect the risk trend, in order to calculate the risk index at the service level. The risk situation at the service level is evaluated by combining the risk index with the target risk distribution character, which is determined by the importance of the attacked services. A risk assessment approach based on the barrel principle is proposed to evaluate the risk situation at the host level. The security dependency network concept and its corresponding properties are defined, and an improved risk propagation algorithm is used to assess the risk situation at the network level. The proposed assessment model properly combines alert processing algorithms, including alert verification, alert aggregation, alert correlation and alert confidence learning, with risk assessment. As a result, the model handles subjectivity, fuzziness and uncertainty very well. The experiments show that the online risk assessment results accord with the real situation of attacks. Therefore the hierarchical risk assessment approach strongly supports decision-making on intrusion response time, intrusion response measures and response adjustment.
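The fusion step can be illustrated with a short Python sketch of Dempster's rule of combination; the hypotheses and mass values are made-up examples rather than the paper's alert variables.

# Small sketch of Dempster's rule, the D-S fusion step the model relies on.
from itertools import product

def combine(m1, m2):
    """m1, m2: dicts mapping frozenset hypotheses to mass; returns combined masses."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    return {h: v / (1.0 - conflict) for h, v in combined.items()}

RISK, SAFE = frozenset({"risk"}), frozenset({"safe"})
THETA = RISK | SAFE                               # frame of discernment
alert_confidence = {RISK: 0.6, THETA: 0.4}        # hypothetical evidence source 1
alert_frequency = {RISK: 0.7, SAFE: 0.1, THETA: 0.2}   # hypothetical evidence source 2
print(combine(alert_confidence, alert_frequency))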
Abstract:
Affinity propagation (AP) is a recently developed and effective clustering algorithm. Because of its simplicity, general applicability, and good performance, AP has been used in many data mining research fields. In AP implementations, the similarity measurement plays an important role. Conventionally, text mining is based on the whole vector space model (VSM) and its similarity measurements often fall into Euclidean space. Clustering texts in this way is simple and easy to perform. However, when the data scale grows, the vector space becomes high-dimensional and sparse, and the computational complexity grows exponentially. To overcome this difficulty, a non-Euclidean similarity measurement is proposed based on the definitions of the similar feature set (SFS), the rejective feature set (RFS) and the arbitral feature set (AFS). The new similarity measurement not only breaks the Euclidean space constraint but also captures the structural information of documents. Therefore, a novel clustering algorithm, named weight affinity propagation (WAP), is developed by combining the new similarity measurement with AP. The benchmark dataset Reuters-21578 is used to test the proposed algorithm. Experimental results show that the proposed method is superior to the classical k-means, traditional SOFM, and affinity propagation with the classical similarity measurement.
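For orientation, a minimal scikit-learn sketch of affinity propagation over a precomputed similarity matrix is given below; the toy set-overlap similarity stands in for the SFS/RFS/AFS-based measure, whose exact formula is not reproduced here.

# Sketch: AP clustering with a custom, non-Euclidean similarity matrix.
import numpy as np
from sklearn.cluster import AffinityPropagation

docs = [{"bank", "loan", "rate"}, {"loan", "credit", "bank"},
        {"match", "goal", "team"}, {"team", "coach", "goal"}]

def similarity(a, b):
    # Placeholder set-overlap (Jaccard) similarity, not the WAP measure.
    return len(a & b) / len(a | b)

S = np.array([[similarity(a, b) for b in docs] for a in docs])
# With affinity="precomputed", the preference defaults to the median similarity.
ap = AffinityPropagation(affinity="precomputed", random_state=0).fit(S)
print(ap.labels_)   # cluster assignment for each document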
Abstract:
Attribute value extraction is an important task in information extraction. However, heterogeneous attributes and the natural language processing bottleneck make the problem difficult and complex. In addition, most quantity attributes are single-valued and variable, so it is difficult to find the accurate values of those attributes. Most research works are based on semi-supervised methods or lexico-syntactic patterns; however, these methods overlook the properties of quantity attributes and require much effort to ensure the reliability of extraction results. In this paper, the definition of a meta-property is given, and a novel approach to attribute-value extraction based on meta-properties is proposed to avoid the drawbacks of traditional methods. The system is implemented, and its overall structure and major components are presented, including textual information source selection, candidate extraction, candidate evaluation and automatic verification. Experiments are carried out on 5 entity types and their 9 subtypes from Baidu Encyclopedia. Experimental results show that the new approach achieves an average precision of up to 71% and an average recall of 89%, significantly higher than general query-based approaches and traditional lexico-syntactic-pattern-based methods. The new approach also has better generalization capability for open-domain attribute-value extraction, especially for single-valued attributes.
Abstract:
The leading partitional clustering technique, K-Modes, is one of the most computationally efficient clustering methods for categorical data. In the traditional K-Modes algorithm, the simple matching dissimilarity measure is used to compute the distance between two values of the same categorical attribute: the two values are compared directly, yielding a difference of zero when they are identical and one otherwise. However, the similarity between categorical values is not considered. In this paper, a new distance measure based on rough set theory is proposed, which overcomes the shortcoming of the simple matching dissimilarity measure and is used within the traditional K-Modes clustering algorithm. When computing the distance between two values of the same categorical attribute, the new measure takes into account not only their difference but also the discernibility of the other related categorical attributes with respect to them. The time complexity of the modified K-Modes clustering algorithm is linear in the number of data objects, so it can be applied to large data sets. The performance of the K-Modes algorithm with the new distance measure is tested on real-world data sets. Comparisons with the K-Modes algorithm based on several different distance measures illustrate the effectiveness of the new distance measure.
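A baseline sketch of K-Modes with the simple matching dissimilarity described above (0 if two values are identical, 1 otherwise) is shown below in Python; the rough-set-based measure proposed in the paper is not reproduced, and the data is a toy example.

# Baseline sketch: K-Modes clustering with the simple matching dissimilarity.
import numpy as np

def simple_matching(x, mode):
    return sum(a != b for a, b in zip(x, mode))

def k_modes(data, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    data = [tuple(row) for row in data]
    modes = [data[i] for i in rng.choice(len(data), k, replace=False)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for row in data:
            clusters[min(range(k), key=lambda c: simple_matching(row, modes[c]))].append(row)
        # New mode: the most frequent value per attribute within each cluster.
        for c, rows in enumerate(clusters):
            if rows:
                cols = list(zip(*rows))
                modes[c] = tuple(max(set(col), key=col.count) for col in cols)
    return modes

data = [("red", "small", "round"), ("red", "small", "oval"),
        ("blue", "large", "round"), ("blue", "large", "square")]
print(k_modes(data, k=2))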
Abstract:
The action-based encoding method is a novel propositional planning encoding method that reduces state variables; it is adopted by SATPLAN2006, the state-of-the-art optimal planner of the 2006 International Planning Competition. In this paper, based on the action-based encoding method, another automated propositional planning encoding method that reduces action variables, called the proposition-based encoding method, is proposed. First, the theoretical foundations for proposing a new encoding method in the planning-as-satisfiability framework are analyzed, and the theoretical composition of the proposition-based encoding method is proposed for automatically constructing a series of propositional encodings during planning. Then the soundness and completeness of the proposed encoding method are justified, and feasible actual encodings for some axioms are described. Finally, the distinctions between the proposed encoding method and existing encoding methods are compared. To analyze their suitability for different planning domains, the proposition-based encoding method is implemented in the SATPLAN2006 planner and compared with the action-based encoding on several benchmarks adopted by the International Planning Competition. The experimental results show that for sequential planning on Blocks World problems the proposition-based encoding is superior to the action-based encoding, while for parallel planning on Logistics problems the latter is superior to the former.
Abstract:
Accurately predicting the geographic position of users can effectively improve the quality of location-based services. To address the low predictive power of the basic Markov model and the difficulty of determining the order of a higher-order Markov model in practice, the authors present a position prediction approach based on a novel adaptive multi-order Markov model. The approach first processes the raw location information of a user based on regular-shape abstraction, and automatically determines the order of the model in a heuristic way, where the heuristic is obtained from the user's training data represented by a tree structure. It then predicts the future position of the user using the Markov model with the most appropriate order. Finally, the performance of the adaptive multi-order Markov model is evaluated on real location data. The results show that the prediction accuracy and prediction length of the adaptive multi-order Markov model are consistently higher than those of the multi-order Markov model: the average prediction accuracy is improved by nearly 20% and the average prediction length by nearly 10 unit regions. Moreover, the performance of the adaptive multi-order Markov model is not easily influenced by the quality of the training data.
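A hedged Python sketch of a variable-order Markov predictor over region IDs is given below; the paper's tree-based order heuristic is approximated here by falling back from the longest context with sufficient support, and the trajectory is synthetic.

# Sketch: count contexts of several lengths, predict from the longest usable context.
from collections import defaultdict, Counter

def train(trajectory, max_order):
    counts = defaultdict(Counter)          # context tuple -> next-region counts
    for order in range(1, max_order + 1):
        for i in range(len(trajectory) - order):
            context = tuple(trajectory[i:i + order])
            counts[context][trajectory[i + order]] += 1
    return counts

def predict(counts, recent, max_order, min_support=2):
    for order in range(max_order, 0, -1):  # prefer the longest context with enough support
        context = tuple(recent[-order:])
        if sum(counts[context].values()) >= min_support:
            return counts[context].most_common(1)[0][0]
    return None                            # no context with enough support

trajectory = list("ABCABDABCABC")          # toy sequence of visited region IDs
counts = train(trajectory, max_order=3)
print(predict(counts, recent=list("DAB"), max_order=3))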
Abstract:
Sequence data is an important data type, ubiquitous in many domains such as text, Web access logs and biological databases. Similarity query over this kind of data is an important means of extracting useful information. One key factor for high-performance similarity query in huge sequence databases is the filtering power of the query algorithm; that is, designing filters that can quickly filter out strings that cannot match the query string is very important. Combining the metric property of the sequence distance with sequence characteristics, an algorithm called SSQ_MF based on multiple filters is proposed, whose filtering power is further improved compared with query algorithms using a single filter. The filters used in SSQ_MF are a length filter, a prefix filter, and a reference-based filter. Statistical information about the string database is pre-computed, and dedicated data structures are used to store this information. Furthermore, each filter's filtering size is effectively estimated, and a model is built in which the optimal filtering order is determined by each filter's filtering size. Comprehensive experimental results demonstrate that, in terms of query performance, SSQ_MF outperforms algorithms with a single filter and algorithms with multiple filters executed in a random order.
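The cheapest of the three filters can be illustrated with a short Python sketch: the length filter is applied before edit-distance verification; the prefix and reference-based filters and the ordering model are omitted, and the data is a toy example.

# Sketch: length filter in front of edit-distance verification.
def edit_distance(s, t):
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (cs != ct)))
        prev = cur
    return prev[-1]

def similarity_query(database, query, tau):
    """Return strings within edit distance tau of query, length-filtering first."""
    # Two strings whose lengths differ by more than tau cannot be within distance tau.
    candidates = [s for s in database if abs(len(s) - len(query)) <= tau]
    return [s for s in candidates if edit_distance(s, query) <= tau]

db = ["ACGTT", "ACGAT", "TTTTT", "ACG", "ACGTA"]
print(similarity_query(db, "ACGTT", tau=1))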
Abstract:
The log is important to a database system: it is the foundation for maintaining correctness and consistency. The existing database log mechanism only stores the history of transactions but cannot record the relationships between transactions. When facing an attack, a database with a traditional log system can only stop the database service and recover to the point where the attack occurred. This kind of recovery abandons all transactions committed after the malicious transaction, regardless of whether they are related to it, which means the database system is out of service from the fault point until the end of recovery. By exploiting this vulnerability, an attacker can commit malicious transactions constantly so that the database is always in a state of recovery. In this paper, we present a new model of transaction recovery logging and intrusion response. We use ASM to describe the model, give a formal definition of transaction dependency, and prove the correctness and categoricity of the model. A database with the transaction recovery log and intrusion response mechanism rolls back only the affected transactions rather than all transactions committed after the malicious attack. This method does not stop the service of the database system and significantly enhances the recovery performance for defensive information warfare.
Abstract:
Failure recovery optimization is one important way to enhance the efficiency of long-running transaction (LRT) processing. In this paper, aiming at the efficiency problem of LRT failure recovery, LHM (long running transaction hierarchical model), a hierarchical model for LRTs, is established, which divides LRTs into a series of sub-transactions at different levels. LHM supports versatile transaction properties of LRTs and provides techniques for processing branch and loop structures of LRTs. Based on LHM, LHFR (LRT hierarchical failure recovery), a hierarchical failure recovery algorithm, is proposed. The algorithm uses compensation and functionally equivalent replacement, and supports automatic recovery of failures during the execution of LRTs. LHFR guarantees the semantic atomicity and durability of long-running business processes. By restricting the compensation scope to the lower levels of complex LRTs, LHFR limits the number of sub-transactions to be compensated, thus reducing unnecessary loss of time and enhancing the efficiency of failure recovery. A comprehensive simulation is also presented, which confirms the correctness and high efficiency of the LHFR algorithm. Experimental results show that LHFR can reduce the time required for failure recovery, and that it can decrease the probability of manual intervention required by sub-transactions that cannot be compensated.
Abstract:
Effectively detecting copyrighted images that have suffered both single-processing attacks and geometric attacks is one of the hotspots and difficulties in image copy detection research. Most existing block-based image copy detection algorithms can successfully resist noise-like distortions, but they are quite fragile to geometric attacks such as rotation, shift, translation, scaling and cropping, because the content of each block is changed after these attacks. In this paper, a robust image copy detection algorithm based on the ordinal measure of full DCT coefficients is proposed. The ordinal measure of selected low- and middle-frequency coefficients, computed from the full DCT transformation of the Y component of the YCbCr color space, is used as the image feature vector. The middle- and low-frequency full DCT coefficients change regularly with scaling, so the proposed method can resist scaling distortion. Meanwhile, the ordinal measure is introduced to resist single-processing attacks. Moreover, the rotation compensation strategy used in this algorithm can resist rotation attacks with angles of less than 30°. The experimental results show that the algorithm achieves higher precision and recall than existing methods under rotation and scaling distortions.
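A minimal Python/scipy sketch of the feature step (full-frame DCT on the Y component and ordinal ranking of a fixed low/middle-frequency band) is shown below; the coefficient selection, the rank-distance comparison and the synthetic "scaled copy" are illustrative assumptions.

# Sketch: full-frame DCT, ordinal measure of a low/mid-frequency band, rank comparison.
import numpy as np
from scipy.fft import dctn

def ordinal_signature(y_channel, n=16):
    coeffs = dctn(y_channel.astype(float), norm="ortho")
    band = coeffs[:8, :8].ravel()[1:n + 1]      # skip the DC term, keep low/mid terms
    return np.argsort(np.argsort(band))         # ranks of the selected coefficients

def rank_distance(sig_a, sig_b):
    return float(np.abs(sig_a - sig_b).mean())  # 0 means identical rank order

rng = np.random.default_rng(1)
original = rng.integers(0, 256, size=(128, 128))
scaled = np.kron(original, np.ones((2, 2)))     # crude stand-in for a 2x scaled copy
print(rank_distance(ordinal_signature(original), ordinal_signature(scaled)))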
Abstract:
In recent years, football video retrieval and abstraction techniques have developed rapidly. As one of the key technologies of video retrieval and abstraction for soccer video, highlight extraction has high academic research value and wide application prospects. Based on an analysis of existing football video research methods, a highlight extraction system using affective arousal modeling is proposed from the angle of the audience's emotional changes, based on the theory of Hanjalic's affection curve. Arousal features are extracted from football video to construct the arousal model and generate the arousal curve, and highlights are then extracted based on the arousal curve. Compared with Hanjalic's work, some improvements and further research are proposed on the basis of Hanjalic's excitement curve. A new feature, named shot intensity, is designed to construct the arousal model in place of the motion intensity feature, which improves the system's recall, accuracy and computing performance. Domain knowledge is applied in the highlight shot locating process to achieve high accuracy, and highlights can also be automatically filtered according to the user's required viewing time. The experimental results indicate that the proposed approach is effective and that shot intensity can replace motion intensity in constructing the arousal model.
Abstract:
With the increasing demand for high I/O throughput, highly parallel data transfer and highly scalable storage systems, especially in the supercomputing field, traditional storage systems have difficulty meeting the requirements. OBS (object-based storage) is a new network storage architecture. In an OBS system, local storage space allocation is managed by the intelligent OSD (object-based storage device), and at present the OSD is mainly managed by a common file system. However, when a common file system deals with a flat namespace, especially over long-term use, its performance degrades seriously. In this paper a new file system is proposed. XOBFS, which stands for extensible hashing object-based storage file system, uses a fixed-length block allocation strategy combined with free-block bitmap management. Object attributes are managed by extensible hashing, so that the attributes of the same object are stored in adjacent hash buckets. XOBFS applied to OSD has many advantages: the metadata is small, long-term performance does not degrade, object attributes are managed effectively, and so on. Test results show that the throughput of XOBFS for large objects is better than that of a traditional file system. XOBFS provides an effective method for storage management in OSD.
Abstract:
Testing is very important and time-consuming in the development of high-quality software systems. This paper proposes an automatic testing tool, JUTA, for unit testing of Java programs. The approach is based on precise analysis of the programs. JUTA first employs the Java optimization framework Soot to parse a single Java method into byte code and translate it into a control flow graph (CFG). It then performs depth-first or breadth-first search on the CFG to extract paths from it; techniques such as path length restriction are used to prevent path number explosion. Finally, JUTA analyzes the paths using a combination of symbolic execution and constraint solving. The goal of path analysis is twofold. First, it can generate a set of test cases satisfying a test criterion such as statement coverage; the test set typically has a small number of test cases, all of which are executable. Second, in addition to test generation for dynamic testing, it can also be used for static testing: JUTA can reveal certain kinds of errors in the source code automatically if the user provides proper assertions describing the errors. The experimental results show that this tool is efficient for both dynamic and static testing.
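The path-extraction step can be pictured with a short sketch of bounded depth-first search over a CFG given as an adjacency dict; the Soot front end and the symbolic-execution back end are outside this snippet, and Python is used here for brevity even though JUTA itself targets Java programs.

# Sketch: bounded DFS path enumeration over a toy CFG (path length restriction included).
def extract_paths(cfg, entry, exits, max_len):
    """Enumerate entry-to-exit paths no longer than max_len nodes."""
    paths, stack = [], [(entry, [entry])]
    while stack:
        node, path = stack.pop()
        if node in exits:
            paths.append(path)
            continue
        if len(path) >= max_len:          # path length restriction to curb explosion
            continue
        for succ in cfg.get(node, []):
            stack.append((succ, path + [succ]))
    return paths

# Toy CFG of a method with one branch and one loop back-edge.
cfg = {"entry": ["cond"], "cond": ["then", "exit"], "then": ["cond"]}
for p in extract_paths(cfg, "entry", {"exit"}, max_len=6):
    print(" -> ".join(p))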
Abstract:
To address the deficiencies of existing context-aware recommendation algorithms, this paper proposes a context-aware recommendation algorithm based on fuzzy C-means clustering. First, the fuzzy C-means clustering algorithm is used to cluster historical contextual information, producing clusters and a membership matrix. Then the contextual information of the active user is matched with the clusters of historical contextual information, and non-member data that satisfy the condition are mapped into member data by using the clustering membership degree as a mapping coefficient. Finally, the ratings of member users that conform to the active contextual information are chosen to generate recommendations for the user. Compared with existing algorithms, the proposed algorithm not only solves the problem of inaccurate recommendation caused by changes in user context, but also overcomes the limitations of traditional hard clustering by using the fuzzy C-means algorithm. Moreover, the data sparseness problem caused by clustering is alleviated by using the membership mapping function. Experiments are conducted on the MovieLens dataset and verify the effectiveness of the proposed algorithm.
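A minimal numpy sketch of fuzzy C-means is given below to show the membership matrix that the recommendation step relies on; the data, cluster count and fuzzifier m are toy choices, not the paper's settings.

# Sketch: fuzzy C-means producing cluster centers and a soft membership matrix.
import numpy as np

def fuzzy_cmeans(X, c, m=2.0, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)        # memberships of each point sum to 1
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        # Standard FCM membership update: u_ik proportional to dist_ik^(-2/(m-1)).
        U = 1.0 / (dist ** (2 / (m - 1)) *
                   np.sum(dist ** (-2 / (m - 1)), axis=1, keepdims=True))
    return U, centers

# Toy "contextual information" vectors, e.g. time-of-day and companion encoded numerically.
X = np.array([[0.1, 0.2], [0.15, 0.1], [0.9, 0.8], [0.85, 0.95]])
memberships, centers = fuzzy_cmeans(X, c=2)
print(np.round(memberships, 2))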