ISSN 1000-1239 CN 11-1777/TP

Table of Contents

15 January 2011, Volume 48 Issue 1
Research on Hyper-Node Controller for High Performance Computer
Wang Kai, Chen Fei, Li Qiang, Li Xiaomin, An Xuejun, and Sun Ninghui
2011, 48(1):  1-8. 
A traditional high performance computer (HPC) consists of two parts: nodes and an interconnection network, and each node can be further divided into a processing unit and a node controller. The processing unit usually adopts a symmetric multi-processor (SMP) or cache-coherent non-uniform memory access (NUMA) structure. To maintain cache coherence efficiently, the number of processors in a processing unit is very limited. Therefore, a petaflops HPC would possess tens of thousands of nodes, which places very high requirements on both the latency and the bandwidth of the interconnection network. The hyper-node controller introduced in this paper can connect several processing units simultaneously, and together they construct a hyper-node. Hyper-nodes greatly reduce the scale of the interconnection network, which lowers its design complexity and guarantees its performance. The key techniques in the hyper-node controller, including support for a global address space, direct memory access, remote load/store, a global hardware lock, and a multi-rail interconnection network, effectively lower communication latency, guarantee sufficient bandwidth, and enhance synchronization performance. The hyper-node controller is implemented with an FPGA, and a prototype system is built. Test results show that a cluster with hyper-nodes has very low latency and good scalability in bandwidth.
Research on FPGA-Based Paging-Simulation Model for SIMD Architecture
He Yi, Ren Ju, Wen Mei, Yang Qianming, Wu Nan, Zhang Chunyuan, and Guo Min
2011, 48(1):  9-18. 
With the appearance of massively compute-intensive applications such as multimedia and complex scientific computing, SIMD architecture has become a research hotspot due to its intrinsically scalable data-parallel structure. The low speed of software simulation brings much inconvenience to large-scale SIMD architecture research. FPGA-based simulation is much faster, but the scale of the target system is often limited by the capacity of the FPGA chips. Using more FPGA chips, or chips of larger capacity, not only increases design complexity but also raises research cost. In this paper, we propose a novel FPGA-based paging-simulation model for SIMD architecture, which efficiently reduces the computing and memory resource consumption of the simulation system. On the basis of this model, large-scale SIMD architectures can be simulated with limited FPGA resources. We build a simulation system for MASA on an Altera Stratix II EP2S180 chip. The experimental results show that, without any simulation optimization, the largest system that can be implemented on the EP2S180 is an 8-cluster MASA, while with the paging-simulation model a 256-cluster MASA can be implemented, and the increase in simulation time is acceptable.
Decoded Instruction Cache for Reducing Startup Overhead in Co-Designed Virtual Machines
Chen Wei, Wang Zhiying, Xiao Nong, Shen Li, and Lu Hongyi
2011, 48(1):  19-27. 
Co-designed virtual machines (co-VM) provide the processor designer with new opportunities for innovation through combined hardware and software. A co-VM uses dynamic binary translation to implement binary compatibility between different instruction set architectures (ISA). Interpreting and translating source ISA binaries affects the startup performance of a co-VM. In exploring the startup performance of our VM, which employs interpretation and superblock translation, we observe that cold code interpretation causes the major startup overhead of the co-VM and that redundant source instruction decoding forms the bottleneck of interpretation. We observe the locality of the interpretation routine and propose a hardware decoded instruction cache (DICache) for saving instruction information decoded during interpretation. The DICache can be organized as a normal cache and maintained by hardware. We implement a co-VM and run benchmarks from SYSmark 2004 SE to evaluate the DICache performance. We also evaluate the implementation overhead of the DICache, such as area and power consumption. It is demonstrated that the DICache can significantly reduce redecoding operations and speed up interpretation, bringing a speedup of 2.4 on average relative to the startup performance of a normal co-VM. Compared with other related optimization techniques, the DICache performs more efficiently with better adaptability.
Optimizing Soft Error Rate and Overhead of Circuits Based on Sensitive Registers Replacement
Sun Yan, Zhang Minxuan, Li Shaoqing, and Gao Changlei
2011, 48(1):  28-35. 
Continuous scaling of VLSI technology results in increasing vulnerability of combinational and sequential logic circuits to radiation-induced soft errors. Consequently, soft error issues play an important role in design considerations for nanometer integrated circuits. Conventional triple modular redundancy can solve the soft error problem, but its serious area overhead is unacceptable, and other existing hardening techniques also lead to large area overhead. We present novel soft error immunity techniques for logic circuits, which can effectively optimize both the soft error rate and the area overhead. Firstly, we propose an evaluation metric, FAP, for circuit hardening, which considers both the circuit's soft error rate and its area overhead. Secondly, we propose an efficient greedy-algorithm-based register replacement technique to immunize against soft errors, which replaces only the most sensitive registers, on the paths with the largest soft error rate, with triple modular redundancy registers. Thirdly, to overcome the limitation that the greedy algorithm cannot optimize between reliability and area, we present a heuristic algorithm which achieves the best balance between soft error rate and area overhead. Experimental results show that the greedy algorithm can reduce the soft error rate by 90% with 50% area overhead, while the heuristic algorithm can reduce the soft error rate by over 90% with only 45% overhead. Compared with previous works, the proposed techniques achieve a better tradeoff between soft error rate and area overhead.
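The greedy selection step can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the register list, sensitivity values, and the TMR area factor are hypothetical placeholders.

```python
def greedy_harden(registers, area_budget, tmr_area_factor=3.0):
    """registers: list of (name, soft_error_rate, area) tuples.
    Greedily replace the most error-sensitive registers with TMR copies
    until the extra area budget is exhausted."""
    hardened, used = [], 0.0
    # Visit registers from most to least sensitive (largest soft error rate first).
    for name, ser, area in sorted(registers, key=lambda r: r[1], reverse=True):
        extra = area * (tmr_area_factor - 1.0)  # added area of triplication
        if used + extra <= area_budget:
            hardened.append(name)
            used += extra
    return hardened
```

For example, with unit-area registers and a budget of 4.0, only the two most sensitive registers fit within the budget.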
An On-Chip Dynamic Virtual Channel Router Based on a Two-Step Flow Control Method
Peng Yuanxi, Zhu Honglei, and Chen Haiyan
2011, 48(1):  36-44. 
On-chip area and power are severely limited, and so is packet buffer capacity; thus one of the key issues in NoC design is how to use packet buffers efficiently. A dynamic virtual channel buffer is an efficient approach, but it may aggravate congestion or even cause deadlock. This paper proposes an on-chip dynamic virtual channel (DAVC) router based on a two-step flow control method, which splits a packet into a head and a body and applies flow control algorithms to them separately; this retards congestion and prevents deadlock. There is a centralized shared buffer on every input port of the router. The centralized shared buffer is dynamically allocated to each packet of that input port according to traffic and flow control conditions; it achieves zero-cycle delay between a read and a write and supports variable packet lengths. The router also uses a two-level crossbar arbiter and a novel tree-based virtual channel allocator, which allocates output virtual channels dynamically. The experimental results show that, compared with a static virtual channel (SAVC) router, the DAVC router provides 23.2% throughput improvement and 27.2% delay reduction with the same buffer capacity, and provides 28.3% area and 23.8% power savings with similar performance at half the buffer capacity.
Study of Sampling Methods on Data Mining and Stream Mining
Hu Wenyu, Sun Zhihui, and Wu Yingjie
2011, 48(1):  45-54. 
Sampling is an efficient and widely used approximation technique. It enables many algorithms to be applied to huge datasets by dramatically scaling down the dataset for data mining and stream mining. Through a detailed review, a taxonomic framework of sampling algorithms based on uniform sampling and biased sampling is presented; meanwhile, representative sampling algorithms such as reservoir sampling, concise sampling, count sampling, chain-sampling, and DV sampling are analyzed, compared, and evaluated. Owing to the limitations of uniform sampling in some applications, such as queries with relatively low selectivity, outlier detection in large multidimensional data sets, and clustering over data streams with skewed Zipf distributions, the need for biased sampling methods in these scenarios is fully discussed. In addition to listing successful applications of sampling techniques in data mining, statistical estimation, and stream mining to date, we survey the application and development of sampling techniques, especially traditional classic techniques such as progressive sampling, adaptive sampling, stratified sampling, and two-phase sampling. Finally, future challenges and directions with respect to data stream sampling are discussed.
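As an illustration of one of the surveyed techniques, classic reservoir sampling maintains a uniform random sample of k items over a stream of unknown length, replacing a random slot with item i with probability k/(i+1):

```python
import random

def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from a one-pass stream."""
    rng = rng or random.Random(0)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)          # fill the reservoir first
        else:
            j = rng.randrange(i + 1)     # keep item i with probability k/(i+1)
            if j < k:
                sample[j] = item
    return sample
```

If the stream is shorter than k, the whole stream is returned.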
A Huge Dimension Table Join Algorithm for Construction of StreamCube
Gan Liang, Jia Yan, Li Aiping, and Jin Xin
2011, 48(1):  55-67. 
Join is one of the most important operations in relational databases, and it is also important in data stream management systems. In the group-bys that construct a StreamCube, joins are performed first, and a join between a data stream and a huge dimension table (such as an IP address table) consumes the limited CPU power and memory capacity. Generally, a huge dimension table must be partitioned into small tables, and each partition is loaded into memory in turn, which causes frequent disk I/O. To avoid this shortcoming, we compress huge dimension tables losslessly by taking the characteristics of dimension tables and their join-key layers into account and finding join-key redundancies in those tables. Thus, a dimension table with n concept columns is compressed into n ranged join-key dimension tables (RJ-DT) by reducing join-key redundancies and using the decomposed storage model of column stores. Each RJ-DT is composed of a start column, an end column, and several concept columns. However, this raises a new issue: a non-equijoin, called a range join, between the data stream and the RJ-DTs. We therefore propose a multi-join algorithm for huge dimension tables, called multi dynamic index nested-loop join (MDI-NL), which implements the non-equijoin efficiently and also supports multi-joins. MDI-NL constructs an RB+Tree index for each RJ-DT before the join. During the join operation, it dispatches indexes dynamically according to the demands of the group-bys, obtaining exactly the smallest index and making MDI-NL more powerful. Through theoretical analysis and extensive experiments, it is found that MDI-NL outperforms other join algorithms by an order of magnitude for huge dimension table joins and is highly practical.
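The range join between stream keys and an RJ-DT's start/end columns can be illustrated with a simple sorted-array index; this is a stand-in for the paper's RB+Tree, and the table contents below are hypothetical.

```python
import bisect

def build_range_index(rjdt):
    """rjdt: list of (start, end, concept) rows with non-overlapping ranges.
    Returns a sorted index usable for point-in-range lookups."""
    rows = sorted(rjdt)
    starts = [r[0] for r in rows]
    return starts, rows

def range_join(stream_keys, index):
    """Range join: match each stream key to the row whose [start, end] contains it."""
    starts, rows = index
    out = []
    for key in stream_keys:
        i = bisect.bisect_right(starts, key) - 1  # last range starting <= key
        if i >= 0 and key <= rows[i][1]:
            out.append((key, rows[i][2]))
    return out
```

Each lookup is O(log n), so one pass over the stream performs the non-equijoin without nested scans of the dimension table.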
Probabilistic Skyline Computation on Existentially Uncertain Data
Wang Xiaowei, Jia Yan, Yang Shuqiang, and Tian Li
2011, 48(1):  68-76. 
In recent years, many advanced technologies have been developed to collect and analyze massive data. In many cases, the data may contain errors or may be unreliable. Traditional methods often lead to unacceptable complexity when managing and mining such uncertain data, so much attention has been paid to query evaluation on uncertain data. Probabilistic skyline computation finds the set of objects whose skyline probabilities exceed a given threshold, which is of great value in multi-criteria optimization problems. Existing algorithms require indices to be built in advance, but building indices is frequently impracticable or unprofitable when the data has high volume, large dimensionality, or a high update frequency; thus a non-indexed algorithm is necessary. In this paper, we propose the first non-indexing algorithm for probabilistic skylines on existentially uncertain data. The algorithm dynamically maintains a probabilistic constrained space using the objects scanned so far; future objects can be safely pruned if they fall into that space. The algorithm is evaluated through extensive experiments on both real and benchmark synthetic data sets. It is shown that when the dimensionality is no more than 4, the pruning rate exceeds 99.8%, and more than 50% of the computation time is saved over the naive algorithm.
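Under the standard existential-uncertainty model, an object's skyline probability is its own existence probability times the probability that none of its dominators exist. A brute-force sketch of that definition (not the paper's pruning algorithm; the sample points are hypothetical):

```python
def dominates(a, b):
    """a dominates b (minimization): no worse in every dimension, better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline_probabilities(points):
    """points: list of (coords, existence_probability).
    Returns each object's skyline probability: P(it exists) * P(no dominator exists)."""
    result = []
    for coords, p in points:
        q = p
        for other, po in points:
            if other != coords and dominates(other, coords):
                q *= (1.0 - po)  # survives only if this dominator is absent
        result.append(q)
    return result
```

A threshold query then simply keeps the objects whose probability exceeds the threshold.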
Continuous Dynamic Skyline Queries over Data Stream
Zhang Li, Zou Peng, Jia Yan, and Tian Li
2011, 48(1):  77-85. 
Skyline queries are capable of retrieving interesting points from a large data set according to multiple criteria. As an essential query, skyline computation over data streams is very important for many online applications, including mobile environments, network monitoring, communication, sensor networks, and stock market trading. The problem of skyline computation has attracted considerable research attention. Different from most popular skyline processing methods, this paper focuses on constrained skyline and dynamic skyline processing over data streams. Instead of computing the skyline over the whole data set, this kind of skyline query only needs to process part of the data set, and there may be thousands of such queries in the system. To deal with the random additions and deletions of tuples over a data stream, we employ a grid-based index to store the tuples and put forward an algorithm to compute and maintain the skyline set based on it. Exploiting the grid index, we define an influence area for every query to minimize the number of cells that need to be processed when new tuples arrive and old tuples expire. Only tuples in cells belonging to the influence area are processed; tuples outside the influence area are ignored, and CPU time is saved. Theoretical analysis and experimental evidence show the efficiency of the proposed approaches.
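The grid idea can be sketched as follows: tuples hash to cells by coordinate, and a constrained query touches only the cells overlapping its constraint box. This is a minimal illustration, assuming a uniform cell size; the paper's influence-area maintenance is not reproduced.

```python
def cell_of(point, cell_size):
    """Map a point to its grid cell key."""
    return tuple(int(c // cell_size) for c in point)

class GridIndex:
    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = {}  # cell key -> list of points

    def insert(self, point):
        self.cells.setdefault(cell_of(point, self.cell_size), []).append(point)

    def points_in_region(self, low, high):
        """Return points inside the box [low, high], visiting only overlapping cells."""
        lo, hi = cell_of(low, self.cell_size), cell_of(high, self.cell_size)
        out = []
        for key, pts in self.cells.items():
            if all(l <= k <= h for k, l, h in zip(key, lo, hi)):
                out.extend(p for p in pts
                           if all(a <= c <= b for c, a, b in zip(p, low, high)))
        return out
```

A constrained skyline query would then run dominance tests only over the points returned for its constraint region.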
All-Nearest-Neighbor Queries Processing in Spatial Databases
Liao Haojun, Han Jizhong, and Fang Jinyun
2011, 48(1):  86-93. 
All-nearest-neighbor (All-NN) query processing for R-tree-like hierarchical access methods in spatial database systems follows a batch paradigm, based on an indexed nested loop for the NN query of every object and a single-directional expansion strategy in tree traversal. The authors propose a method for All-NN queries that exploits the location proximity of objects when the primary data set P and the reference data set R are the same (P = R). A heuristic distance metric is used to reduce the volume of the search space, instead of the conventional MINDIST and MINMAXDIST metrics, and hence the number of retrieved index nodes, as well as the CPU time, is reduced considerably. The All-NN processing consists of two steps. First, an NN query is performed within each leaf node by a plane-sweep algorithm, in which all objects and their nearest-neighbor candidates reside in the current leaf node. Then an enhanced range query is performed to identify the true nearest neighbors and discard false alarms, by retrieving objects located in other leaf nodes based on the distance between each object and its candidate nearest neighbor from the first step. An experimental study shows the superiority of this method over other approaches (e.g., PaS, BNN) in terms of I/O cost and CPU time, achieved by avoiding the indexed nested-loop traversal, the MINDIST and MINMAXDIST computations, and the overhead of maintaining a priority heap.
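For reference, the All-NN problem itself is simple to state; the quadratic baseline below (which the paper's leaf-local plane-sweep plus range-query refinement is designed to beat) makes the definition concrete. The sample points are hypothetical.

```python
def all_nearest_neighbors(points):
    """Brute-force All-NN: map each point to its nearest other point (O(n^2))."""
    result = {}
    for i, p in enumerate(points):
        best, best_d = None, float('inf')
        for j, q in enumerate(points):
            if i == j:
                continue
            d = sum((a - b) ** 2 for a, b in zip(p, q))  # squared Euclidean
            if d < best_d:
                best, best_d = q, d
        result[p] = best
    return result
```

The paper's method reaches the same answers while pruning most candidate comparisons via the index.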
Retrieving Deep Web Data Based on Hierarchy Tree Model
Tian Jianwei and Li Shijun
2011, 48(1):  94-102. 
While the Web provides a platform for information search and dissemination, massive information is hidden behind query-restricted Web databases, which makes it difficult to obtain these high-quality data records. Current research on Deep Web search has focused on crawling Deep Web data via Web interfaces with keyword queries. However, these keyword-based methods have inherent limitations because of the multi-attribute and top-k features of the Deep Web, which poses a great challenge for Web information search and retrieval. To address this problem, we propose an approach for siphoning structured data based on a hierarchy tree, which can retrieve all the data in a hidden database without repetition. Firstly, we model the hidden database as a hierarchy tree; under this theoretical framework, data retrieval is transformed into a traversal problem on the tree. Secondly, we propose techniques to narrow the query space and obtain the attribute values by sorting the attributes in ascending order. Thirdly, we leverage mutual information to measure the dependency between attribute values, and based on it we narrow the traversal space by using heuristic rules to guide the traversal process. Finally, we conduct extensive experiments over real Deep Web sites and controlled databases to illustrate the coverage and efficiency of our techniques.
Load Shedding Strategies on Sliding Window Joins over Data Streams
Han Donghong, Gong Pizhen, Xiao Chuan, and Zhou Rui
2011, 48(1):  103-109. 
With the development of data stream applications, data stream management systems (DSMS) bring tremendous challenges to database techniques. As a data stream is continual and time-varying, a DSMS must be adaptive: when the data arrival rate exceeds the system's resource limit, performance degrades or the system may even break down. Load shedding is one of the most promising ways to solve this problem. In this paper, several load shedding techniques over sliding window joins are addressed. Firstly, a dual-window architectural model including aux-windows and join-windows is proposed; the join-windows are used in joining the two streams, while the aux-windows are used to build statistics for estimating the join results. With these statistics, an effective load shedding strategy can produce the maximum subset of the join output. To accelerate the load shedding process, segment trees are utilized to reduce the cost of shedding evaluation. Secondly, when the streams have high arrival rates, front-shedding is combined with rear-shedding, where front-shedding adopts random shedding and rear-shedding adopts semantic shedding. Lastly, extensive experiments with synthetic data and real-life data show that the new load shedding methods dominate existing strategies in terms of join output.
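Semantic shedding for max-subset join output can be sketched as follows: keep the tuples whose join keys are most frequent in the partner stream's statistics, since those tuples contribute the most expected output. This is an illustrative sketch, not the paper's segment-tree implementation; the window contents and counts are hypothetical.

```python
def semantic_shed(window, partner_key_counts, keep_n, key=lambda t: t[0]):
    """Drop the tuples least likely to join: retain the keep_n tuples whose
    join keys appear most often in the partner stream's frequency statistics."""
    ranked = sorted(window,
                    key=lambda t: partner_key_counts.get(key(t), 0),
                    reverse=True)
    return ranked[:keep_n]
```

Random (front-) shedding would instead drop arriving tuples uniformly before they reach the window.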
A Dynamic Threshold Schedule Algorithm for Media Traffic
Wang Xueshun, Yu Shaohua, Xu Ning, and Dai Jinyou
2011, 48(1):  110-117. 
Real-time multimedia applications have become more and more popular in the last several years as Internet capability keeps improving. To guarantee the transmission quality of media traffic, effective buffer management strategies are applied in the switching devices of streaming media networks, such as switches and routers. Considering the delay and loss sensitivity of media packets, this paper proposes a novel dynamic threshold algorithm at the media traffic level for shared-buffer packet switches, called E-model-based dual dynamic threshold control (EDTA). EDTA divides the switch's buffer management into two parts: a global threshold strategy and a queue threshold strategy. The global threshold strategy controls the shared buffer with an overall threshold by judging different traffic scenarios, ensuring good transmission performance in all situations. The queue threshold strategy decides and adjusts each output queue's threshold based on E-model traffic levels, which transform several parameters affecting media service quality into users' psychological factors, thus guaranteeing the quality of streaming media data. Experimental results show that the proposed EDTA algorithm improves traffic performance for streaming media more effectively and achieves better transmission quality than algorithms using traditional control. It is suitable for any network condition and can improve the utilization of network resources.
A MAP Channel Estimation Algorithm for MIMO-OFDM Systems with Better Performance
Xu Peng, Wang Jinkuan, and Qi Feng
2011, 48(1):  118-124. 
Maximum a posteriori (MAP) channel estimation usually uses the expectation maximization (EM) algorithm to reduce its high computational cost. However, this kind of operation has difficulty achieving ideal estimation performance at high signal-to-noise ratio (SNR) because of the convergence behavior of the EM algorithm. In addition, for pilot-based multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) systems, the data transmission efficiency of an OFDM symbol is reduced as the number of transmit antennas increases. To remedy these two drawbacks, firstly, an equivalent signal model is introduced to improve the convergence property of the EM algorithm at high SNR. Then, to enhance data transmission efficiency, joint estimation is implemented by making use of phase-shifted orthogonal pilot sequences over multiple OFDM symbols. What's more, the channel matrix is transformed between the time domain and the angle domain, and the spatial independence of the MIMO channel in the angle-domain channel matrix is used to reduce the effect of noise on the estimation. Performance analysis and simulation results indicate that the proposed algorithm has lower estimation error and higher data transmission efficiency than the raw EM-based MAP algorithm, at the cost of only a slight increase in computational complexity.
A Novel Passive-Landmark Based Network Distance Prediction Method
Wu Guofu, Dou Qiang, Ban Dongsong, Dou Wenhua, and Song Lei
2011, 48(1):  125-132. 
With the guidance of network topology information, the performance of large-scale distributed applications can be enhanced greatly. However, if the topology information between nodes is obtained by direct measurement, the cost of the probing packets may exceed the gain from the performance improvement. This paper proposes a novel passive-landmark-based network distance prediction method, PLNDP. The vector of transmission delays from an ordinary node to the landmarks is embedded into the metric space R^n by the Lipschitz transformation. After obtaining their network coordinates, ordinary nodes use a distance function to compute the distance between coordinates, and the network distances between nodes are then predicted from the distances between their coordinates. Unlike other network coordinate systems, landmarks in PLNDP only need to respond to probes passively, rather than actively measuring distances to other landmarks. Existing high-performance public servers, such as DNS servers and Web servers, can be used as landmarks, so the deployment cost can be reduced greatly. To improve prediction accuracy, valid landmarks and a correctional factor are used in the distance function. Experimental results show that, for several different accuracy metrics, PLNDP is better than the classical network distance prediction methods GNP and Vivaldi, especially when some landmarks have failed.
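In a Lipschitz embedding, a node's coordinate is simply its delay vector to the n landmarks, and one simple distance function consistent with the triangle inequality is the maximum component difference (each landmark L gives |d(a,L) - d(b,L)| <= d(a,b)). This is a generic sketch; PLNDP's actual distance function with valid landmarks and a correctional factor is not reproduced here, and the delay values are hypothetical.

```python
def coordinate(delays_to_landmarks):
    """A node's Lipschitz coordinate is its vector of measured delays to landmarks."""
    return tuple(delays_to_landmarks)

def predicted_distance(x, y):
    """L-infinity bound: by the triangle inequality, |d(a,L) - d(b,L)| <= d(a,b)
    for every landmark L, so the max component difference lower-bounds d(a,b)."""
    return max(abs(a - b) for a, b in zip(x, y))
```

Note that the landmarks never probe anything themselves; each node builds its coordinate from its own measurements.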
Sponsored Search Performance Analysis Based on User Behavior Information
Wang Jiazhuo, Liu Yiqun, Ma Shaoping, and Zhang Min
2011, 48(1):  133-138. 
With the explosive growth of information available on the Web, more and more users adopt search engines to collect information on the Internet. Meanwhile, sponsored search has become one of the most popular forms of Internet advertising because of its effectiveness and feasibility. As one of the predominant business models for Web search engines, sponsored search plays an important part in search engine studies. Sponsored search is popular among small-scale companies because it lowers risks and broadens market channels for them. But many questions remain to be answered in sponsored search research: Do sponsored results provide Web users with a good experience? Are Web search users disturbed by sponsored links on search engine result pages? How do sponsored search results affect the behavior of search users? It is necessary to look into these questions and give advice to search engines and advertisers to make sponsored search more useful and acceptable. Click-through information is extracted from Web access logs collected by a commercial search engine in China. Based on an analysis of this information, click-through behavior on sponsored search results and on ordinary results is compared. Experimental results show that some hypotheses proposed by previous research do not accord with practical Web user behavior patterns. Meanwhile, a number of conclusions are drawn to answer the proposed questions.
Dynamic Radio Map Based Particle Filter for Indoor Wireless Localization
Lin Yiming, Luo Haiyong, Li Jintao, and Zhao Fang
2011, 48(1):  139-146. 
Indoor wireless localization is the basis of various location-based applications. So far, the majority of radio-map-based wireless indoor localization algorithms have adopted a static radio map, in which the signal environment is regarded as stable over time, and have not made good use of the continuous motion information of the target. In this paper, a dynamic-radio-map-based particle filter for wireless indoor localization (DRMPF) algorithm is proposed, which combines the particle filter with radio-map-based positioning technology. By constructing a spatial correlation model (SCM) based dynamic radio map, DRMPF uses reference nodes to capture real-time signal changes in the environment. The contiguous spatial correlation model breaks the limitation of the traditional grid-shaped radio map and converts wireless indoor localization from a classification problem into a regression problem. It also reduces the training cost and the algorithmic complexity of the online localization stage. Extensive experiments demonstrate that the proposed SCM-based radio map model has good generalization ability over time. Compared with the static radio map, the DRMPF algorithm improves the positioning accuracy by about 20%, demonstrating a good ability to adapt to environmental changes.
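A generic particle filter cycle (predict from the motion model, reweight by the observation likelihood, resample) can be sketched as follows. This is a textbook sketch, not DRMPF itself: the motion and observation models passed in below are hypothetical stand-ins for the target's motion information and the dynamic-radio-map signal likelihood.

```python
import math
import random

def particle_filter_step(particles, weights, motion, observe, rng):
    """One predict-update-resample cycle of a generic particle filter.
    motion(p, rng) -> propagated state; observe(p) -> likelihood of the
    current measurement given state p."""
    # Predict: propagate each particle through the motion model.
    particles = [motion(p, rng) for p in particles]
    # Update: reweight each particle by the observation likelihood.
    weights = [w * observe(p) for w, p in zip(weights, particles)]
    total = sum(weights) or 1.0
    weights = [w / total for w in weights]
    # Resample: draw particles in proportion to weight, then reset weights.
    particles = rng.choices(particles, weights=weights, k=len(particles))
    weights = [1.0 / len(particles)] * len(particles)
    return particles, weights
```

The position estimate at each step is typically the weighted mean of the particles.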
A Multi-Level l-Diversity Model for Numerical Sensitive Attributes
Han Jianmin, Yu Juan, Yu Huiqun, and Jia Jiong
2011, 48(1):  147-158. 
Privacy preservation in data publishing has attracted wide concern in the database community recently, and various anonymity models have been proposed for preserving privacy. The l-diversity model is effective for preserving individual privacy when publishing data. However, it is suited to categorical sensitive attributes rather than numerical ones, and it cannot effectively thwart homogeneity attacks and background knowledge attacks on numerical sensitive attributes. To address this problem, a multi-level l-diversity model based on level distance is proposed, especially for numerical sensitive attributes. The main idea of the model is to first divide the numerical sensitive values into several levels, and then realize l-diversity of the sensitive attribute based on these levels and the level distance. Instantiations of the multi-level l-diversity model, such as multi-level distinct l-diversity, multi-level l-entropy diversity, and multi-level recursive (c,l)-diversity, are introduced, and the properties of the model are analyzed. Based on these properties, an l-incognito algorithm is designed to realize multi-level l-diversity. Experiments compare the proposed model with the existing l-diversity model in terms of the diversity of the anonymized tables. The results show that the anonymized data generated by the l-incognito algorithm under the multi-level l-diversity model has higher sensitive-attribute diversity than under the mono-level l-diversity model, so it can resist homogeneity attacks and background knowledge attacks effectively.
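The multi-level distinct instantiation can be sketched as a simple check: bucket each numeric sensitive value into a level, then require at least l distinct levels in every QI-group. The level boundaries and salary-like values below are hypothetical, and the level-distance refinement is not shown.

```python
def level_of(value, boundaries):
    """Map a numeric sensitive value to a level index, given sorted boundaries."""
    for i, b in enumerate(boundaries):
        if value < b:
            return i
    return len(boundaries)

def is_multilevel_l_diverse(group_values, boundaries, l):
    """Multi-level distinct l-diversity: the sensitive values of a QI-group
    must fall into at least l distinct levels."""
    return len({level_of(v, boundaries) for v in group_values}) >= l
```

A publisher would generalize or merge groups until every group passes this check.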
Application Layer Anomaly Detection Based on Application Layer Protocols’ Keyword Sequences
Xie Bailin and Yu Shunzheng
2011, 48(1):  159-168. 
This paper presents an application layer anomaly detection method based on the keyword sequences of application layer protocols. In this method, a hidden semi-Markov model (HSMM) is used to describe the behavior of a normal user of an application layer protocol, and the keywords generated in using the protocol, as well as their inter-arrival times, are used as the observation sequence of the user's behavior. The method is divided into a training phase and a detection phase. In the training phase, the parameters of the HSMM are determined through the forward-backward algorithm for hidden semi-Markov models. In the detection phase, the average log-likelihood of every observation sequence is calculated in real time. If a user's behavior is abnormal while using some application layer protocol, the priority or the bandwidth of the packets belonging to that application is reduced; in this way, the user's anomalous behavior is restricted automatically. An experiment based on several data sets, including the DARPA dataset, is conducted to validate this method. The experimental results show that the model is effective in measuring the behavior of normal users of an application layer protocol, and that the method has high detection accuracy and a low false positive ratio.
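The detection criterion can be illustrated with a first-order Markov chain over protocol keywords in place of the full HSMM (a deliberate simplification that ignores state durations); the transition probabilities and threshold below are hypothetical.

```python
import math

def avg_log_likelihood(seq, trans, init):
    """Average per-symbol log-likelihood of a keyword sequence under a
    first-order Markov model (a simplified stand-in for the paper's HSMM).
    init: P(first keyword); trans: P(next keyword | current keyword)."""
    ll = math.log(init[seq[0]])
    for a, b in zip(seq, seq[1:]):
        ll += math.log(trans[(a, b)])
    return ll / len(seq)

def is_anomalous(seq, trans, init, threshold):
    """Flag the sequence when its average log-likelihood falls below threshold."""
    return avg_log_likelihood(seq, trans, init) < threshold
```

In the real system, a flagged sequence would trigger the priority or bandwidth reduction described above rather than an outright block.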
Highlights Detection for a Single Gray-Scale Image Based on Surface Shape
Ma Jiquan, Ma Peijun, and Su Xiaohong
2011, 48(1):  169-175. 
The existence of specular highlights is a great obstacle to shape-from-shading (SFS). For a single gray-scale image with only intensity information, existing highlight detection methods based on chroma or polarization analysis cannot be applied directly, so a new method using surface shape is provided. Firstly, it makes full use of the imaging process to estimate the surface normals. Secondly, based on a physical illumination model, the diffuse and specular reflection components are calculated by minimizing the brightness error function through simulated annealing, and the highlight areas are then located by setting a threshold. Finally, an illumination-constrained inpainting method based on the assumption of curvature continuity is provided. The surface shape can be obtained by different methods, which affects the number of iterations. The physical illumination model is more complex than the geometrical model, so the computation costs more time for a more accurate estimation. The constrained inpainting starts along the boundary of the specular reflection area, and the inpainting direction can be changed from horizontal to vertical. The experimental results show that the proposed algorithm has good stability on synthetic and real-world images and improves the accuracy of surface recovery for images containing specular highlights.