ISSN 1000-1239 CN 11-1777/TP

Table of Contents

01 June 2017, Volume 54 Issue 6
Satisfaction Prediction of Web Search Users
Liu Yiqun
2017, 54(6):  1133-1143.  doi:10.7544/issn1000-1239.2017.20160804
User satisfaction is one of the prime concerns in Web search studies. Predicting it is a non-trivial task for three major reasons: 1) Traditional approaches to search performance evaluation rely mainly on editorial judgments of the relevance of search results, and the relationship between search satisfaction and relevance-based evaluation remains under-investigated. 2) Most existing research assumes that all results on search result pages (SERPs) are homogeneous, whereas a variety of heterogeneous components have been aggregated into modern SERPs to improve search performance. 3) Most existing studies on satisfaction prediction rely primarily on users’ click-through and query reformulation behaviors, but many search sessions contain no such information. In this paper, we summarize our recent efforts to shed light on these research questions. First, we perform a laboratory study to investigate the relationship between relevance and users’ perceived usefulness and satisfaction. Next, using specifically designed SERPs, we investigate how vertical results of different qualities, presentation styles and positions affect search satisfaction. Finally, inspired by recent studies that predict result relevance from mouse movement patterns, we propose novel strategies to extract high-quality mouse movement patterns from SERPs for satisfaction prediction. Experimental results show that the proposed method outperforms existing approaches in heterogeneous search environments.
Recent Advances in Neural Machine Translation
Liu Yang
2017, 54(6):  1144-1149.  doi:10.7544/issn1000-1239.2017.20160805
Machine translation, which aims to translate automatically between natural languages using computers, is one of the important research directions in artificial intelligence and natural language processing. Recent years have witnessed the rapid development of neural machine translation, which has replaced conventional statistical machine translation as the mainstream technique in both academia and industry. This paper first introduces the basic ideas and state-of-the-art approaches in neural machine translation and then reviews important recent research findings. The paper concludes with a discussion of possible future directions.
A Survey on Sentiment Classification
Chen Long, Guan Ziyu, He Jinhong, Peng Jinye
2017, 54(6):  1150-1170.  doi:10.7544/issn1000-1239.2017.20160807
Sentiment analysis in text is an important research field for intelligent multimedia understanding. The aim of sentiment classification, which is the core of sentiment analysis, is to predict the sentiment polarity of opinionated text. With the rapid growth of online opinionated content, traditional approaches such as lexicon-based methods and classic machine learning methods cannot handle large-scale sentiment classification problems well. In recent years, deep learning has achieved good performance on the intelligent understanding of large-scale text data and has attracted a lot of attention, and more and more researchers have started to address text classification problems with deep learning. This survey is organized into two parts. First, we summarize the traditional approaches, including lexicon-based methods, machine learning based methods, hybrid methods, methods based on weakly labeled data and deep learning based methods. Second, we introduce our proposed weakly-supervised deep learning framework, which addresses the defects of the previous approaches. We also briefly summarize research on the extraction of opinion aspects. Finally, we discuss the challenges and future work in sentiment classification.
Label Enhancement for Label Distribution Learning
Geng Xin, Xu Ning, Shao Ruifeng
2017, 54(6):  1171-1184.  doi:10.7544/issn1000-1239.2017.20170002
Multi-label learning (MLL) deals with the case where each instance is associated with multiple labels. Its target is to learn the mapping from an instance to its relevant label set. Most existing MLL methods adopt the uniform label distribution assumption, i.e., all relevant (positive) labels are equally important for the instance. However, in many real-world learning problems, different relevant labels often differ in importance. To address this issue, label distribution learning (LDL) has achieved good results by modeling the different importance of labels with a label distribution. Unfortunately, many datasets contain only simple logical labels rather than label distributions. One way to solve this problem is to transform the logical labels into label distributions by mining the hidden label importance from the training examples, and then to improve prediction precision via label distribution learning. This process of transforming logical labels into label distributions is defined as label enhancement for label distribution learning. This paper first proposes the concept of label enhancement with a formal definition. Then, existing algorithms that can be used for label enhancement are surveyed and compared experimentally. The experimental results show that label enhancement can effectively discover the differences in label importance hidden in the data and improve the performance of multi-label learning.
Probability Distribution Based Evolutionary Computation Algorithms for Multimodal Optimization
Chen Weineng, Yang Qiang
2017, 54(6):  1185-1197.  doi:10.7544/issn1000-1239.2017.20160891
Evolutionary computation (EC) is a category of algorithms that simulate intelligent evolutionary behavior in nature to solve optimization problems. Because EC algorithms do not rely on the mathematical characteristics of the problem model, they are regarded as an important tool for complex optimization. The estimation of distribution algorithm (EDA) is a new class of EC algorithms, which works by constructing a probability model to estimate the distribution of the predominant individuals in the population and then sampling new individuals from that model. With this probability-based search behavior, EDA is good at maintaining sufficient search diversity and is applicable in both continuous and discrete search spaces. To promote research on probability-based EC (PBEC) algorithms, this paper surveys EC algorithms for multimodal optimization and then builds two PBEC frameworks: one for seeking multiple solutions in multimodal optimization, and one for discrete optimization. The first framework presents a method to combine probability-based evolutionary operators with the niching strategy, so that higher search diversity can be maintained when seeking multiple solutions in multimodal optimization. In particular, the framework interprets PBEC algorithms in a broad sense: it allows both explicit PBEC algorithms (e.g., EDA) and implicit PBEC algorithms (e.g., ant colony optimization) to operate within it, resulting in two representative algorithms, multimodal EDA (MEDA) and adaptive multimodal ant colony optimization (AM-ACO). The second framework aims to extend the applicability of EC algorithms to both continuous and discrete spaces. Since some popular EC algorithms are originally defined on a continuous real vector space and cannot be used directly to solve discrete optimization problems, this framework introduces the idea of probability distribution based evolution and redefines their evolutionary operators on a discrete set space. As a result, the applicability of these algorithms can be significantly improved.
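The EDA loop described in this abstract (estimate a probability model from the better individuals, then sample new individuals from it) can be sketched as follows. This is only a minimal illustration under stated assumptions: a one-dimensional Gaussian model, truncation selection, and a hypothetical objective `sphere`. It is not the MEDA or AM-ACO algorithm from the paper, and it uses no niching.

```python
import random
import statistics

def sphere(x):
    """Hypothetical objective to minimize; its optimum is at x = 3."""
    return (x - 3.0) ** 2

def gaussian_eda(objective, generations=40, pop_size=50, elite=10, seed=0):
    """Minimal Gaussian EDA: sample, select the best, re-estimate the model."""
    rng = random.Random(seed)
    mean, std = 0.0, 5.0                       # initial probability model
    for _ in range(generations):
        # Sample a new population from the current model.
        population = [rng.gauss(mean, std) for _ in range(pop_size)]
        # Keep the predominant (lowest-cost) individuals.
        population.sort(key=objective)
        best = population[:elite]
        # Re-estimate the model from the selected individuals.
        mean = statistics.mean(best)
        std = max(statistics.pstdev(best), 1e-6)   # avoid a zero-width model
    return mean

best_x = gaussian_eda(sphere)
print(f"estimated optimum: {best_x:.4f}")
```

A single Gaussian collapses onto one optimum, which is exactly the limitation that niching-based multimodal variants are meant to address.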
Survey of Database Usability for Query Results
Liu Qing, Gao Yunjun
2017, 54(6):  1198-1212.  doi:10.7544/issn1000-1239.2017.20160806
Database usability has received much attention in the database community. Its goal is to help users work with databases more efficiently and conveniently, thereby improving user satisfaction. In this survey, we focus on database usability for query results. Currently, a query returns only its results to the user. If the results are unexpected, the user is frustrated, yet the database system neither explains the unexpected results nor suggests how to obtain the expected ones. Users can only debug the queries themselves, which is cumbersome and time-consuming. If the database system could offer such explanations and suggestions, it would help users understand the initial query better and learn how to change it until satisfactory results are found, hence improving the usability of the database. To this end, unexpected query results have been studied extensively. In this paper, we provide a comprehensive survey of the most recent research on database usability for query results. The paper first analyzes unexpected query results, introduces the three corresponding problems, i.e., causality & responsibility, why-not & why questions, and why-few & why-many questions, and highlights their importance. Then, the state-of-the-art progress in research on unexpected query results is surveyed and summarized. Finally, the paper raises some directions for future work.
A Survey of Distributed RDF Data Management
Zou Lei, Peng Peng
2017, 54(6):  1213-1224.  doi:10.7544/issn1000-1239.2017.20160908
Recently, RDF (resource description framework) has been widely used to expose, share, and connect pieces of data on the Web, while SPARQL (SPARQL protocol and RDF query language) is a structured query language for accessing RDF repositories. As RDF datasets increase in size, evaluating SPARQL queries over current RDF repositories is beyond the capacity of a single machine. As a result, a high-performance distributed RDF database system is needed to evaluate SPARQL queries efficiently. A large body of work on distributed RDF data management follows different approaches, and in this paper we provide an overview of it. This survey considers three kinds of distributed data management approaches: cloud-based approaches, partitioning-based approaches and federated RDF systems. Simply speaking, cloud-based approaches use existing cloud platforms to manage large RDF datasets; partitioning-based approaches divide an RDF graph into several fragments and place each fragment at a different site in a distributed system; and federated RDF systems do not allow the data to be re-partitioned, since the data is already distributed over autonomous sites. For each kind of approach, further discussion is provided to help readers understand the characteristics of the different methods.
High-Throughput Image and Video Computing
Tang Jinhui, Li Zechao, Liu Shaoli, Qin Lei
2017, 54(6):  1225-1237.  doi:10.7544/issn1000-1239.2017.20170001
In recent years, image and video data have grown and spread rapidly on the Internet. The data is not only huge in volume but also characterized by high concurrency, high dimensionality and high throughput, which poses great challenges for real-time analysis and processing. To improve the efficiency of image and video processing on big data platforms, it is necessary to study high-throughput image and video computing and to propose a series of theories and methods that take new hardware structures into account. To this end, this work first reviews previous high-throughput image and video computing theories and methods in detail, and then discusses the disadvantages of the existing methods. Furthermore, this work analyzes three future research directions for high-throughput image and video computing: computing theories, image and video analysis methods, and video coding methods. Finally, this work introduces three key scientific problems of high-throughput image and video computing. Solving these problems will provide key technical support for content monitoring of Internet images and videos, large-scale video surveillance, and image and video search.
Video Copy Detection Method: A Review
Gu Jiawei, Zhao Ruiwei, Jiang Yugang
2017, 54(6):  1238-1250.  doi:10.7544/issn1000-1239.2017.20170003
Currently, a large number of copied videos exist on the Internet. To identify these videos, researchers have long studied video copy detection methods. In recent years, a few new video copy detection algorithms have been proposed with the introduction of deep learning. In this article, we review the existing representative video copy detection methods. We introduce the general framework of a video copy detection system as well as the various implementation choices for its components, including feature extraction, indexing, feature matching and time alignment. The approaches discussed include the latest deep learning based methods, mainly the application of deep convolutional neural networks and Siamese convolutional neural networks to video copy detection. Furthermore, we summarize the evaluation criteria used in video copy detection and discuss the performance of some representative methods on five popular datasets. Finally, we envision future directions on this topic.
The Semantic Knowledge Embedded Deep Representation Learning and Its Applications on Visual Understanding
Zhang Ruimao, Peng Jiefeng, Wu Yang, Lin Liang
2017, 54(6):  1251-1266.  doi:10.7544/issn1000-1239.2017.20171064
With the rapid development of deep learning techniques and large-scale visual datasets, traditional computer vision tasks have achieved unprecedented improvement. To handle increasingly complex vision tasks, how to integrate domain knowledge into deep neural networks and enhance the ability of deep models to represent visual patterns has become a widely discussed topic in both academia and industry. This thesis explores effective deep models that combine semantic knowledge with feature learning. The main contributions can be summarized as follows: 1) We integrate the semantic similarity of visual data into the deep feature learning process, and propose a deep similarity comparison model named bit-scalable deep hashing to address the issue of visual similarity comparison. The model achieves strong performance on image search and person identification. 2) We propose a high-order graph LSTM (HG-LSTM) network to solve the problem of geometric attribute analysis, which integrates multiple semantic contexts into the feature learning process. Extensive experiments show that our model is capable of predicting rich scene geometric attributes and outperforms several state-of-the-art methods by large margins. 3) We integrate the structured semantic information of visual data into the feature learning process, and propose a novel deep architecture to investigate a fundamental problem of scene understanding: how to parse a scene image into a structured configuration. Extensive experiments show that our model is capable of producing meaningful and structured scene configurations, and achieves more favorable scene labeling results on two challenging datasets than other state-of-the-art weakly-supervised deep learning methods.
Query and Feedback Technologies in Multimedia Information Retrieval
Zha Zhengjun, Zheng Xiaoju
2017, 54(6):  1267-1280.  doi:10.7544/issn1000-1239.2017.20170004
In spite of the remarkable progress made in the past decades, multimedia information retrieval still suffers from the “intention gap” and the “semantic gap”. To address these gaps, researchers have proposed a wealth of query technologies that help users express search intent clearly, as well as feedback technologies that help retrieval systems understand user intent and multimedia data accurately, leading to significant improvements in retrieval performance. This paper presents a survey of the query and feedback technologies in multimedia information retrieval. We summarize the evolution of query styles and the development of feedback approaches. We elaborate on the query approaches for retrieval on PCs, mobile intelligent devices, touch-screen devices and other platforms. We introduce the feedback approaches proposed in different periods and discuss the interaction issue in exploratory multimedia retrieval. Finally, we discuss future research directions in this field.
The Construction, Analysis, and Applications of Dynamic Protein-Protein Interaction Networks
Li Min, Meng Xiangmao
2017, 54(6):  1281-1299.  doi:10.7544/issn1000-1239.2017.20160902
The rapid development of proteomics and high-throughput technologies has produced a large amount of protein-protein interaction (PPI) data, which provides a foundation for further understanding the interactions between proteins and the biomedical mechanisms of complex diseases. In an organism, a protein-protein interaction network (PIN) consists of all the proteins and their interactions. Most traditional studies on PINs are based on static networks. However, due to the dynamics of protein expression and of PPIs, real PINs change with time and conditions. Protein function modules related to the occurrence and development of diseases are also bound up with this dynamic change. Researchers have therefore shifted their attention from static properties to dynamic properties, and have proposed a series of methods for the construction of dynamic PINs. This paper reviews the construction, analysis and applications of dynamic PINs. First, the existing dynamic PIN construction methods are discussed in three categories: methods based on dynamic protein expression, methods based on multi-state expression and correlation changes, and methods based on spatial-temporal dynamic changes. The first category captures how protein expression varies with time; the second reflects changes in the expression-related relationships between proteins under different conditions; and the third describes the dynamics of proteins and their interactions in time and space. Then, the dynamic analysis of the proteins and the related subnetworks of dynamic PINs is reviewed. Furthermore, the main applications of dynamic PINs to complex diseases are discussed in detail, such as the identification of protein complexes and functional modules, the detection of biomarkers, and the prediction of disease genes. Finally, the challenges and future research directions of dynamic PINs are discussed.
Realtime Capture of High-Speed Traffic on Multi-Core Platform
Ling Ruilin, Li Junfeng, Li Dan
2017, 54(6):  1300-1313.  doi:10.7544/issn1000-1239.2017.20160823
With the development of Internet applications and the increase in network bandwidth, security issues have become increasingly serious. In addition to the spread of viruses, spam and DDoS attacks, many well-hidden attack methods have emerged. Network probe tools, which are deployed as bypass devices at the gateway of an intranet, can collect and analyze all the traffic of the current network. The most important module of a network probe is packet capture. The Linux network protocol stack has many performance bottlenecks in packet processing and cannot meet the demands of a high-speed network environment. In this paper, we introduce several new packet capture engines based on zero-copy and multi-core technology. Further, we design and implement a scalable high-performance packet capture framework based on Intel DPDK, which uses RSS (receive-side scaling) to parallelize packet capture and allows the packet processing to be customized. Additionally, this paper discusses more effective and fairer hash functions by which data packets can be delivered to different receive queues. The evaluation shows that the system can capture and process packets at nearly line speed and balance the load across CPU cores.
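The idea of hashing packets to receive queues can be illustrated with a small sketch. It assumes a CRC32 over the flow 5-tuple as a stand-in hash (RSS hardware typically uses a Toeplitz hash instead), and `pick_queue` and the synthetic flows are hypothetical names and data, not part of DPDK or of the paper's framework.

```python
import zlib
from collections import Counter

def pick_queue(src_ip, dst_ip, src_port, dst_port, proto, num_queues):
    """Map a flow 5-tuple to a receive queue index.

    All packets of one flow hash to the same queue, so a flow is always
    handled by the same core. CRC32 is an assumption used for illustration.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return zlib.crc32(key) % num_queues

# Synthetic flows (hypothetical addresses and ports), all TCP (protocol 6).
flows = [(f"10.0.0.{i % 250 + 1}", "192.168.1.1", 1024 + i, 80, 6)
         for i in range(4000)]

# Count how many flows land on each of four queues to inspect load balance.
load = Counter(pick_queue(*flow, num_queues=4) for flow in flows)
for queue in sorted(load):
    print(f"queue {queue}: {load[queue]} flows")
```

How evenly the counts spread across queues is a rough proxy for the fairness of the hash function on a given traffic mix.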
Real-Time Panoramic Video Stitching Based on GPU Acceleration Using Local ORB Feature Extraction
Du Chengyao, Yuan Jingling, Chen Mincheng, Li Tao
2017, 54(6):  1316-1325.  doi:10.7544/issn1000-1239.2017.20170095
Panoramic video is video captured from a single viewpoint that records the full surrounding scene. Panoramic video capture devices are attracting widespread attention with the development of VR and live video broadcasting technology. However, producing panoramic video requires strong CPU and GPU processing capabilities. Traditional panoramic products depend on large equipment or post-processing, which results in high power consumption, low stability, unsatisfactory real-time performance and risks to information security. This paper proposes an L-ORB feature detection algorithm. The algorithm optimizes the feature detection regions of the video images and simplifies the ORB algorithm's support for scale and rotation invariance. The feature points are then matched by the multi-probe LSH algorithm, and progressive sample consensus (PROSAC) is used to eliminate false matches. Finally, we obtain the mapping relation for image mosaicking and use the multi-band fusion algorithm to eliminate the seams between the videos. In addition, we build a real-time multi-camera panoramic stitching system on the Nvidia Jetson TX1 heterogeneous embedded platform, which integrates an ARM A57 CPU and a Maxwell GPU, leveraging its teraflops-level floating-point computing power and built-in video capture, storage, and wireless transmission modules. GPU block-, thread-, and stream-level parallel strategies are used to speed up the image stitching algorithm. The experimental results show that the proposed algorithm improves performance in the feature extraction and matching stages of image stitching: its running speed is 11 times that of the traditional ORB algorithm and 639 times that of the traditional SIFT algorithm. The performance of the resulting system is 59 times that of the previous embedded system, while the power dissipation is reduced to 10 W.
A Key-Value Database Optimization Method Based on Raw Flash Device
Qin Xiongjun, Zhang Jiacheng, Lu Youyou, Shu Jiwu
2017, 54(6):  1326-1336.  doi:10.7544/issn1000-1239.2017.20170092
In recent years, NoSQL key-value databases have been widely used. However, the current mainstream key-value databases are built either on disk or on a traditional file system and flash translation layer, which makes it difficult to exploit the characteristics of flash devices and limits their I/O concurrency. Moreover, the garbage collection process under this kind of architecture is complex. This paper designs and implements Flashkv, a key-value data management architecture based on raw flash devices. Flashkv does not use a file system or a flash translation layer; instead, its space management and garbage collection are performed by a management unit in user mode. Flashkv makes full use of the internal concurrency of the flash device, simplifies the garbage collection process, removes the redundant function modules that exist in both the traditional file system and the flash translation layer, and shortens the I/O path. This paper proposes an I/O scheduling technique based on the characteristics of flash memory, which reduces the read and write latency of flash memory and improves throughput. It also proposes a user-mode cache management technique, which reduces the amount of data written as well as the cost of frequent system calls. Test results show that Flashkv’s performance is 1.9 to 2.2 times that of LevelDB and that the amount of data written is reduced by 60% to 65%.
A Quantitative Analysis on the “Approximatability” of Machine Learning Algorithms
Jiang Shuhao, Yan Guihai, Li Jiajun, Lu Wenyan, Li Xiaowei
2017, 54(6):  1337-1347.  doi:10.7544/issn1000-1239.2017.20170086
Recently, machine learning algorithms such as neural networks have made great progress and are widely used in image recognition, data search and financial analysis. The energy consumption of machine learning algorithms becomes critical as problems grow more complex and data dimensionality increases. Because machine learning algorithms are inherently error-resilient, many researchers apply approximate computing techniques, which trade accuracy of results for energy savings, to reduce the energy consumption of these algorithms. We observe that most work is dedicated to leveraging the error resilience of a particular algorithm while ignoring how error resilience differs among algorithms. Understanding the differences in “approximatability” among algorithms is essential because, when approximate computing techniques are applied, approximatability can help a classification task choose the algorithm that achieves the greatest energy savings. Therefore, we choose three common supervised learning algorithms, namely SVM, random forest (RF) and neural network (NN), and evaluate their approximatability with respect to different kinds of energy consumption. We also design several metrics, such as memory storage contamination sensitivity, memory access contamination sensitivity and energy diversity, to quantify the differences in approximatability among learning algorithms. The conclusions of the evaluation will assist in choosing appropriate learning algorithms when classification applications apply approximate computing techniques.
A Comparison Among Different Numeric Representations in Deep Convolution Neural Networks
Wang Peiqi, Gao Yuan, Liu Zhenyu, Wang Haixia, Wang Dongsheng
2017, 54(6):  1348-1356.  doi:10.7544/issn1000-1239.2017.20170098
Deep convolution neural networks have been widely used in industry as well as academia because of their outstanding performance. Network structures are becoming deeper and more complex, which demands substantial computation and memory resources. Customized hardware is an appropriate and feasible option that helps maintain high performance at lower energy consumption. Furthermore, customized hardware can be adopted in special situations where a CPU or GPU cannot be deployed. During hardware design, we need to decide which type of numeric representation to use and at what precision. In this article, we focus on two typical numeric representations, fixed-point and floating-point, and propose corresponding error models. Using these models, we theoretically analyze the influence of different types of data representation on the hardware overhead of neural networks. Notably, floating-point has clear advantages over fixed-point under ordinary circumstances. We verify through experiments that floating-point numbers limited to a certain precision are superior in both hardware area and power consumption. Moreover, by exploiting the features of the floating-point representation, our customized hardware implementation of convolution computation reduces power and area by 14.1× and 4.38×, respectively.
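The difference between the two representations can be made concrete with a short sketch. It is not the paper's error model: it only shows the standard property that rounding to fixed-point bounds the absolute error, while rounding to a floating-point mantissa bounds the relative error. The helper names `to_fixed` and `to_float` and the 8-bit widths are assumptions chosen for illustration.

```python
import math

def to_fixed(x, frac_bits):
    """Round x to a fixed-point grid with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    return round(x * scale) / scale

def to_float(x, mantissa_bits):
    """Round x to a floating-point value with a `mantissa_bits`-bit mantissa."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)                 # x = m * 2**e with 0.5 <= |m| < 1
    scale = 1 << mantissa_bits
    return math.ldexp(round(m * scale) / scale, e)

values = [0.001, 0.03, 0.5, 7.25, 120.0]
for v in values:
    fx, fl = to_fixed(v, 8), to_float(v, 8)
    print(f"{v:>8}: fixed={fx:<12} rel.err={abs(fx - v) / v:.4f}   "
          f"float={fl:<22} rel.err={abs(fl - v) / v:.4f}")
```

With 8 fractional bits, a small value such as 0.001 rounds to 0 in fixed-point, whereas the 8-bit mantissa keeps its relative error below 2^-8. This is why the dynamic range of the data matters when choosing a representation.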
Increasing PCM Lifetime by Using Pipelined Pseudo-Random Encoding Algorithm
Gao Peng, Wang Dongsheng, Wang Haixia
2017, 54(6):  1357-1366.  doi:10.7544/issn1000-1239.2017.20170065
Phase change memory (PCM) is a promising technology due to its low static power, non-volatility, and density potential. However, its low endurance remains the key problem to be solved before it can be widely used in practice. Generally, minimizing the number of modified bits in a write operation, by writing only the bits that differ, is an effective way to extend the lifetime of PCM. However, it is still challenging to reach the minimum without significantly slowing down read/write operations. To this end, we propose FEBRE, a fast and efficient bit-flipping reduction technique to extend PCM lifetime. The key idea of our method is to apply a novel one-to-many parallel mapping before the differential write stage. Specifically, FEBRE employs a new data encoding method to generate multiple highly random encoded vectors from one data item to be written, which increases the chance of finding a vector close to the stored data. The other contribution of our technique is a pipelined pseudo-random encoding algorithm (PPREA), which reduces write overhead by accelerating the one-to-many mapping. Experiments show that, compared with PRES, our technique reduces bit flips by 5.31% on average and improves the encoding and decoding speed by 2.29x and 45%, respectively.
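The general idea of a one-to-many mapping followed by a differential write can be sketched as below. This is an illustration under stated assumptions, not FEBRE or PPREA: candidates are produced by XOR-ing the data with pseudo-random masks, the candidate nearest in Hamming distance to the stored word is written, and the mask index is kept as a tag whose own storage cost is ignored here. `WORD_BITS`, `make_masks`, `encode` and `decode` are hypothetical names.

```python
import random

WORD_BITS = 32  # assumed word width for the sketch

def make_masks(count, seed=1):
    """Pseudo-random XOR masks; mask 0 is the identity mapping."""
    rng = random.Random(seed)
    return [0] + [rng.getrandbits(WORD_BITS) for _ in range(count - 1)]

def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

def encode(new_data, stored, masks):
    """Pick the candidate encoding that flips the fewest stored bits."""
    flips, index = min((hamming(new_data ^ m, stored), i)
                       for i, m in enumerate(masks))
    return index, new_data ^ masks[index], flips

def decode(index, cell, masks):
    """Recover the original data from the stored word and its tag."""
    return cell ^ masks[index]

masks = make_masks(8)
stored, new_data = 0x0F0F0F0F, 0xF0F0A5A5
index, cell, flips = encode(new_data, stored, masks)
print(f"plain differential write flips {hamming(new_data, stored)} bits, "
      f"best of {len(masks)} candidates flips {flips}")
```

Because the identity mask is always among the candidates, the chosen encoding never flips more data bits than a plain differential write would.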
A Memristor-Based Processing-in-Memory Architecture for Deep Convolutional Neural Networks Approximate Computation
Li Chuxi, Fan Xiaoya, Zhao Changhe, Zhang Shengbing, Wang Danghui, An Jianfeng, Zhang Meng
2017, 54(6):  1367-1380.  doi:10.7544/issn1000-1239.2017.20170099
The memristor is one of the most promising candidates for building processing-in-memory (PIM) structures. Memristor-based PIM with digital or multi-level memristors has been proposed for neuromorphic computing. The frequent AD/DA conversion and intermediate memory that these structures require lead to significant energy and area overhead. To address this issue, this work proposes a memristor-based PIM architecture for deep convolutional neural networks (CNNs). It exploits an analog architecture to eliminate data conversion in the neuron layer banks, each of which consists of two special modules named weight sub-arrays (WSAs) and accumulate sub-arrays (ASAs). The partial sums of neuron inputs are generated concurrently in the WSAs and written continuously into the ASAs, where the final results are computed. The noise in the proposed analog architecture is analyzed quantitatively at both the model and circuit levels, and a combined solution is presented to suppress it: the non-linear distortion of weights is calibrated with a corrective function, the write module is pre-charged to reduce parasitic effects, and the remaining noise is eliminated with a modified noise-aware training method. The proposed design has been evaluated on various neural network benchmarks, and the results show that, compared with digital solutions, both energy efficiency and performance can be improved by about 90% on specific neural networks without loss of accuracy.
Design of RDD Persistence Method in Spark for SSDs
Lu Kezhong, Zhu Jinbin, Li Zhengmin, Sui Xiufeng
2017, 54(6):  1381-1390.  doi:10.7544/issn1000-1239.2017.20170108
Hybrid storage systems that combine SSDs (solid-state drives) and HDDs (hard disk drives) have been widely used in big data computing datacenters. Workloads should be able to persist data with different characteristics to SSD or HDD on demand to improve the overall performance of the system. Spark is an efficient data computing framework used across the industry, especially for applications with multiple iterations. The reason is that Spark can persist data in memory or on hard disk, and persisting data to hard disk removes the limit that insufficient memory places on the size of the data set. However, the current Spark implementation does not provide an explicit SSD-oriented persistence interface. Although data can be distributed proportionally to different storage media based on configuration information, the user cannot specify an RDD’s persistence location according to the data characteristics, so persistence lacks specificity and flexibility. This has not only become a bottleneck for further improving the performance of Spark, but also seriously limits the performance that hybrid storage systems can deliver. To the best of our knowledge, this paper presents the first data persistence strategy for SSDs. We explore the data persistence principle in Spark and optimize the architecture for hybrid storage systems. As a result, users can specify an RDD’s storage medium explicitly and flexibly through the persistence API we provide. Experimental results based on SparkBench show that performance can be improved by an average of 14.02%.
Design and Implementation of Positive and Negative Discriminator of MSD Data for Ternary Optical Processor
Zhang Honglie, Zhou Jian, Zhang Sulan, Liu Yanju, Wang Xianchao
2017, 54(6):  1391-1404.  doi:10.7544/issn1000-1239.2017.20170093
A discriminator that determines whether a number is positive, negative or zero is a key component for comparing the magnitudes of data in a computer. With the advent of the MSD (modified signed-digit) parallel adder, which uses three-state optical signals to express numbers in the ternary optical processor, research on a positive/negative/zero discriminator for MSD numbers has become an important step toward perfecting the ternary optical processor. Based on the characteristics of MSD data and the correspondence between optical signals and MSD digits, this paper proposes a method to determine whether a multi-digit MSD number is positive, negative or zero by directly analyzing the group of three-state optical signals that express it. Applying this method to the result of subtracting two MSD numbers makes it possible to compare their magnitudes. Based on this theory, the paper establishes a structure for the MSD data discriminator, which is made of polarizers, liquid crystals and half-mirrors. With an FPGA as the control circuit, a 3-bit MSD data discriminator is realized. Experiments demonstrate the validity of the discriminator, the correctness of the underlying theory and the feasibility of the structural design.
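A software analogue of sign discrimination on MSD data can be sketched as follows. It assumes the usual radix-2 MSD encoding with digits in {-1, 0, 1}, where the sign of a number equals the sign of its most significant nonzero digit (a digit at position k contributes ±2^k, while all lower digits together contribute at most 2^k - 1 in magnitude). Whether the paper's optical discriminator inspects the signals in exactly this way is an assumption; `msd_value` and `msd_sign` are hypothetical helper names.

```python
from itertools import product

def msd_value(digits):
    """Integer value of a radix-2 MSD number, most significant digit first."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value

def msd_sign(digits):
    """Return 1, -1 or 0 by scanning for the first nonzero digit."""
    for d in digits:
        if d != 0:
            return d
    return 0

# Exhaustively compare the digit scan with the true sign for 3-digit data.
for digits in product((-1, 0, 1), repeat=3):
    value = msd_value(digits)
    true_sign = (value > 0) - (value < 0)
    print(digits, value, msd_sign(digits), true_sign)
```

Applied to the digits of a difference a - b, the same scan tells which of the two numbers is larger.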