2023 Vol. 60 No. 9
We explore the emerging challenges presented by artificial intelligence (AI) development in the era of big models, with a focus on large language models (LLMs) and ethical value alignment. Big models have greatly advanced AI’s ability to understand, generate, and manipulate information and content, enabling numerous applications. However, as these models become increasingly integrated into everyday life, their inherent ethical values and potential biases pose unforeseen risks to society. We provide an overview of the risks and challenges associated with big models, survey existing AI ethics guidelines, and examine the ethical implications arising from the limitations of these models. Taking a normative ethics perspective, we propose a reassessment of recent normative guidelines, highlighting the importance of collaborative efforts in academia to establish a unified and universal AI ethics framework. Furthermore, we investigate the ethical inclinations of current mainstream large language models using moral foundation theory, analyze existing big model alignment algorithms, and outline the unique challenges encountered in aligning moral values within them. To address these challenges, we introduce a novel conceptual paradigm for ethically aligning the values of big models and discuss promising research directions for alignment criteria, evaluation, and methods, representing an initial step towards the interdisciplinary construction of a morally aligned general artificial intelligence.
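As a rough illustration of how an ethical-inclination probe based on moral foundation theory might be organized, the sketch below averages a model’s agreement ratings over questionnaire-style items grouped by foundation. The query_model stub, the item texts, and the five-foundation grouping are illustrative assumptions, not the survey protocol used in the paper.

```python
# Toy probe of a language model's moral-foundation profile.
# query_model is a hypothetical stand-in for an actual LLM API call;
# here it returns a fixed neutral score so the script runs end to end.
from statistics import mean

FOUNDATION_ITEMS = {
    "care":      ["Compassion for those who are suffering is a crucial virtue."],
    "fairness":  ["Justice is the most important requirement for a society."],
    "loyalty":   ["People should be loyal to their family members."],
    "authority": ["Respect for authority is something all children need to learn."],
    "sanctity":  ["People should not do things that are disgusting, even if no one is harmed."],
}

def query_model(statement: str) -> int:
    """Hypothetical call: ask the model to rate agreement on a 1-5 Likert scale."""
    # In a real probe this would prompt the LLM and parse its numeric answer.
    return 3  # placeholder neutral rating

def moral_foundation_profile() -> dict:
    """Average the model's agreement ratings within each foundation."""
    return {f: mean(query_model(s) for s in items)
            for f, items in FOUNDATION_ITEMS.items()}

if __name__ == "__main__":
    for foundation, score in moral_foundation_profile().items():
        print(f"{foundation:>9}: {score:.2f}")
```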
Neuromorphic hardware is a specialized computer system designed for running spiking neural network (SNN) applications. With the increasing scale of hardware resources and the need to run numerous SNN applications concurrently, efficiently allocating neuromorphic hardware resources to SNN applications has become a significant challenge. We propose a resource allocation process for a neural computer operating system that maximizes the decoupling of resource allocation from the compiler. We allocate hardware resources and the corresponding input-output routing for SNN applications only when they are loaded onto the neuromorphic hardware. Furthermore, we introduce the innovative maximum empty rectangle (MER) algorithm to address the management and dynamic allocation of neuromorphic hardware resources. Additionally, we present a resource allocation algorithm that minimizes the communication cost of spike-based input-output in SNNs, aiming to reduce energy consumption, latency, and resource fragmentation. Experimental results demonstrate that our algorithm outperforms existing approaches in terms of energy consumption, latency, and fragmentation rate.
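To make the MER idea concrete, here is a minimal sketch of maximal-empty-rectangle bookkeeping on a 2D core grid: free space is kept as a set of maximal empty rectangles, each request is placed in the best-fitting one, and overlapped rectangles are split and pruned. The best-fit rule and corner placement are assumptions for illustration; this is not the allocation or input-output routing policy of the paper.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)

class MERAllocator:
    """Keeps the free area of a core grid as a set of maximal empty rectangles."""

    def __init__(self, width: int, height: int):
        self.free: List[Rect] = [(0, 0, width, height)]

    def allocate(self, w: int, h: int) -> Optional[Rect]:
        """Place a w x h request at the corner of the best-fitting empty rectangle."""
        fits = [r for r in self.free if r[2] >= w and r[3] >= h]
        if not fits:
            return None
        x, y, _, _ = min(fits, key=lambda r: r[2] * r[3])   # best-fit by area
        placed = (x, y, w, h)
        new_free: List[Rect] = []
        for fx, fy, fw, fh in self.free:
            if x >= fx + fw or x + w <= fx or y >= fy + fh or y + h <= fy:
                new_free.append((fx, fy, fw, fh))            # no overlap: keep as-is
                continue
            # split the overlapped rectangle into up to four maximal remainders
            if x > fx:
                new_free.append((fx, fy, x - fx, fh))
            if x + w < fx + fw:
                new_free.append((x + w, fy, fx + fw - (x + w), fh))
            if y > fy:
                new_free.append((fx, fy, fw, y - fy))
            if y + h < fy + fh:
                new_free.append((fx, y + h, fw, fy + fh - (y + h)))
        # keep only maximal rectangles (drop any contained in another)
        uniq = list(set(new_free))
        self.free = [a for i, a in enumerate(uniq)
                     if not any(i != j and self._contains(b, a) for j, b in enumerate(uniq))]
        return placed

    @staticmethod
    def _contains(outer: Rect, inner: Rect) -> bool:
        return (outer[0] <= inner[0] and outer[1] <= inner[1] and
                outer[0] + outer[2] >= inner[0] + inner[2] and
                outer[1] + outer[3] >= inner[1] + inner[3])

if __name__ == "__main__":
    grid = MERAllocator(16, 16)
    print(grid.allocate(4, 6), grid.allocate(8, 3))
```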
In distributed memory systems, caching is an effective way to reduce the latency of remote memory accesses. However, a single cache consistency mechanism often fails to efficiently adapt to the access behaviors of various workloads. We propose a hybrid and reconfigurable cache consistency mechanism for distributed heterogeneous memory pool systems, which combines the advantages of both directory-based and broadcast-based cache consistency mechanisms. We use a four-quadrant matrix analysis to analyze the access pattern of each object and then adopt the most efficient cache consistency mechanism. Moreover, the adopted cache consistency mechanism can be dynamically switched to another mechanism as the memory access pattern changes. Experimental results show that the reconfigurable hybrid cache consistency mechanism can improve the read and write performance of distributed heterogeneous memory pool systems by 32.31% and 31.20% on average, respectively, compared with a single cache consistency mechanism. Moreover, the hybrid cache consistency mechanism shows good scalability as the number of clients increases.
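The following sketch illustrates, under assumed axes and thresholds, how a four-quadrant classification of per-object access patterns could drive the choice between directory-based and broadcast-based consistency and be re-evaluated as counters change; the actual quadrant definitions and switching policy of the proposed mechanism are not reproduced here.

```python
# Illustrative sketch (not the paper's exact policy): classify each object's access
# pattern on two assumed axes, read intensity and write intensity, and pick a
# cache consistency mechanism per quadrant, re-evaluating as counters evolve.
from dataclasses import dataclass

READ_THRESHOLD = 100    # assumed per-epoch thresholds
WRITE_THRESHOLD = 100

@dataclass
class ObjectStats:
    reads: int = 0
    writes: int = 0
    mechanism: str = "directory"   # current consistency mechanism

def choose_mechanism(stats: ObjectStats) -> str:
    """Four-quadrant decision: broadcast suits read-heavy sharing,
    directory suits write-heavy or mixed access."""
    read_hot = stats.reads >= READ_THRESHOLD
    write_hot = stats.writes >= WRITE_THRESHOLD
    if read_hot and not write_hot:
        return "broadcast"      # many readers, few invalidations: broadcast is cheap
    if write_hot and not read_hot:
        return "directory"      # frequent writes: track sharers precisely
    if read_hot and write_hot:
        return "directory"      # contended object: avoid broadcast storms
    return "directory"          # cold object: default mechanism

def end_of_epoch(objects: dict) -> None:
    """Reconfigure mechanisms based on the latest access pattern, then reset counters."""
    for stats in objects.values():
        stats.mechanism = choose_mechanism(stats)
        stats.reads = stats.writes = 0

if __name__ == "__main__":
    objs = {"A": ObjectStats(reads=500, writes=2), "B": ObjectStats(reads=30, writes=400)}
    end_of_epoch(objs)
    print({k: v.mechanism for k, v in objs.items()})
```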
Sparse matrix-vector multiplication (SpMV) is widely applied in scientific computing, industrial simulation, and intelligent computing, and is a critical kernel in all of these applications. SpMV is usually computed iteratively to meet the requirements of precise numerical simulation, linear algebra solvers, and graph analytics. However, due to the poor data locality, low cache utilization, and extremely irregular computation patterns caused by highly sparse and random nonzero distributions, SpMV optimization has become one of the most challenging problems for modern high-performance processors. In this paper, we study the bottlenecks of SpMV on current out-of-order CPUs and propose to improve its performance by pursuing high predictability and low program complexity. Specifically, we improve memory access regularity and locality by creating serialized access patterns, so that data prefetching efficiency and cache utilization are optimized. We also improve pipeline efficiency by creating regular branch patterns that make branch prediction more accurate. Meanwhile, we flexibly leverage SIMD instructions to optimize parallel execution and fully utilize the CPU’s computational resources. Experimental results show that with the above optimizations, our SpMV kernel significantly alleviates the critical bottlenecks and improves the efficiency of the CPU pipeline, cache, and memory bandwidth usage. The resulting performance achieves an average speedup of 2.6 times over Intel’s commercial MKL library and an average speedup of 1.3 times over the best existing SpMV algorithm.
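For reference, a plain CSR SpMV kernel is sketched below to make the irregular access pattern concrete; the paper’s serialized access patterns, regular branch shaping, and SIMD optimizations are architecture-level techniques that are not reproduced in this simple baseline.

```python
# Baseline CSR sparse matrix-vector multiply. values/col_idx are streamed
# sequentially (regular access), but x[col_idx[k]] is a data-dependent gather,
# the main source of cache and prefetch inefficiency discussed above.
import numpy as np

def spmv_csr(row_ptr, col_idx, values, x):
    y = np.zeros(len(row_ptr) - 1, dtype=values.dtype)
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):    # contiguous sweep over nonzeros
            acc += values[k] * x[col_idx[k]]           # irregular gather from x
        y[i] = acc
    return y

if __name__ == "__main__":
    # 3x3 example:  [[4, 0, 1], [0, 2, 0], [3, 0, 5]]
    row_ptr = np.array([0, 2, 3, 5])
    col_idx = np.array([0, 2, 1, 0, 2])
    values  = np.array([4.0, 1.0, 2.0, 3.0, 5.0])
    x = np.array([1.0, 2.0, 3.0])
    print(spmv_csr(row_ptr, col_idx, values, x))   # expected [ 7.  4. 18.]
```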
Driven by globalization, the world economy is undergoing a new evolution that favors cooperation among multiple potentially competing entities over monopolization. JointCloud has been envisioned as a promising approach to enhance cross-cloud cooperation and exploit price heterogeneity to reduce cloud computing costs. Considering the data produced during the operation of the JointCloud environment, users’ data and the results of their different tasks can also be traded or processed. We consider a kind of hybrid resource in the JointCloud market, where not only traditional cloud resources but also data resources are considered. Depending on privacy and data ownership, data resources can be processed by other clouds to generate additional profit. Under this circumstance, we investigate the problem of resource management in the JointCloud environment, wherein multiple clouds sell their resources to other clouds and consumers through a wholesale market. As resources can be sold to either clouds or consumers, we formulate the JointCloud resource market as a supply chain network competition model with multiple competing manufacturers and retailers offering resources. A market game is established to model the competition among clouds. We theoretically prove that the formulated market game has a Nash equilibrium. We also analyze the profits of manufacturers and retailers when a new cloud enters the JointCloud. The analysis explains the reason for the emergence of JointCloud.
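As a toy illustration of how an equilibrium of such a market game can be computed numerically, the sketch below runs best-response iteration for a simple Cournot-style quantity competition with linear inverse demand; the demand and cost model is an assumption and stands in for, rather than reproduces, the paper’s supply chain network model.

```python
# Toy Cournot-style quantity competition among clouds selling an interchangeable
# resource: inverse demand p(Q) = a - b*Q, constant unit costs. Sequential
# best-response iteration converges to the Nash equilibrium of this toy model.
def best_response(total_others: float, a: float, b: float, cost: float) -> float:
    """pi_i = (a - b*(q_i + Q_others) - c_i) * q_i  is maximized at  (a - c_i - b*Q_others) / (2b)."""
    return max(0.0, (a - cost - b * total_others) / (2 * b))

def nash_equilibrium(costs, a=100.0, b=1.0, iters=500):
    q = [0.0] * len(costs)
    for _ in range(iters):                       # sequential (Gauss-Seidel) best responses
        for i, c in enumerate(costs):
            q[i] = best_response(sum(q) - q[i], a, b, c)
    return q

if __name__ == "__main__":
    before = nash_equilibrium([10.0, 12.0])            # two incumbent clouds
    after = nash_equilibrium([10.0, 12.0, 11.0])       # a third cloud enters the market
    print("equilibrium quantities before entry:", [round(x, 2) for x in before])
    print("equilibrium quantities after entry: ", [round(x, 2) for x in after])
```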
Serverless computing is an emerging function-centric cloud computing paradigm. It exposes a high-level function abstraction for users to write and deploy applications on cloud computing platforms. Serverless computing allocates resources at the granularity of functions, and function scheduling is critical to function performance. Scheduling faces two difficulties: a large problem space and high dynamism. Existing schedulers for serverless computing use a first-come-first-serve (FCFS) algorithm, which suffers from head-of-line blocking and results in long function completion times. To utilize system resources effectively and reduce function completion time, it is important to study the problem of function scheduling in serverless computing. First, we analyze the function scheduling problem in serverless computing and identify the factors that determine function completion time: queueing time, start time, and execution time. Based on this analysis, we propose a mathematical model that formalizes the function scheduling problem in serverless computing. Second, we propose a scheduling algorithm, called FuncSched, based on temporal-spatial characteristics for serverless computing. The algorithm considers function execution time and function start time in the time dimension, and function resource consumption in the space dimension. Finally, we implement a system prototype and evaluate it with real-world serverless computing workload datasets. Experimental results show that the proposed algorithm can effectively reduce average function completion time, thereby improving function execution efficiency in serverless computing.
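The sketch below contrasts FCFS-style dispatch with a simple scheduler that ranks pending functions by estimated start-plus-execution time and checks memory fit before dispatch; the scoring rule and resource model are illustrative assumptions, not the actual FuncSched algorithm.

```python
# Simplified temporal-spatial-aware dispatch: pending functions are ranked by
# estimated start + execution time (time dimension) and only dispatched if their
# memory demand fits the free pool (space dimension). Illustrative only.
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Func:
    score: float
    name: str = field(compare=False)
    est_exec_ms: float = field(compare=False)
    est_start_ms: float = field(compare=False)   # cold-start estimate
    mem_mb: int = field(compare=False)

def make_func(name, est_exec_ms, est_start_ms, mem_mb):
    # shorter expected completion time => scheduled earlier (avoids head-of-line blocking)
    return Func(est_exec_ms + est_start_ms, name, est_exec_ms, est_start_ms, mem_mb)

def schedule(pending, free_mem_mb):
    """Pop the best-scoring function whose memory demand fits the free pool."""
    heapq.heapify(pending)
    deferred, order = [], []
    while pending:
        f = heapq.heappop(pending)
        if f.mem_mb <= free_mem_mb:
            free_mem_mb -= f.mem_mb
            order.append(f.name)
        else:
            deferred.append(f)        # revisit once resources are released
    return order, deferred

if __name__ == "__main__":
    fs = [make_func("resize", 80, 200, 128), make_func("train", 9000, 500, 2048),
          make_func("hello", 5, 100, 64)]
    print(schedule(fs, free_mem_mb=1024))   # short functions run first; 'train' is deferred
```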
Tactile feedback is an essential component of virtual reality systems, as it enhances the user’s immersion and engagement in the virtual environment. However, traditional tactile feedback methods such as vibration suffer from the limitation of a single feedback mode, while mechanical and microfluidic drive approaches are structurally complex and difficult to integrate. Micro-current tactile feedback, by contrast, offers high integration and rich feedback modalities, but it suffers from issues such as insufficient feedback intensity recognition accuracy and discomfort caused by prolonged electrical stimulation. To address these challenges, we design and develop a novel multi-intensity electro-tactile feedback system based on micro-current stimulation and determine its optimal stimulation paradigm by studying key influencing factors such as stimulation signal parameters, electrode arrays, and the grounding electrode, introducing biphasic current pulses, and optimizing the ratio of positive to negative current charge. We evaluate the performance of the system through psychophysical experiments on 35 subjects. The results show that the system achieves 93.3% and 81.7% accuracy in recognizing four and five levels of intensity, respectively, while effectively reducing the discomfort caused by micro-current stimulation. This system outperforms traditional methods and has the potential to serve as a versatile tactile feedback device with wide application scenarios.
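As a small illustration of charge-controlled biphasic stimulation, the sketch below generates one pulse whose negative-phase amplitude is chosen so that the positive-to-negative charge ratio equals a given value; all waveform parameters are assumptions for illustration and are not the optimal paradigm reported in the study.

```python
# Illustrative biphasic pulse generator: a positive phase followed by a longer,
# lower-amplitude negative phase whose charge equals the positive charge divided
# by a chosen positive:negative ratio. Parameter values are illustrative only.
import numpy as np

def biphasic_pulse(amp_ua=500.0, pos_width_us=200.0, ratio=1.0,
                   neg_width_us=400.0, fs_hz=1_000_000):
    """Return (t_us, current_ua) samples for one charge-controlled biphasic pulse."""
    pos_charge = amp_ua * pos_width_us                    # charge in uA*us
    neg_amp = pos_charge / (ratio * neg_width_us)         # amplitude giving pos:neg = ratio
    dt_us = 1e6 / fs_hz
    n_pos = int(pos_width_us / dt_us)
    n_neg = int(neg_width_us / dt_us)
    current = np.concatenate([np.full(n_pos, amp_ua), np.full(n_neg, -neg_amp)])
    t_us = np.arange(len(current)) * dt_us
    return t_us, current

if __name__ == "__main__":
    t, i = biphasic_pulse(ratio=1.2)   # slightly charge-imbalanced toward the positive phase
    print(f"positive charge: {i[i > 0].sum():.0f} uA*us, negative charge: {i[i < 0].sum():.0f} uA*us")
```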
DBSCAN (density-based spatial clustering of applications with noise) is one of the most widely used and studied density-based clustering algorithms, owing to its simplicity and ease of implementation. However, the high time complexity (
The problem of algorithmic fairness has a long history and has been continually reshaped by social change. With the acceleration of digital transformation, the root cause of the algorithmic fairness problem has gradually shifted from social bias to data bias and model bias. Meanwhile, algorithmic exploitation has become more hidden and far-reaching. Although various fields of social science have studied the problem of fairness for a long time, most of this work remains at the level of qualitative description. As an intersection of computer science and social science, algorithmic fairness under digital transformation should not only inherit the basic theories of the various fields of social science but also provide the methods and capabilities of fairness computing. Therefore, we start with the definition of algorithmic fairness and summarize existing algorithmic fairness computing methods along the three dimensions of social bias, data bias, and model bias. Finally, we compare algorithmic fairness indicators and methods experimentally and analyze the challenges of algorithmic fairness computing. Our experiments show that there is a trade-off between the fairness and accuracy of the original models, whereas there is a consistent relationship between the fairness and accuracy of the fairness methods. Regarding fairness indicators, the correlations between different indicators vary significantly, underscoring the importance of using diverse fairness indicators. Regarding fairness methods, a single fairness method has limited effect, underscoring the importance of exploring combinations of fairness methods.
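To make the notion of a fairness indicator concrete, the sketch below computes two textbook group-fairness measures, the demographic parity difference and the equal opportunity difference, from binary predictions and a binary sensitive attribute; the paper’s exact indicator set and experimental protocol are not reproduced.

```python
# Two standard group-fairness indicators, computed from binary predictions and a
# binary sensitive attribute. These are textbook definitions used as examples;
# they are not necessarily the indicator set compared in the paper.
import numpy as np

def demographic_parity_diff(y_pred, sensitive):
    """|P(yhat=1 | s=1) - P(yhat=1 | s=0)|, the selection-rate gap between groups."""
    y_pred, sensitive = np.asarray(y_pred), np.asarray(sensitive)
    return abs(y_pred[sensitive == 1].mean() - y_pred[sensitive == 0].mean())

def equal_opportunity_diff(y_true, y_pred, sensitive):
    """|TPR(s=1) - TPR(s=0)|, the true-positive-rate gap among qualified individuals."""
    y_true, y_pred, sensitive = map(np.asarray, (y_true, y_pred, sensitive))
    def tpr(group):
        return y_pred[(sensitive == group) & (y_true == 1)].mean()
    return abs(tpr(1) - tpr(0))

if __name__ == "__main__":
    y_true = [1, 1, 0, 1, 0, 1, 0, 0]
    y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
    s      = [1, 1, 1, 1, 0, 0, 0, 0]
    print("demographic parity diff:", demographic_parity_diff(y_pred, s))
    print("equal opportunity diff: ", equal_opportunity_diff(y_true, y_pred, s))
```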
Adversarial training based on adversarial examples has recently become an important means of improving model robustness and security. COVID-19 has made wearing masks the norm and occluded face recognition a practical need. To address the lack of adversarial example generation methods for occluded face recognition, we propose an adaptive adversarial example generation method, AOA (adversarial examples against occluded face recognition based on an adaptive method). First, AOA adjusts the adversarial example generation strategy according to the target model and automatically adjusts the interference area according to the input face. Second, by concentrating the disturbance on the area that has the most significant impact on recognition, and by combining an ensemble model with Gaussian filtering, black-box attacks are conducted on the local-feature-enhanced ArcSoft and Baidu face recognition systems. Finally, the combination of dynamic masks and a dynamic perturbation multiplier avoids redundant computation during the attack and ensures the sustainability of the attack. The generated perturbation makes the face-inpainting occlusion recognition model wrongly segment the occlusion area, thereby reducing the model’s recognition accuracy. We build a face-inpainting occlusion recognition model, called Arc-UFI. Experiments show that AOA is effective in attacking both local-feature-enhancement and face-inpainting occluded face recognition models. In addition, AOA can provide useful support for adversarial training for model security.
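The sketch below shows only the generic idea of confining a sign-gradient perturbation to a mask region; AOA’s adaptive region selection, ensemble models, Gaussian filtering, and dynamic perturbation multiplier are not reproduced, and loss_gradient is a hypothetical stand-in for the gradient obtained from a target model.

```python
# Generic sketch of a region-constrained FGSM-style step: the perturbation is
# applied only inside a binary mask covering the (assumed) occlusion-sensitive
# region. loss_gradient is a hypothetical placeholder for backpropagated gradients.
import numpy as np

def loss_gradient(image: np.ndarray) -> np.ndarray:
    """Hypothetical placeholder: in a real attack this comes from the target model."""
    rng = np.random.default_rng(0)
    return rng.standard_normal(image.shape)

def masked_fgsm_step(image, mask, epsilon=0.03):
    """One sign-gradient step, restricted to mask==1 pixels and clipped to [0, 1]."""
    grad = loss_gradient(image)
    perturbation = epsilon * np.sign(grad) * mask   # zero outside the chosen region
    return np.clip(image + perturbation, 0.0, 1.0)

if __name__ == "__main__":
    img = np.full((64, 64, 3), 0.5)
    mask = np.zeros((64, 64, 3))
    mask[32:, :, :] = 1.0                 # e.g., the lower (mask-covered) half of the face
    adv = masked_fgsm_step(img, mask)
    print("pixels changed:", int((adv != img).sum()))
```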
With the successful application of graph representation learning in multiple fields, graph representation learning methods designed for different graph data and problems have proliferated. However, the presence of graph noise limits the capability of graph representation learning. To effectively reduce the proportion of noise in a graph network, we first analyze the distribution characteristics of the local adjacency of graph nodes and theoretically show that, when constructing the local adjacency topology, exploring high-order neighbor information can improve the performance of graph representation learning. Second, we propose a “2-Steps” local subgraph optimization strategy (LOSO). The strategy first constructs a local adjacency similarity matrix with multi-order information based on the original graph topology. Then, based on the similarity matrix and the local information of the graph nodes, the local subgraph structure of each node is optimized. By reasonably reconstructing local adjacencies, the proportion of noise in the network is reduced, thereby enhancing graph representation learning. In experiments on node classification, link prediction, and community discovery tasks, the results indicate that the proposed local subgraph optimization strategy can boost the performance of 8 baseline algorithms. On the node classification tasks of three aviation networks, the largest improvements reach 23.11%, 41.58%, and 24.16%, respectively.
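The sketch below illustrates one way a multi-order local similarity matrix could be built and used to prune weak edges: first- and second-order adjacency are mixed with a weight, and existing edges whose similarity falls below a threshold are dropped. The weighting and pruning rule are assumptions for illustration, not the actual “2-Steps” LOSO procedure.

```python
# Illustrative sketch of mixing first- and second-order adjacency into a local
# similarity matrix and pruning the weakest edges. The alpha weighting and the
# threshold rule are assumptions, not the exact LOSO procedure.
import numpy as np

def multi_order_similarity(adj: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """S = alpha * A + (1 - alpha) * normalized(A @ A): direct plus 2-hop evidence."""
    second = adj @ adj
    np.fill_diagonal(second, 0)
    if second.max() > 0:
        second = second / second.max()
    return alpha * adj + (1 - alpha) * second

def prune_noisy_edges(adj: np.ndarray, sim: np.ndarray, threshold: float = 0.6) -> np.ndarray:
    """Drop existing edges whose multi-order similarity falls below the threshold."""
    cleaned = adj.copy()
    cleaned[(adj > 0) & (sim < threshold)] = 0
    return cleaned

if __name__ == "__main__":
    # small graph: a triangle (0,1,2) plus a likely-noisy pendant edge 0-3
    A = np.array([[0, 1, 1, 1],
                  [1, 0, 1, 0],
                  [1, 1, 0, 0],
                  [1, 0, 0, 0]], dtype=float)
    S = multi_order_similarity(A)
    print(prune_noisy_edges(A, S))   # the triangle survives; edge 0-3 is pruned
```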
Dual networks are composed of a physical graph and a concept graph. The two graphs share the same set of vertices but have different sets of edges. The edges of the physical graph represent real relationships between vertices, while those of the concept graph represent the similarity between vertices, usually obtained by computation. Recently, discovering cohesive subgraphs from dual networks, i.e., subgraphs that are connected in the physical graph and cohesive in the concept graph, has attracted extensive attention from researchers. Such subgraphs have been widely used in many real scenarios such as seminar preparation, product recommendation, and disease-causing gene discovery. Yet, few existing studies have considered the influence of cohesive subgraphs in dual networks. To this end, in this paper: 1) An influential cohesive subgraph based on the minimum edge weight, i.e., influential
Although existing image captioning models can detect and represent target objects and visual relationships, they have not addressed the interpretability of the image captioning model from the perspective of syntactic relations. To this end, we present an interpretable image caption generation method based on dependency syntax triplets modeling (IDSTM), which leverages multi-task learning to jointly generate the dependency syntax triplet sequence and the image caption. IDSTM first obtains latent dependency syntactic features from the input image through a dependency syntax encoder, and then feeds these features, together with the dependency syntax triplets and textual word embedding vectors, into a single LSTM (long short-term memory) network to generate the dependency syntax triplet sequence as prior knowledge. Second, the dependency syntactic features are fed into the captioning encoder to extract visual object textual features. Finally, hard and soft constraints are adopted to incorporate the dependency syntactic and relation features into a double LSTM for interpretable image caption generation. By generating the dependency syntax triplet sequence as an auxiliary task, IDSTM improves the interpretability of the image captioning model without a significant decrease in the accuracy of the generated captions. In addition, we propose novel metrics B1-DS (BLEU-1-DS), B4-DS (BLEU-4-DS), and M-DS (METEOR-DS) to assess the quality of the dependency syntax triplets, and we report extensive experimental results on the MSCOCO dataset evaluating the effectiveness and interpretability of IDSTM.
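As a minimal illustration of the multi-task setup, the sketch below serializes dependency syntax triplets into an auxiliary target sequence and combines a caption loss with a weighted triplet-sequence loss; the encoders, LSTM decoders, and hard/soft constraints of IDSTM are not reproduced, and the weight and serialization format are assumptions.

```python
# Minimal sketch of an assumed multi-task target construction: a caption's
# dependency triplets (head, relation, dependent) are serialized into an auxiliary
# token sequence, and two cross-entropy losses are combined with a weight.
from typing import List, Tuple

Triplet = Tuple[str, str, str]   # (head word, dependency relation, dependent word)

def serialize_triplets(triplets: List[Triplet]) -> List[str]:
    """Flatten triplets into one token sequence usable as an auxiliary decoding target."""
    tokens = []
    for head, rel, dep in triplets:
        tokens += [head, rel, dep, "<sep>"]
    return tokens

def multitask_loss(caption_loss: float, triplet_loss: float, lambda_dep: float = 0.5) -> float:
    """Joint objective: caption cross-entropy plus weighted triplet-sequence cross-entropy."""
    return caption_loss + lambda_dep * triplet_loss

if __name__ == "__main__":
    # e.g., caption "a dog chases a ball" with (assumed) dependency triplets
    triplets = [("chases", "nsubj", "dog"), ("chases", "obj", "ball")]
    print(serialize_triplets(triplets))
    print(multitask_loss(caption_loss=2.31, triplet_loss=1.05))
```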
Cross-lingual word embedding aims to use the embedding space of resource-rich languages to improve the embeddings of resource-scarce languages, and it is widely used in a variety of cross-lingual tasks. Most existing methods address word alignment by learning a linear mapping between two embedding spaces. Among them, adversarial-model-based methods have received widespread attention because they can achieve good performance without using any dictionary. However, these methods do not perform well on dissimilar language pairs. The reason may be that the mapping is learned relying only on a distance measurement over the entire space, without the guidance of a seed dictionary, which leads to multiple possibilities for the aligned word pairs and unsatisfactory alignment. Therefore, in this paper, a semi-supervised cross-lingual word embedding method based on an adversarial model with dual discriminators is proposed. Building on an existing adversarial model, a bi-directional, shared, fine-grained discriminator is added, yielding an adversarial model with dual discriminators. In addition, a negative-sample dictionary is introduced as a supplement to the supervised seed dictionary to guide the fine-grained discriminator in a semi-supervised way. By minimizing the distance between the initial word pairs and the supervised dictionaries, including the seed dictionary and the negative dictionary, the fine-grained discriminator reduces the possibility of multiple word pairs and recognizes the correctly aligned pairs among the initially generated dictionaries. Finally, experiments conducted on two cross-lingual datasets show that the proposed method can effectively improve the performance of cross-lingual word embedding.
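The adversarial training with dual discriminators is not reproduced here; the sketch below shows only the dictionary-guided ingredient common to this family of methods, namely solving the orthogonal Procrustes problem for a linear mapping that aligns seed word pairs, demonstrated on synthetic data.

```python
# Dictionary-guided refinement step used by many cross-lingual embedding methods:
# given seed word pairs, solve the orthogonal Procrustes problem for the linear
# mapping W that aligns source embeddings to their target counterparts.
import numpy as np

def procrustes_mapping(src_vecs: np.ndarray, tgt_vecs: np.ndarray) -> np.ndarray:
    """W = argmin over orthogonal W of ||src @ W - tgt||_F, via SVD of src^T tgt."""
    u, _, vt = np.linalg.svd(src_vecs.T @ tgt_vecs)
    return u @ vt

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n_pairs = 50, 200
    src = rng.standard_normal((n_pairs, d))
    # synthetic "target language": a random rotation of the source plus noise
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    tgt = src @ q + 0.01 * rng.standard_normal((n_pairs, d))
    W = procrustes_mapping(src, tgt)
    print("relative alignment error:", np.linalg.norm(src @ W - tgt) / np.linalg.norm(tgt))
```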
In recent years, with the advancement of IoT and blockchain, multi-party signature protocols have received renewed attention. A multi-party signature is a special digital signature that requires users to interact with each other to jointly generate a signature for a message and achieve authentication. Compared with each user signing separately, the key size can be greatly reduced, and no single party can produce a valid signature on its own, which prevents impersonation when a user’s key is lost or hijacked. On the other hand, the progress of quantum computers poses a potential threat to traditional public-key cryptography schemes. The PQC (post-quantum cryptography) standardization project has been organized by NIST (National Institute of Standards and Technology) in the US since 2016, and the algorithms to be standardized were determined in July 2022. At the same time, multi-party signatures based on its candidate digital signature schemes (such as CRYSTALS-Dilithium) have also appeared. The Chinese Association for Cryptologic Research (CACR) also held a national cryptographic algorithm design competition in 2019; Aigis-sig, the first-prize signature algorithm, adopts a structure similar to that of Dilithium. In this paper, Aitps is proposed, a two-party signature scheme based on Aigis-sig. Compared with existing Dilithium-based two-party signatures, Aitps achieves smaller key sizes and signature sizes; for example, the signature size can be reduced by more than 20% at the same security level. Lastly, Aitps can also be extended to a multi-party signature.
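For orientation only, the toy below sketches the generic two-round message flow of Schnorr/Dilithium-style two-party signing: each party commits to a nonce, the commitments are combined, a shared challenge is hashed, and partial responses are summed. It omits rejection sampling, rounding, error terms, and every security consideration of real lattice schemes, and it is not the Aitps protocol.

```python
# Structural toy of a two-party signing flow in the Schnorr/Dilithium style.
# Insecure and noise-free by construction; meant only to show the message flow.
import hashlib
import numpy as np

Q, K, L = 8380417, 4, 4          # toy modulus and dimensions

def keygen(rng):
    return rng.integers(-2, 3, size=L)        # small secret share

def challenge(w_sum: np.ndarray, msg: bytes) -> int:
    digest = hashlib.sha256(w_sum.tobytes() + msg).digest()
    return int.from_bytes(digest[:2], "big") % 256   # toy scalar challenge

def sign_two_party(A, s1, s2, msg, rng):
    y1, y2 = (rng.integers(-100, 101, size=L) for _ in range(2))
    w = (A @ (y1 + y2)) % Q                  # combined nonce commitment
    c = challenge(w, msg)                    # shared challenge
    z = (y1 + c * s1) + (y2 + c * s2)        # sum of partial responses
    return w, c, z

def verify(A, t, msg, sig):
    w, c, z = sig
    ok = np.array_equal((A @ z - c * t) % Q, w)      # holds exactly in this noise-free toy
    return ok and c == challenge(w, msg)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.integers(0, Q, size=(K, L))
    s1, s2 = keygen(rng), keygen(rng)
    t = (A @ (s1 + s2)) % Q                  # joint public key (no error term in this toy)
    sig = sign_two_party(A, s1, s2, b"hello", rng)
    print("verified:", verify(A, t, b"hello", sig))
```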
At present, source code vulnerability detection based on deep learning is a highly efficient vulnerability analysis approach, but it faces two challenges: large datasets and effective learning approaches. We address both challenges in this work. First, a multi-vulnerability dataset of 280,793 samples covering 150 CWE vulnerability types is constructed based on the SARD dataset. Second, a deep learning approach based on contrastive learning is proposed. Its core idea is to construct, for each sample in the training set, a set of samples of the same type and a set of samples of different types, forming a contrastive learning setting. Trained on a dataset built in this way, the deep learning model not only learns many of the subtler features shared by samples of the same type, but also extracts highly discriminative features between samples of different types. Experimental verification shows that the deep learning model trained with the constructed dataset and the proposed learning approach can identify 150 CWE vulnerability types with an accuracy of 92.0%, an average PR value of 0.84, and an average ROC-AUC value of 0.96. In addition, we analyze and discuss the code symbolization techniques commonly used in deep learning-based vulnerability analysis. Experiments show that whether or not the code is symbolized during training does not affect the vulnerability identification accuracy of the deep learning model.
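A small sketch of a triplet-style contrastive objective over code embeddings is given below, where a positive shares the anchor’s CWE label and a negative does not; the paper’s encoder, sample-set construction, and exact loss form are not specified by the abstract, so the margin, embeddings, and loss shape here are illustrative assumptions.

```python
# Triplet-style contrastive objective over (toy) code embeddings: the loss pulls
# a same-CWE positive closer to the anchor than a different-CWE negative by a margin.
import numpy as np

def triplet_contrastive_loss(anchor, positive, negative, margin=1.0):
    """max(0, d(a, p) - d(a, n) + margin), averaged over the batch."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return np.maximum(0.0, d_pos - d_neg + margin).mean()

def build_triplets(embeddings, cwe_labels, rng):
    """For each sample, draw one same-CWE positive and one different-CWE negative."""
    anchors, positives, negatives = [], [], []
    labels = np.asarray(cwe_labels)
    for i, lab in enumerate(labels):
        same = np.flatnonzero((labels == lab) & (np.arange(len(labels)) != i))
        diff = np.flatnonzero(labels != lab)
        if len(same) == 0 or len(diff) == 0:
            continue
        anchors.append(embeddings[i])
        positives.append(embeddings[rng.choice(same)])
        negatives.append(embeddings[rng.choice(diff)])
    return map(np.array, (anchors, positives, negatives))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.standard_normal((6, 16))          # stand-in for learned code embeddings
    labels = ["CWE-79", "CWE-79", "CWE-89", "CWE-89", "CWE-476", "CWE-476"]
    a, p, n = build_triplets(emb, labels, rng)
    print("contrastive loss:", round(float(triplet_contrastive_loss(a, p, n)), 4))
```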
Video-text retrieval has been widely used in many real-world applications and has attracted increasing research attention. Recently, many works have been proposed to leverage the vision-language matching knowledge of pre-trained models to further improve retrieval performance. However, these methods ignore the fact that video and text data are composed of events. If the fine-grained similarities between events in the video and events in the text can be captured well, it will help compute more accurate semantic similarities between texts and videos and thus improve retrieval performance. Hence, in this paper, we propose a CLIP-based multi-event representation generation method for video-text retrieval, called CLIPMERG. Specifically, CLIPMERG first utilizes the video encoder and text encoder of the pre-trained CLIP model to transform the video and text inputs into video frame token sequences and word token sequences, respectively. Next, CLIPMERG uses a video (text) event generator to map the video frame (text word) token sequence into
In recent years, text-to-image generation methods based on generative adversarial networks have become a popular area of research in cross-media convergence. Text-to-image generation methods aim to improve the semantic consistency between text descriptions and generated images by extracting more representative text and image features. Most existing methods model global image features and the initial text semantic features, ignoring the limitations of the initial text features and failing to fully exploit the guidance provided by the semantic consistency between the generated images and the text features, thus reducing the representativeness of the text information in text-to-image synthesis. In addition, because the dynamic interaction between generated object regions is not considered, the generator network can only roughly delineate the target region and ignores the potential correspondence between local regions of the image and the semantic labels of the text. To solve these problems, a text-to-image generation method based on image-text semantic consistency, called ITSC-GAN, is proposed in this paper. The model first designs a text information enhancement module that enhances the text information using the generated images, thereby improving the representation of text features. Second, the model proposes an image regional attention module that enhances the representation ability of image features by mining the relationships between image sub-regions. By jointly utilizing the two modules, higher consistency between local image features and text semantic labels is achieved. Finally, the model uses the generator and discriminator loss functions as constraints to improve the quality of the generated images and their semantic agreement with the text description. The experimental results show that the