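• Peking University Core Journal (source journal of "A Guide to the Core Journals of China")
• Chinese Science and Technology Core Journal (statistical source journal for Chinese scientific and technical papers)
• Indexed in the Japan Science and Technology Agency (JST) database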

Just-accepted articles have been peer-reviewed and accepted. They are not yet assigned to volumes or issues but are citable by Digital Object Identifier (DOI).
Multi-label image classification algorithm based on spatial attention and graph convolution
 doi: 10.19304/J.ISSN1000-7180.2021.1166
Abstract:
Traditional multi-label image classification models have difficulty generating image features that are close to the related labels, and they do not exploit the visual correlation between labels, which leads to problems such as insufficient recognition accuracy. This paper proposes a multi-label image classification algorithm based on spatial attention and graph convolution. The algorithm first uses a graph convolutional network (GCN) to learn the features of the label adjacency graph and introduces a spatial attention mechanism into the high-level semantic information to recalibrate the target features. The high-level semantic information and the label co-occurrence features extracted by the GCN are then fused in a classifier based on co-occurrence feature fusion to produce the prediction. Comparison experiments on two public datasets show that the average accuracy of the proposed algorithm on the MS-COCO dataset is 1.1% higher than that of MLGCN, while its parameter count is only one-eighth that of the original model, which greatly reduces the training cost.
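As a rough illustration of the graph-convolution step mentioned in the abstract above, the sketch below propagates label features over a label adjacency graph. It is not the authors' implementation: the symmetric normalization with self-loops, the ReLU activation, and the helper names `normalize_adjacency`, `matmul`, and `gcn_layer` are all assumptions chosen for a minimal example.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square, non-negative adjacency matrix."""
    n = len(adj)
    # Add self-loops so every label keeps its own feature (an assumed convention).
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj_norm, features, weights):
    """One propagation step: ReLU(adj_norm @ features @ weights)."""
    propagated = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, value) for value in row] for row in propagated]


# Two labels that co-occur with each other (hypothetical toy graph).
adjacency = [[0, 1], [1, 0]]
label_features = [[1.0, 0.0], [0.0, 1.0]]
layer_weights = [[1.0], [1.0]]
output = gcn_layer(normalize_adjacency(adjacency), label_features, layer_weights)
```

In this toy case each label ends up with the average of its own feature and its neighbour's, which is the smoothing effect a co-occurrence graph is meant to provide.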
Mixed-precision Optimization Strategy for Deploying Convolutional Network Based on Memristor Arrays
 doi: 10.19304/J.ISSN1000-7180.2021.1205
Abstract:
The memristor array is expected to meet the requirements of edge intelligence for power consumption, storage density, and computing time. However, it is hard to map large network models onto limited memristor arrays. To solve this problem, a method is proposed that deploys a convolutional network using single-memristor and dual-memristor cells simultaneously in a mixed-precision manner. To avoid the arbitrariness of manual settings, a fine-grained mixed-precision network optimization strategy based on the particle swarm algorithm is further proposed to search for the key parameters. To ensure that the solutions are reasonable, both the network performance and the number of memristors are used in the fitness calculation, and a mixing-ratio constraint is applied before this step to speed up the search. In addition, the performance and search complexity are compared with those of other optimization algorithms. For 4-value memristors, the optimized assignment achieves 33% higher precision than the manually set assignment. This work is expected to provide a friendly and feasible non-von Neumann hardware solution for edge intelligence.
 doi: 10.19304/J.ISSN1000-7180.2021.1216
Abstract:
Surface defect detection for automotive condensers is an important part of the automobile manufacturing process. Deep learning methods are now widely used for industrial defect image detection because of their robustness and accuracy. In actual production, however, the yield is high, which makes defect images difficult to collect, so deep learning methods that rely on large-scale training samples hit a bottleneck. To solve this problem, a deep convolutional generative adversarial network (DCGAN) model based on semi-supervised learning and a self-attention mechanism is proposed to generate surface defect images of automotive condensers. First, a self-attention mechanism is introduced into DCGAN to overcome the difficulty convolutional networks have in extracting long-range features. Second, a supervised classifier is added to the discriminator, and the loss function is changed to the Wasserstein distance combined with the classifier's cross-entropy loss, which improves the convergence speed and stability of the model. Finally, conditional normalization and category label fusion are used to enable the model to generate condenser images with specific defects. Experimental results show that the proposed model generates high-quality condenser defect images with an FID of 44.35, which is better than that of the existing DCGAN and SAGAN. Compared with ACGAN, the diversity of the generated images is also significantly improved.
Visual Commonsense and Attention on Attention for Image Captioning
 doi: 10.19304/J.ISSN1000-7180.2021.1226
Abstract:
Image captioning can be applied to retrieval systems, navigation for the blind, and medical report generation. An image captioning model combining visual commonsense with attention on attention is proposed to address two problems: captioning models based on local features do not sufficiently mine visual semantic relations, and the features extracted by multi-level attention mechanisms suffer from attention deviation. Within an encoder-decoder framework, visual commonsense is introduced in the encoder to guide local features toward commonsense semantic relations, and attention on attention is applied to the high-level semantics mined by multi-layer attention. This enhances the features, improves their relevance, and reduces the attention deviation that misleads sequence generation in the decoder. The model was tested on the MS COCO dataset, and the experimental results show that the BLEU, CIDEr, and SPICE scores improved to some extent, which indicates that the model expresses the semantic content of images more accurately and more richly.
Articles in press have been peer-reviewed and accepted. They are not yet assigned to volumes or issues but are citable by Digital Object Identifier (DOI).
Research progress of VLSI detailed routing algorithm
QU Tong, GAI Tianyang, WANG Shuhan, SU Xiaojing, SU Yajuan, WEI Yayi
2021, 38(11): 1-6.   doi: 10.19304/J.ISSN1000-7180.2021.0030
Abstract:
Detailed routing in very-large-scale integration (VLSI) is one of the most important and challenging stages of physical design. The paths of all wires are determined at this stage, so routing quality directly affects the area and performance of the chip. Path search is one of the most time-consuming steps in routing. This paper first describes the grid-based routing model, which casts the routing problem as a graph search problem or a multi-commodity flow problem. It then summarizes the application of the maze search algorithm, the A* algorithm, integer linear programming (ILP), and parallel acceleration algorithms to path search and to optimization under design constraints, and analyzes their pros and cons in routers. Finally, it reviews the research progress of machine-learning-based algorithms, analyzes the existing problems, and looks ahead to the development trend of detailed routing algorithms. The analysis shows that the A* algorithm outperforms the other algorithms overall in routing quality, stability, and speed; its difficulty lies in designing a reasonable routing strategy and graph model. Reinforcement learning has great research potential, but current work has been tested only on small-scale designs, and further improvement and exploration are still needed.
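To make the grid-based path search surveyed in the abstract above concrete, here is a minimal sketch of A* on a single-layer routing grid. It assumes unit-cost moves between 4-connected cells, a Manhattan-distance heuristic, and a grid where `1` marks a blocked cell; the function name `astar_route` is hypothetical, and real detailed routers add layers, vias, and design-rule costs that are omitted here.

```python
import heapq


def astar_route(grid, start, goal):
    """Return a shortest 4-connected path from start to goal, or None if blocked."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance never overestimates on a unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    best_cost = {start: 0}
    parent = {start: None}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # Stale heap entry; a cheaper route to this cell was found.
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    parent[nxt] = cell
                    heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None


# A wall in the middle column forces the wire to detour through the bottom row.
routing_grid = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
route = astar_route(routing_grid, (0, 0), (0, 2))
```

Setting the heuristic to zero turns the same loop into a Dijkstra-style search over the grid, which is one way to see why the choice of graph model and cost strategy matters.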
Multi-modal group activity recognition method combining motion trajectory features
WANG Shihui, ZHU Yongxin, WANG Hui, ZHENG Xiaoying
2021, 38(11): 7-13.   doi: 10.19304/J.ISSN1000-7180.2021.0341
Abstract:
Group activity recognition focuses on classifying group activities and individual actions from a group-level perspective. Better group activity recognition is of great significance to applications such as security monitoring and sports video analysis. To address the problem that current LSTM-based models cannot fully extract spatial-temporal features at the group level, an LSTM-Transformer-based group activity recognition model is proposed to exploit group-individual features. In addition, a multi-modal model that incorporates trajectory features is proposed for the first time for group activity recognition. The experimental results show that, compared with existing LSTM-based models, the proposed model increases the accuracy of group activity recognition by 8.3% and the accuracy of individual action recognition by 2.1%. Compared with the GCN-based model, the proposed model not only improves recognition accuracy but can also handle groups of varying size.
Two stage unsupervised domain adaption algorithm
TAO Yang, YANG Wen, LIN Feipeng, WENG Shan
2021, 38(11): 14-20.   doi: 10.19304/J.ISSN1000-7180.2021.0101
Abstract:
In domain adaptation, the distributions of the source-domain and target-domain samples differ considerably. Traditional domain adaptation methods often ignore the prior label information of the samples when aligning the domain distributions, so the samples lack discriminability in the subspace after projection. To alleviate this problem, a two-stage unsupervised domain adaptation method is proposed. The method uses the class label information of the samples to obtain a discriminative target subspace and, at the same time, imposes block-diagonal constraints on the reconstruction matrix to obtain domain-invariant features in the cross-domain subspace and improve classification performance. Experiments on three benchmark datasets commonly used in domain adaptation show that the proposed method achieves better classification performance.
Grasshopper optimization algorithm based on chaos and cauchy mutation and feature selection
LAN Yaxun
2021, 38(11): 21-30.   doi: 10.19304/J.ISSN1000-7180.2021.0084
Abstract:
The traditional grasshopper optimization algorithm has shortcomings on complicated optimization problems, such as slow convergence and a tendency to fall into local optima. For this reason, a non-linear grasshopper optimization algorithm, CCGOA, which combines chaotic mapping with a Cauchy mutation mechanism, is proposed. The population is initialized by combining the chaotic Tent map with an opposition-based learning mechanism, which yields better-quality individuals in the initial population and distributes them as uniformly as possible in the search space. A non-linear adaptive coefficient update mechanism based on the cosine function is designed to achieve a better trade-off between global search and local exploitation. In addition, Cauchy mutation is introduced to mutate and perturb the current best individual so that the algorithm avoids falling into local optima. Benchmark function optimization tests show that the algorithm effectively improves optimization accuracy and convergence speed. A feature selection algorithm, CCGOA-FS, is then designed and applied to the feature selection problem. Tests on several datasets show that the algorithm effectively selects the optimal feature subset and improves the accuracy of data classification.
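The two ingredients named in the abstract above, Tent-map initialization with opposition-based learning and Cauchy mutation of the best individual, can be sketched as follows. This is only an illustration under stated assumptions: the skewed Tent map parameter `a=0.7`, the seed `x0=0.37`, the multiplicative mutation form `v + v * step`, and all function names are hypothetical choices, not details taken from the paper.

```python
import math
import random


def tent_map_sequence(x0, length, a=0.7):
    """Iterate a skewed Tent map on [0, 1]; a=0.7 is an assumed parameter."""
    sequence, x = [], x0
    for _ in range(length):
        x = x / a if x < a else (1.0 - x) / (1.0 - a)
        sequence.append(x)
    return sequence


def init_population(size, dim, lower, upper, fitness, x0=0.37):
    """Build chaotic individuals plus their opposites and keep the fittest `size`."""
    chaos = tent_map_sequence(x0, size * dim)
    candidates = []
    for i in range(size):
        individual = [lower + chaos[i * dim + d] * (upper - lower) for d in range(dim)]
        opposite = [lower + upper - value for value in individual]
        candidates.extend([individual, opposite])
    candidates.sort(key=fitness)  # Minimization: smaller fitness is better.
    return candidates[:size]


def cauchy_mutate(best, lower, upper, rng):
    """Perturb the best individual with standard Cauchy noise, clipped to the bounds."""
    mutated = []
    for value in best:
        step = math.tan(math.pi * (rng.random() - 0.5))  # Standard Cauchy sample.
        mutated.append(min(upper, max(lower, value + value * step)))
    return mutated


def sphere(x):
    """Simple benchmark function with its minimum at the origin."""
    return sum(value * value for value in x)


population = init_population(5, 2, -10.0, 10.0, sphere)
trial = cauchy_mutate(population[0], -10.0, 10.0, random.Random(0))
```

The heavy tails of the Cauchy distribution occasionally produce large jumps, which is the property usually cited for helping a search escape a local optimum. A full optimizer would keep `trial` only if it improves on the current best.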
Restoration of minimum cascade mobile for wireless sensor networks
WU Hao, CHEN Wenbai, HAO Cui, MA Hang
2021, 38(11): 31-37.   doi: 10.19304/J.ISSN1000-7180.2021.0196
Abstract:
To address connectivity failure caused by node failure in complex application scenarios of wireless sensor networks, a restoration method with minimum cascade mobility is proposed. When a node in the network fails, a cut point detection algorithm determines whether the failed node affects network connectivity. If the failed node is a cut point, it is considered to have a great impact on network connectivity and the network needs to be restored. If the distance between the neighbor nodes of the failed node is less than or equal to the communication radius, the network is restored by directly establishing communication links between the neighbor nodes. If that distance is greater than the communication radius, the best candidate node is selected according to node degree and the Euclidean distance between nodes, and the location it should move to is calculated from the communication radius and the locations of the neighbor nodes. After the best candidate node moves to that location, it establishes communication links with the other neighbor nodes. Experimental results show that the method effectively restores the network after a node failure. During restoration, it reduces the moving distance and energy loss of the restoration node and prolongs the network lifetime.
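The first two decisions described in the abstract above, whether the failed node is a cut point and whether its neighbors can simply be linked directly, can be sketched as below. The abstract does not say which cut point detection algorithm is used, so the depth-first (Tarjan-style) articulation point search here is an assumption, as are the function names and the rule that every pair of neighbors must be within the communication radius.

```python
import math


def articulation_points(graph):
    """Return the set of cut points of an undirected graph given as {node: neighbours}."""
    discovery, low, cut_points = {}, {}, set()
    counter = [0]

    def dfs(node, parent):
        discovery[node] = low[node] = counter[0]
        counter[0] += 1
        children = 0
        for neighbour in graph[node]:
            if neighbour == parent:
                continue
            if neighbour in discovery:
                low[node] = min(low[node], discovery[neighbour])
            else:
                children += 1
                dfs(neighbour, node)
                low[node] = min(low[node], low[neighbour])
                # A non-root node is a cut point if a child cannot reach above it.
                if parent is not None and low[neighbour] >= discovery[node]:
                    cut_points.add(node)
        # The DFS root is a cut point only if it has more than one DFS child.
        if parent is None and children > 1:
            cut_points.add(node)

    for node in graph:
        if node not in discovery:
            dfs(node, None)
    return cut_points


def can_link_directly(positions, neighbours, radius):
    """True if every pair of the failed node's neighbours is within the radius."""
    for i, first in enumerate(neighbours):
        for second in neighbours[i + 1:]:
            dx = positions[first][0] - positions[second][0]
            dy = positions[first][1] - positions[second][1]
            if math.hypot(dx, dy) > radius:
                return False
    return True


# Hypothetical chain a - b - c: losing b disconnects a from c.
network = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
coordinates = {"a": (0.0, 0.0), "b": (1.5, 2.0), "c": (3.0, 4.0)}
needs_restoration = "b" in articulation_points(network)
direct_link_ok = can_link_directly(coordinates, network["b"], radius=5.0)
```

When `can_link_directly` returns False, the method in the abstract moves a candidate node instead; that candidate selection step is not sketched here because its details are not given.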
Multi-obstacle visual sensor network deployment optimization based on improved difference algorithm
WU Xiaoling, CHEN Xinyang, LUO Xiaowei, LING Jie
2021, 38(11): 38-44.   doi: 10.19304/J.ISSN1000-7180.2021.0046
Abstract:
Unlike sensors in traditional wireless sensor networks (WSNs), visual sensors are much more sensitive to obstacles, which can affect their perception range. Most current research on optimal visual sensor network (VSN) deployment focuses on scenes without obstacles and neglects multi-obstacle scenes because of the limitations of simulating real scene models. Building Information Model (BIM) technology, however, can provide reliable real-scene data as input to the optimization algorithm because it automatically supplies the geometric and non-geometric properties of actual scene elements. This paper proposes a novel BIM-SAMDE framework that combines BIM with the Segment-Adaptive-Migration Differential Evolution (SAMDE) algorithm. It also presents a multi-node effective perception coverage analysis algorithm that calculates the optimal effective coverage rate for collaborative VSN deployment in multi-obstacle cases. The simulation results show that the proposed BIM-SAMDE framework automatically obtains multi-obstacle scene data and that the proposed SAMDE algorithm has superior optimization performance in multi-obstacle scenes, converging faster than the other optimization algorithms.
A light weight image super-resolution algorithm
LI Lang, TAO Yang
2021, 38(11): 45-52.   doi: 10.19304/J.ISSN1000-7180.2021.0188
Abstract:
At present, most image super-resolution reconstruction algorithms improve reconstruction ability by increasing the depth or width of the convolutional neural network. Although this improves the reconstruction quality to a certain extent, the resulting algorithms have high complexity. To address this problem, a lightweight super-resolution reconstruction algorithm based on an attention mechanism and feature fusion is proposed. First, pixels with different characteristics are weighted by a pixel attention mechanism. Next, a multi-level feature fusion module is built on a lightweight channel attention module, and a global residual connection is used for global feature fusion. Finally, sub-pixel convolution up-sampling is applied to the extracted features. The algorithm is tested on four public datasets commonly used in image super-resolution reconstruction: Set5, Set14, BSD100, and Urban100. The experimental results show that, compared with other algorithms, it achieves a higher peak signal-to-noise ratio and structural similarity. Visually, the images it reconstructs have richer details than those of the comparison algorithms. It also has the fewest parameters among the algorithms compared.
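The sub-pixel convolution up-sampling mentioned in the abstract above ends with a rearrangement step, often called pixel shuffle, that turns `C * r * r` low-resolution channels into `C` channels that are `r` times larger in each direction. The sketch below shows only that rearrangement on nested lists. The channel ordering (`c * r * r + i * r + j`) follows a common convention and is an assumption here, as is the function name `pixel_shuffle`.

```python
def pixel_shuffle(feature_maps, scale):
    """Rearrange (C*scale^2, H, W) feature maps into (C, H*scale, W*scale)."""
    in_channels = len(feature_maps)
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    if in_channels % (scale * scale) != 0:
        raise ValueError("channel count must be divisible by scale squared")
    out_channels = in_channels // (scale * scale)
    output = [[[0.0] * (width * scale) for _ in range(height * scale)]
              for _ in range(out_channels)]
    for c in range(out_channels):
        for h in range(height):
            for w in range(width):
                for i in range(scale):
                    for j in range(scale):
                        # Each group of scale^2 channels fills one scale x scale block.
                        source = c * scale * scale + i * scale + j
                        output[c][h * scale + i][w * scale + j] = feature_maps[source][h][w]
    return output


# Four 1x1 channels become a single 2x2 image when the scale factor is 2.
low_res = [[[1.0]], [[2.0]], [[3.0]], [[4.0]]]
high_res = pixel_shuffle(low_res, 2)
```

Because the convolution before this step runs at low resolution, the up-sampling adds few parameters, which fits the lightweight goal stated in the abstract.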
MRI / PET image fusion based on 3D NSDST and improved spatial frequency
ZHENG Wei, AN Xiaolin, LI Han, MA Zepeng
2021, 38(11): 53-60.   doi: 10.19304/J.ISSN1000-7180.2021.0148
Abstract:
To solve the problems of insufficient detail expression and incomplete energy information in the fusion of 3D magnetic resonance imaging (MRI) and 3D positron emission tomography (PET), an image fusion method based on the 3D nonsubsampled discrete shearlet transform (NSDST) and improved spatial frequency is proposed. The 3D NSDST decomposes the MRI and PET images into one low-frequency sub-band and several high-frequency sub-bands. The low-frequency sub-band adopts the improved spatial frequency (SF) fusion strategy, which adaptively adjusts the size of the image block, takes into account the voxel information of the 26-neighborhood in three-dimensional space, and introduces the weighted sum of the three-dimensional modified Laplacian to retain energy information. The high-frequency sub-bands adopt a pulse coupled neural network (PCNN) fusion strategy, in which the weighted sum of the three-dimensional modified Laplacian is the input and the three-dimensional gradient energy is the linking strength that adjusts the neurons. Finally, the inverse 3D NSDST reconstructs the image to complete the MRI/PET fusion. Experimental results show that the fusion strategy combining 3D NSDST and improved spatial frequency effectively preserves the detail information in the image without affecting its overall contrast. It has certain advantages over existing algorithms in both subjective and objective evaluation.
Design of RISC-V dedicated instructions for GNSS channel decoding
QIN Shuang, LI Jian, CHEN Jie
2021, 38(11): 61-66.   doi: 10.19304/J.ISSN1000-7180.2021.0061
Abstract:
As the number of global navigation satellite system (GNSS) signals grows, navigation receivers must process more and more channel decoding algorithms. The traditional coprocessor approach improves the efficiency of channel decoding but consumes a lot of hardware resources. Software implementations of channel decoding can be accelerated with instruction sets such as DSP and SIMD extensions, but these instruction sets are not designed specifically for channel decoding, and most of their instructions are rarely used in channel decoding algorithms, so the decoding efficiency is low. Based on the RISC-V instruction set, seven dedicated instructions are added for GNSS channel decoding. These instructions enrich the bit manipulation capabilities of RISC-V. Compared with the unoptimized versions of the same channel decoding programs, the code size of the optimized algorithms is reduced; for the BCH and deinterleave algorithms it is reduced by 50%. Verification on the gem5 simulator and the self-designed RISC-V processor Nightcore shows that the cycle count of the optimized algorithms is also reduced; for the deinterleave algorithm it is reduced by 92%.
GRU network for flight-data anomaly detection based on FPGA
PENG Jingtong, ZHU Yongxin, WANG Hui, KONG Xiangcong, ZHANG Qinrun, GUO Zhentang
2021, 38(11): 67-73.   doi: 10.19304/J.ISSN1000-7180.2021.0103
Abstract:
Anomaly detection on flight data from large commercial aircraft is difficult because of strict real-time requirements and the massive number of test points. Traditional time-series processing software suffers from drawbacks such as long processing time. This paper proposes an FPGA-based Gated Recurrent Unit (GRU) neural network for anomaly detection, which performs time-series analysis of flight vibration data sources. To meet the real-time processing requirements of high-frequency sampled data, the GRU implementation is optimized for parallel acceleration in several respects. First, a structured parallel optimization method is proposed: the weight parameters are stored in FPGA on-chip memory and the arrays are partitioned by dimension so that the weights can be read in parallel along the column dimension, which parallelizes the matrix-vector multiplication and makes the GRU computation efficient. Second, the calculation of the GRU activation functions is optimized: using BRAMs as lookup tables in a pipelined manner greatly reduces the latency of the activation function operations and the consumption of computing resources. Third, the data path of the GRU network is adjusted: by optimizing the calculation order, the dependence between the two sets of matrix-vector multiplications is eliminated and the critical delay is reduced by 40%. The experimental results show that the hardware accelerator achieves a high energy efficiency ratio, with a throughput of 10.33 GFLOPS and a power consumption of 2.532 W.
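The lookup-table idea behind the second optimization in the abstract above can be modeled in software as follows. This is a behavioral sketch only, not the FPGA design: the table size of 256 entries, the input range of [-8, 8], nearest-entry indexing, and the names `SIGMOID_LUT` and `lut_sigmoid` are all assumptions.

```python
import math

LUT_SIZE = 256             # Assumed number of table entries.
X_MIN, X_MAX = -8.0, 8.0   # Assumed input range; inputs outside it saturate.
STEP = (X_MAX - X_MIN) / (LUT_SIZE - 1)

# Precompute the sigmoid once, as a block RAM would be initialized at configuration time.
SIGMOID_LUT = [1.0 / (1.0 + math.exp(-(X_MIN + i * STEP))) for i in range(LUT_SIZE)]


def lut_sigmoid(x):
    """Approximate sigmoid(x) by reading the nearest precomputed table entry."""
    index = round((x - X_MIN) / STEP)
    index = max(0, min(LUT_SIZE - 1, index))  # Clamp so out-of-range inputs saturate.
    return SIGMOID_LUT[index]


def exact_sigmoid(x):
    """Reference implementation used to measure the approximation error."""
    return 1.0 / (1.0 + math.exp(-x))


# Largest error over a fine sweep of the assumed input range.
max_error = max(abs(lut_sigmoid(k / 100.0) - exact_sigmoid(k / 100.0))
                for k in range(-800, 801))
```

The trade-off is the usual one for table-based activations: a larger table or a narrower range lowers the error at the cost of more memory, while each evaluation stays a single read.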
Research on visual tracking system of lightweight garbage collection robot
HU Minghong, GUO Hui, ZHOU Shaoping, LIU Yafei
2021, 38(11): 74-80.   doi: 10.19304/J.ISSN1000-7180.2021.0259
Abstract:
To improve the autonomous perception ability of garbage pickup robots, an improved lightweight target detection algorithm based on YOLOV4, YOLO-TrashNet, is proposed for the garbage tracking vision system. To balance the speed and accuracy of the visual tracking system, the YOLOV4 backbone network is replaced with MobileNetV3, and the effects of the SE (Squeeze-and-Excitation) attention mechanism, the CBAM (Convolutional Block Attention Module) attention mechanism, and the CSP (cross-stage partial) network structure on the performance of the algorithm are analyzed. A garbage collection robot vision system was built using RealSense depth cameras, which improve target positioning; images of the 15 most common types of garbage in public places were collected; and indoor garbage tracking experiments were completed. The experimental results show that the proposed CSPMobileNetV3-CBAM backbone network greatly increases detection speed: compared with YOLOV4, the amount of computation is reduced by 93.3%, the model weights occupy only 19.5 MB, and the memory consumption is lower than that of YOLOV4-tiny. On Jetson Nano, garbage detection sacrifices 4% accuracy relative to YOLOV4 while its speed increases by 6 times, and its mAP is 86.3%. The work provides a real-time, high-accuracy visual tracking system for garbage collection robots.
Formalized modeling-based anomaly detection for NC code
PAN Jun, HAN Jingchen, YU Dan, CHEN Yongle
2021, 38(11): 81-87.   doi: 10.19304/J.ISSN1000-7180.2021.0224
Abstract:
Computer numerical control (CNC) machine tools are usually controlled by numerical control (NC) code. If the NC code is tampered with during transmission, it poses a serious security threat to machine parts and even to the machine tools themselves. This paper therefore proposes an automated anomaly detection method for NC code to better protect machine tools. The C programming language is used to build a formal model of the NC code, and linear-time temporal logic is used to detect anomalies in that formal model, achieving efficient automated anomaly detection of NC code. The experimental results show that the method effectively identifies 5 types of abnormal operations, scales well, and can be used in a variety of CNC systems.
Accelerator design and implementation for automatic searching neural network
HE Wen, ZHU Yongxin, WANG Hui, HUANG Zunkai
2021, 38(11): 88-94.   doi: 10.19304/J.ISSN1000-7180.2021.0279
Abstract:
In recent years, automatic searching neural networks obtained through Neural Architecture Search (NAS) have performed prominently in visual tasks, but their complex and variable convolution scales and convolution types limit their application on edge devices. To solve this problem, a highly flexible, high-frame-rate accelerator is proposed to accelerate automatic searching neural networks, represented by MnasNet. First, to handle the rich variety of convolution types, an Array Multiplexing Mixed Convolution (AMMC) structure is proposed, which processes different convolutions in parallel in different directions without additional MAC resources. Second, a variable-precision Configurable Multiple Selection Activation (CMA) structure is proposed, which fits various activation functions with high precision. When the accelerator is deployed on the Xilinx ZCU102 with a MAC scale of 32×32, the clock frequency reaches 200 MHz, the power consumption is 3.2 W, and the measured frame rate of MnasNet-a1 on 224×224 images is 272.9 fps.
A parallel input re-encoding circuit to reduce 3-D VRRAM thermal crosstalk effects
CHEN Zhisheng, ZHANG Feng
2021, 38(11): 95-100.   doi: 10.19304/J.ISSN1000-7180.2021.0258
Abstract:
3-D Vertical Resistive Random Access Memory (VRRAM) is a new architecture widely studied to reduce the cost of resistive random access memory cells. Current performance evaluation of 3-D VRRAM arrays focuses mainly on write and read margins. However, the thermal crosstalk effect in 3-D VRRAM also deserves attention: excessive thermal crosstalk significantly reduces the reliability of the memory cells in the array. This paper proposes a parallel write re-encoding circuit that re-encodes the input data to reduce the thermal crosstalk caused by massively parallel writes. The experimental results show that for parallel write array sizes of 4×4, 4×8, and 8×8, the proposed input re-encoding circuit reduces the thermal crosstalk effect by 21.8%, 23.9%, and 12.2%, respectively. Moreover, the proposed write re-encoding increases the write latency by only 3% and the area by only 0.07%.
In-memory computing of STT-MRAM based on 2T1MTJ cell structure
ZHENG Zhiqiang, CHEN Junjie, YAN Sicen, HU Wei, WANG Shaohao
2021, 38(11): 101-108.   doi: 10.19304/J.ISSN1000-7180.2021.0240
Abstract:
The traditional von Neumann computing architecture struggles to balance low latency and low power consumption when processing data-intensive applications. By rethinking the data processing architecture and improving the communication efficiency between the processor and the memory, in-memory computing technology is expected to overcome this "memory wall" problem. A general-purpose in-memory computing scheme for STT-MRAM based on the 2T1MTJ cell structure is proposed, which realizes both in-memory logic operations and the MRAM memory function through the access transistors. To evaluate the performance of the proposed scheme, a hybrid CMOS/MTJ simulation combining the SMIC 55 nm process with an MTJ compact model was performed, and the results were compared with similar schemes based on the 1T1MTJ and 2T2MTJ cell structures. The results show that, because a single logic-operation reference cell with the same MTJ as the memory cell is used, the accuracy of the bitwise AND/OR logic operations and the cell write accuracy of the 2T1MTJ scheme are better than those of the 1T1MTJ scheme across different MTJ process deviations, TMR values, temperatures, and VDD. Compared with the 2T2MTJ scheme, the write accuracy of the proposed scheme is 37.1% higher and the cell area is halved. In addition, an improved 2T1MTJ cell structure using dual-threshold transistors is proposed. Its read and write performance is better than that of the 2T1MTJ scheme with identical access transistors, and its cell write accuracy is improved by 9.4%.

Founded in 1972
Monthly

Supervisor:
Xi'an Institute of Microelectronics Technology

Sponsor:
China Aerospace Science and Technology Corporation

ISSN 1000-7180

CN 61-1123/TN