In this paper, we propose a method to select the observation position in visual servoing with an eye-in-vehicle configuration for the manipulator. In traditional visual servoing, the images taken by the camera may suffer from various problems, including the object being out of view, large perspective distortion, and an improper projection area of the object in the image. The proposed method determines an observation position that avoids these problems. A mobile robot system with a pan-tilt camera is designed, which calculates the observation position based on an observation and then moves there. Both simulation and experimental results are provided to validate the effectiveness of the proposed method.
In this paper, the problem of load transportation and robust mitigation of payload oscillations in uncertain tower-cranes is addressed. This problem is tackled through a control scheme based on the philosophy of active-disturbance-rejection. Here, a general disturbance model built from two dominant components, polynomial and harmonic, is stated. Then, a disturbance observer is formulated through state-vector augmentation of the tower-crane model. Thus, better estimation of the system states and disturbances is achieved. The control law is then formulated to actively reject the disturbances and also to accommodate the closed-loop system dynamics even under system uncertainty. The proposed control scheme is validated via experimentation using a small-scale tower-crane, and compared with other relevant active disturbance rejection control (ADRC)-based techniques. The experimental results show that the proposed control scheme is robust under parametric uncertainty of the system and provides improved attenuation of payload oscillations.
This paper addresses an advanced analysis system for the automatic identification of alcoholic brain states from electroencephalogram (EEG) data. This study introduces an optimum allocation based sampling (OAS) scheme to discover the most favourable representative data points from every single time-window of each EEG signal considering the minimal variability of the observations. Combining all representative samples of each time-window in a set, some statistical features are extracted from every set of each class. The Mann-Whitney U test is used to assess whether each of the features differs significantly between the two classes (alcoholic and control). In order to evaluate the effectiveness of the OAS-based features, four well-known machine learning methods (decision table, support vector machine (SVM), k-nearest neighbor (k-NN) and logistic regression) are considered for identification of the alcoholic brain state. The experimental results on the UCI KDD (i.e., UCI knowledge discovery in databases) database demonstrate that the OAS-based decision table algorithm yields the highest accuracy of 99.58% with a low false alarm rate of 0.40%, which is an improvement of up to 9.58% over the existing algorithms. The proposed analysis system can be used to detect alcoholism and also to determine the level of alcoholism-related changes in EEG signals.
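As a rough illustration of the feature-screening step mentioned in the abstract above, the following minimal sketch computes the Mann-Whitney U statistic for one feature across two classes. It is not the authors' implementation: the function name `mann_whitney_u` and the feature values are hypothetical, and only the statistic (with midranks for ties) is computed, not the p-value.

```python
def mann_whitney_u(a, b):
    """Return (U_a, U_b) for two samples, using midranks for tied values."""
    # Pool both samples, remembering which group each value came from.
    pooled = sorted((v, g) for g, xs in ((0, a), (1, b)) for v in xs)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j; each gets the average rank.
        midrank = (i + 1 + j) / 2.0
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_a, n_b = len(a), len(b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    return u_a, n_a * n_b - u_a


# Hypothetical values of one statistical feature for each class.
alcoholic = [4.1, 5.3, 6.0, 5.8]
control = [2.2, 3.9, 3.1, 4.0]
print(mann_whitney_u(alcoholic, control))
```

A U value near 0 or near `len(a) * len(b)` suggests the feature separates the two classes well; a significance test would still be needed to turn the statistic into a decision.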
Fine-grained image classification, which aims to distinguish images with subtle distinctions, is a challenging task for two main reasons: lack of sufficient training data for every class and difficulty in learning discriminative features for representation. In this paper, to address the two issues, we propose a two-phase framework for recognizing images from unseen fine-grained classes, i.e., zero-shot fine-grained classification. In the first feature learning phase, we finetune deep convolutional neural networks using hierarchical semantic structure among fine-grained classes to extract discriminative deep visual features. Meanwhile, a domain adaptation structure is induced into deep convolutional neural networks to avoid domain shift from training data to test data. In the second label inference phase, a semantic directed graph is constructed over attributes of fine-grained classes. Based on this graph, we develop a label propagation algorithm to infer the labels of images in the unseen classes. Experimental results on two benchmark datasets demonstrate that our model outperforms the state-of-the-art zero-shot learning models. In addition, the features obtained by our feature learning model also yield significant gains when they are used by other zero-shot learning models, which shows the flexibility of our model in zero-shot fine-grained classification.
Facial emotion recognition is an essential aspect of the field of human-machine interaction. Past research on facial emotion recognition focuses on the laboratory environment. However, it faces many challenges in real-world conditions, such as illumination changes, large pose variations and partial or full occlusions. Those challenges lead to different face areas with different degrees of sharpness and completeness. Inspired by this fact, we focus on the authenticity of predictions generated by different <emotion, region> pairs. For example, if only the mouth areas are available and the emotion classifier predicts happiness, then there is a question of how to judge the authenticity of the prediction. This problem can be converted into the contribution of different face areas to different emotions. In this paper, we divide the whole face into six areas: nose areas, mouth areas, eyes areas, nose to mouth areas, nose to eyes areas and mouth to eyes areas. To obtain more convincing results, our experiments are conducted on three different databases: facial expression recognition + (FER+), real-world affective faces database (RAF-DB) and expression in-the-wild (ExpW) dataset. Through analysis of the classification accuracy, the confusion matrix and the class activation map (CAM), we can establish convincing results. To sum up, the contributions of this paper lie in two areas: 1) We visualize concerned areas of human faces in emotion recognition; 2) We analyze the contribution of different face areas to different emotions in real-world conditions through experimental analysis. Our findings can be combined with findings in psychology to promote the understanding of emotional expressions.
In this paper, a new adaptive hierarchical sliding mode control scheme for a 3D overhead crane system is proposed. A controller is first designed by the use of a hierarchical structure of two first-order sliding surfaces represented by two actuated and un-actuated subsystems in the bridge crane. Parameters of the controller are then intelligently estimated, where uncertain parameters due to disturbances in the 3D overhead crane dynamic model are proposed to be represented by radial basis function networks whose weights are derived from a Lyapunov function. The proposed approach allows the crane system to be robust under uncertainty conditions in which some uncertain and unknown parameters are highly difficult to determine. Moreover, stability of the sliding surfaces is proved to be guaranteed. Effectiveness of the proposed approach is then demonstrated by implementing the algorithm in both synthetic and real-life systems, where the results obtained by our method are highly promising.
This paper proposes an image encryption algorithm LQBPNN (logistic quantum and back propagation neural network) based on chaotic sequences incorporating quantum keys. Firstly, the improved one-dimensional logistic chaotic sequence is used as the basic key sequence. After the quantum key is introduced, it is incorporated into the chaotic sequence by a nonlinear operation. Then the pixel confusion process is completed by the neural network. Finally, two sets of different mixed secret key sequences are used to perform two rounds of diffusion encryption on the confused image. The experimental results show that the randomness and uniformity of the key sequence are effectively enhanced. The algorithm has a secret key space greater than $2^{182}$. The adjacent pixel correlation of the encrypted image is close to 0, and the information entropy is close to 8. The ciphertext image can resist several common attacks such as typical attacks, statistical analysis attacks and differential attacks.
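To make the idea of a chaotic key sequence concrete, here is a minimal sketch that derives key bytes from the standard logistic map x ← r·x·(1 − x) and applies a single XOR diffusion pass. This is an assumption-laden stand-in: the abstract's improved logistic sequence, quantum key and neural-network confusion step are not reproduced, and the names `logistic_keystream` and `xor_pixels`, the seed, and the parameter values are hypothetical.

```python
def logistic_keystream(x0, r, n, burn_in=100):
    """Return n key bytes generated by the standard logistic map."""
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    x = x0
    for _ in range(burn_in):
        # Discard the initial transient so the bytes depend less on x0's scale.
        x = r * x * (1.0 - x)
    key = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        key.append(int(x * 256) % 256)  # quantise the state to one byte
    return key


def xor_pixels(pixels, key):
    """XOR each pixel with the matching key byte (self-inverse)."""
    return [p ^ k for p, k in zip(pixels, key)]


# Hypothetical 8-pixel grayscale row and secret parameters (x0, r).
row = [12, 200, 45, 45, 45, 90, 255, 0]
key = logistic_keystream(x0=0.3141, r=3.99, n=len(row))
cipher = xor_pixels(row, key)
assert xor_pixels(cipher, key) == row  # decryption recovers the row
```

The seed `x0` and parameter `r` play the role of the secret key here; a sketch like this is for illustration only and should not be treated as secure.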
Image registration is an indispensable component in multi-source remote sensing image processing. In this paper, we put forward a remote sensing image registration method that includes an improved multi-scale and multi-direction Harris algorithm and a novel compound feature. Multi-scale circle Gaussian combined invariant moments and a multi-direction gray level co-occurrence matrix are extracted as features for image matching. The proposed algorithm is evaluated on numerous multi-source remote sensor images with noise and illumination changes. Extensive experimental studies show that our proposed method is capable of obtaining a stable and even distribution of key points as well as robust and accurate correspondence matches. It is a promising scheme for multi-source remote sensing image registration.
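For readers unfamiliar with the gray level co-occurrence matrix named above, the sketch below counts co-occurring gray levels along four common directions (0°, 45°, 90°, 135°). It shows only the textbook, non-symmetric, unnormalised matrix; how the paper combines directions into its compound feature is not stated in the abstract, and the names `glcm` and `DIRECTIONS` are hypothetical.

```python
def glcm(image, dx, dy, levels):
    """Count pairs (i, j): a pixel with level i whose neighbour at
    column offset dx and row offset dy has level j."""
    rows, cols = len(image), len(image[0])
    matrix = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


# Offsets (dx, dy) for four directions; rows grow downwards, so "up" is dy = -1.
DIRECTIONS = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}

# A small 4-level test image.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

for angle, (dx, dy) in DIRECTIONS.items():
    print(angle, glcm(image, dx, dy, levels=4))
```

Texture descriptors such as contrast or homogeneity are usually computed from each direction's matrix after normalising it to sum to one.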
The aim of this work is to model and analyze the behavior of a new smart nano force sensor. To do so, a carbon nanotube (CNT) has been used as the suspended gate of a metal-oxide-semiconductor field-effect transistor (MOSFET). A variation of the force applied to the CNT generates a variation of the capacitance of the transistor oxide-gate and therefore a variation of the threshold voltage, which allows the MOSFET to act as a capacitive nano force sensor. The sensitivity of the nano force sensor can reach 0.12431 V/nN. This sensitivity is greater than results reported in the literature. We have found through this study that the response of the sensor depends strongly on the geometric and physical parameters of the CNT. The results obtained in this study show that an increase in the applied force leads to an increase in the threshold voltage $V_{Th}$ of the MOSFET. In this paper, we first used artificial neural networks to faithfully reproduce the response of the nano force sensor model. This neural model is called the direct model. Secondly, we designed an inverse model, called an intelligent sensor, which allows linearization of the response of our developed force sensor.
The objective of this paper is to propose a reduced-order observer for a class of Lipschitz nonlinear discrete-time systems. The conditions that guarantee the existence of this observer are presented in the form of linear matrix inequalities (LMIs). To handle the Lipschitz nonlinearities, the Lipschitz condition and Young's relation are adequately operated to add more degrees of freedom to the proposed LMI. Necessary and sufficient conditions for the existence of the unbiased reduced-order observer are given. An extension to $\mathcal{H}_\infty$ performance analysis is considered in order to deal with $\mathcal{H}_\infty$ asymptotic stability of the estimation error in the presence of disturbances that affect the state of the system. To highlight the effectiveness of the proposed design methodology, three numerical examples are considered. Then, high performance is shown through real-time implementation using the ARDUINO MEGA 2560 device.
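For reference, the two ingredients named in the abstract above are commonly written as follows. These are the standard textbook forms, given here as an assumption; the abstract does not state which variant the paper uses, and the symbols $f$, $\gamma$, $X$, $Y$ and $\epsilon$ are introduced only for illustration.

```latex
% Lipschitz condition on the nonlinearity f, with Lipschitz constant \gamma > 0:
\| f(x) - f(\hat{x}) \| \le \gamma \, \| x - \hat{x} \|

% Young's relation for vectors X, Y of equal dimension and any scalar \epsilon > 0:
2 X^{\top} Y \le \epsilon \, X^{\top} X + \epsilon^{-1} \, Y^{\top} Y

% It follows from expanding the non-negative quantity
\left( \sqrt{\epsilon}\, X - \tfrac{1}{\sqrt{\epsilon}}\, Y \right)^{\top}
\left( \sqrt{\epsilon}\, X - \tfrac{1}{\sqrt{\epsilon}}\, Y \right) \ge 0
```

The free scalar $\epsilon$ is what supplies an extra degree of freedom when the bound is embedded in an LMI.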
This paper investigates the necessity of feasibility considerations in a fault tolerant control system using the constrained control allocation methodology where both static and dynamic actuator constraints are considered. In the proposed feasible control allocation scheme, the constrained model predictive control (MPC) is employed as the main controller. This considers the admissible region of the control allocation problem as its constraints. Using the feasibility notion in the control allocation problem provides the main controller with information regarding the actuator's status, which leads to closed loop system performance improvement. Several simulation examples under normal and faulty conditions are employed to illustrate the effectiveness of the proposed methodology. The main results clearly indicate that closed loop performance and stability characteristics can be significantly degraded by neglecting the actuator constraints in the main controller. Also, it is shown that the proposed strategy substantially enlarges the domain of attraction of the MPC combined with the control allocation as compared to the conventional MPC.
2019,  vol. 16,  no. 4,   pp. 413-426
Single image super-resolution has attracted increasing attention and has a wide range of applications in satellite imaging, medical imaging, computer vision, security surveillance imaging, remote sensing, object detection, and recognition. Recently, deep learning techniques have emerged and blossomed, producing "the state-of-the-art" in many domains. Due to their capability in feature extraction and mapping, they are very helpful for predicting high-frequency details lost in low-resolution images. In this paper, we give an overview of recent advances in deep learning-based models and methods that have been applied to single image super-resolution tasks. We also summarize, compare and discuss various models from the past and present for comprehensive understanding and finally provide open problems and possible directions for future research.
2019,  vol. 16,  no. 4,   pp. 427-436
In this contribution, we present iHEARu-PLAY, an online, multi-player platform for crowdsourced database collection and labelling, including the voice analysis application (VoiLA), a free web-based speech classification tool designed to educate iHEARu-PLAY users about state-of-the-art speech analysis paradigms. In addition, via this associated speech analysis web interface, VoiLA encourages users to take an active role in improving the service by providing labelled speech data. The platform allows users to record and upload voice samples directly from their browser, which are then analysed in a state-of-the-art classification pipeline. A set of pre-trained models targeting a range of speaker states and traits such as gender, valence, arousal, dominance, and 24 different discrete emotions is employed. The analysis results are visualised so that they are easily interpretable by laymen, giving users unique insights into how their voice sounds. We assess the effectiveness of iHEARu-PLAY and its integrated VoiLA feature via a series of user evaluations which indicate that it is fun and easy to use, and that it provides accurate and informative results.
2019,  vol. 16,  no. 4,   pp. 437-448
As a major component of speech signal processing, speech emotion recognition has become increasingly essential to understanding human communication. Benefitting from deep learning, many researchers have proposed various unsupervised models to extract effective emotional features and supervised models to train emotion recognition systems. In this paper, we utilize semi-supervised ladder networks for speech emotion recognition. The model is trained by minimizing the supervised loss and an auxiliary unsupervised cost function. The addition of the unsupervised auxiliary task provides powerful discriminative representations of the input features, and also acts as a regularizer of the supervised emotion task. We also compare the ladder network with other classical autoencoder structures. The experiments were conducted on the interactive emotional dyadic motion capture (IEMOCAP) database, and the results reveal that the proposed method achieves superior performance with a small amount of labelled data and outperforms the other methods.
2019,  vol. 16,  no. 4,   pp. 449-461
Extracting three-dimensional (3D) information, including the location and height of a pedestrian, is important for vision-based intelligent traffic monitoring systems. This paper tackles the relationship between pixels' actual size and pixels' spatial resolution through a new method named pixel-resolution mapping (P-RM). The proposed P-RM method derives the equations for pixels' spatial resolutions (XY-direction) and an object's height (Z-direction) in the real world, while introducing new tilt angle and mounting height calibration methods that do not require special calibration patterns placed in the real world. Both controlled laboratory and real-world experiments were performed and reported. The tests on 3D mensuration using the proposed P-RM method showed overall better than 98.7% accuracy in laboratory environments and better than 96% accuracy in real-world pedestrian height estimations. The 3D reconstructed images for measured points were also determined with the proposed P-RM method, which shows that the proposed method provides a general algorithm for 3D information extraction.
2019,  vol. 16,  no. 4,   pp. 462-474
With the rapid development of the robotic industry, domestic robots have become increasingly popular. As domestic robots are expected to be personal assistants, it is important to develop a natural language-based human-robot interactive system for end-users who do not necessarily have much programming knowledge. To build such a system, we developed an interactive tutoring framework, named "Holert", which can translate task descriptions in natural language to machine-interpretable logical forms automatically. Compared to previous works, Holert allows users to teach the robot by further explaining their intentions in an interactive tutor mode. Furthermore, Holert introduces a semantic dependency model to enable the robot to "understand" similar task descriptions. We have deployed Holert on an open-source robot platform, Turtlebot 2. Experimental results show that the system accuracy could be significantly improved by 163.9% with the support of the tutor mode. This system is also efficient. Even the longest task session with 10 sentences can be handled within 0.7 s.
2019,  vol. 16,  no. 4,   pp. 475-490
This paper presents a novel movement planning algorithm for a guard robot in an indoor environment, imitating the job of a human security guard. A movement planner is employed by the guard robot to continuously observe a certain person. This problem differs from the person following problem, in which the robot continuously follows the target. Instead, the movement planner aims to reduce the robot's movement and energy consumption while keeping the target person within its field of view. The proposed algorithm exploits the topological features of the environment to obtain a set of viewpoint candidates, which is then optimized by solving a cost-based set covering problem. Both the robot and the target person are modeled using a geodesic motion model which considers the shape of the environment. Subsequently, a particle model-based planner is employed, considering the chance constraints over the robot visibility, to choose an optimal action for the robot. Simulation results using a 3D simulator and experiments in a real environment are provided to show the feasibility and effectiveness of our algorithm.
2019,  vol. 16,  no. 4,   pp. 491-510
2019,  vol. 16,  no. 4,   pp. 511-533
This paper presents a novel five degrees of freedom (DOF) two-wheeled robotic machine (TWRM) that delivers solutions for both industrial and service robotic applications by enlarging the vehicle's workspace and increasing its flexibility. Designing a two-wheeled robot with five degrees of freedom poses a significant control challenge; therefore, the modelling and design of such a robot should be precise, with a uniform distribution of mass over the robot and the actuators. By employing the Lagrangian modelling approach, the TWRM's mathematical model is derived and simulated in Matlab/Simulink®. To stabilize the system's highly nonlinear model, two control approaches were developed and implemented: proportional-integral-derivative (PID) and fuzzy logic control (FLC) strategies. The performance of the proposed control strategies has been assessed over multiple scenarios with different initial conditions.
2019,  vol. 16,  no. 4,   pp. 534-542
The convergence analysis of the MaxMin-SOMO algorithm is presented. SOM-based optimization (SOMO) is an optimization algorithm based on the self-organizing map (SOM) that aims to find a winner in the network. Generally, through a competitive learning process, the SOMO algorithm searches for the minimum of an objective function. The MaxMin-SOMO algorithm is a generalization of SOMO with two winners for simultaneously finding two winning neurons, i.e., the first winner stands for the minimum and the second for the maximum of the objective function. More specifically, we prove that the distance between neurons decreases at each iteration and finally converges to zero. The analysis is verified by experimental results.
2019,  vol. 16,  no. 4,   pp. 543-552
The problem of robust stabilization for a class of discrete-time switched large-scale systems with parameter uncertainties and nonlinear interconnected terms is considered. By using state feedback and the Lyapunov function technique, a decentralized switching control approach is put forward to guarantee that the solutions of the large-scale systems converge to the origin globally. A numerical example and a corresponding simulation result are used to verify the effectiveness of the presented approach.
2019,  vol. 16,  no. 4,   pp. 553-563
The purpose of this paper is to propose a synthesis method for a parametric sensitivity constrained linear quadratic (SCLQ) controller for an uncertain linear time invariant (LTI) system. System sensitivity to parameter variation is handled through an additional quadratic trajectory parametric sensitivity term in the standard LQ criterion to be minimized. The main purpose here is to find a suboptimal linear quadratic control that explicitly takes into account the parametric uncertainties. The paper's main contribution is threefold: 1) A descriptor system approach is used to show that the underlying singular linear-quadratic optimal control problem leads to a non-standard Riccati equation. 2) A solution to the proposed control problem is then given based on a connection to the so-called Lur'e matrix equations. 3) A synthesis method for multiple parametric SCLQ controllers is proposed to cover the whole parametric uncertainty while degrading as little as possible the intrinsic robustness properties of each local linear quadratic controller. Some examples are presented in order to illustrate the effectiveness of the approach.
IJAC receives a CiteScore as high as 2.34 in 2018 which is 1.37 times higher than that in 2017. Being in the top 15%, it ranks #69 among 460 journals in respective categories.