|Commenced in January 2007||Frequency: Monthly||Edition: International||Paper Count: 75|
In this work, a training algorithm for probabilistic neural networks (PNN) is presented. The algorithm addresses one of the major drawbacks of PNN, which is the size of the hidden layer in the network. By using a cross-validation training algorithm, the number of hidden neurons is shrunk to a smaller number consisting of the most representative samples of the training set. This is done without affecting the overall architecture of the network. Performance of the network is compared against performance of standard PNN for different databases from the UCI database repository. Results show an important gain in network size and performance.
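For orientation, the sketch below shows the decision rule of a standard PNN, in which every training sample is one hidden neuron. This is what makes the hidden layer grow with the training set. It is a minimal illustration, not the pruning algorithm of the abstract; the function name `pnn_classify` and the smoothing parameter `sigma` are assumptions made for this example.

```python
import math

def pnn_classify(x, train, sigma=0.5):
    """Classify x with a standard PNN (Parzen-window) rule.

    train is a list of (feature_vector, label) pairs; each pair plays the
    role of one hidden neuron. sigma is an assumed smoothing parameter.
    """
    sums, counts = {}, {}
    for features, label in train:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, features))
        # Gaussian kernel response of the hidden neuron for this sample.
        sums[label] = sums.get(label, 0.0) + math.exp(-sq_dist / (2 * sigma ** 2))
        counts[label] = counts.get(label, 0) + 1
    # The summation layer averages per class; the output picks the maximum.
    return max(sums, key=lambda c: sums[c] / counts[c])
```

Shrinking the hidden layer then amounts to passing a smaller, representative `train` list to the same rule, which leaves the architecture unchanged.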
High blood sugar, which can progress to diabetes mellitus, is becoming a rampant cardiovascular disorder in our community. Lack of awareness among most people makes this disease a silent killer. The situation is urgent, hence the need to design a monitoring device, such as a wrist watch, that alerts those living with high blood glucose to the danger ahead of time and introduces a mechanism for checks and balances. The neural network architecture assumed an 8-15-10 configuration: eight neurons at the input stage, including a bias; 15 neurons in the hidden layer at the processing stage; and 10 neurons at the output stage, indicating likely symptom cases. The inputs are formed using the exclusive OR (XOR), with the expectation of getting an XOR output as the threshold value for diabetic symptom cases. The neural algorithm is coded in Java and run for 1000 epochs to reduce the errors to a minimum. The internal circuitry of the device comprises compatible hardware matched to the nature of each input neuron. Red, green, and yellow light-emitting diodes (LEDs) serve as the output of the neural network, showing pattern recognition for severe cases, pre-hypertensive cases, and normal cases without traces of diabetes mellitus. The research concluded that a neural network is a more efficient Accu-Chek design tool for monitoring high glucose levels than the conventional methods of carrying out blood tests.
Clustering is a data mining technique and a subject of active research; it assists in biological pattern recognition and in the extraction of new knowledge from raw data. Clustering means partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar to one another and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to decompose the dataset directly into a set of disjoint clusters, yielding an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers, and a poor choice may lead to a local optimum that is far inferior to the global optimum, we propose a strategy for initializing the K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results show that efficiency is significantly improved.
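The abstract does not say which initialization strategy is proposed. As a hedged illustration of why initialization matters, the sketch below uses farthest-first seeding, a well-known alternative to random seeding that spreads the initial centers apart. The function name `farthest_first_centers` is an assumption of this example, and the technique is a stand-in, not the authors' method.

```python
def farthest_first_centers(points, k):
    """Pick k initial centers by farthest-first traversal.

    Starts from the first point, then repeatedly adds the point whose
    squared distance to its nearest chosen center is largest.
    """
    centers = [points[0]]

    def dist_to_nearest(p):
        return min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)

    while len(centers) < k:
        centers.append(max(points, key=dist_to_nearest))
    return centers
```

The returned centers can be handed to any K-Means implementation in place of randomly drawn ones.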
This research applies a neural network with pattern recognition to study fluid dynamics and predict the properties of groundwater reservoirs. Geophysical surveys using manual methods have failed in basement environments, hence the need for intelligent computing such as neural network prediction. A non-linear neural network with an XOR (exclusive OR) output in an 8-bit configuration is used to predict the nature of groundwater reservoirs and the fluid dynamics of a typical crystalline basement rock. The control variables are the apparent resistivity of the weathered layer (p1), the apparent resistivity of the fractured layer (p2), and the depth (h), while the dependent variable is the flow parameter (F=λ). The network was trained with back-propagation, coded in C++, for 300 epochs. The neural network was able to map out the flow channels and detect how they behave to form viable storage within the strata. The neural network model showed that an important variable, gr (gravitational resistance), can be deduced from the elevation and the apparent resistivity pa. The model results from SPSS showed that the coefficients a, b and c are statistically significant with reduced standard error at 5%.
PM-10 has recently become a social and global issue. It is one of the major air pollutants that affect human health, so it needs to be forecast rapidly and precisely. However, PM-10 comes from various emission sources, and its concentration depends largely on local and global meteorological and geographical factors, which makes forecasting very difficult. A neural network model can be used in this case, but there are few cases of high PM-10 concentration, which makes the model difficult to train. In this paper, we suggest a simple input balancing method for unevenly distributed data, based on the probability of appearance of the data. Experimental results show that input balancing makes the neural network easier to train and improves the forecasting rates.
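One plausible reading of balancing "based on the probability of appearance" is to replicate samples from rare concentration ranges in inverse proportion to how often that range appears. The sketch below illustrates that reading only; the function `balance_by_frequency`, the `bin_of` callback and the rounding rule are assumptions, not details taken from the paper.

```python
from collections import Counter

def balance_by_frequency(samples, bin_of):
    """Replicate samples so that rare bins appear about as often as common ones.

    bin_of maps a sample to its bin (for example 'low' or 'high' PM-10).
    Each sample is repeated round(largest_bin_count / own_bin_count) times.
    """
    counts = Counter(bin_of(s) for s in samples)
    largest = max(counts.values())
    balanced = []
    for s in samples:
        repeats = round(largest / counts[bin_of(s)])
        balanced.extend([s] * repeats)
    return balanced
```

For example, with four low readings and two high readings, each high reading is duplicated once so that both bins contribute four training samples.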
The increasing amount of collected data has limited the performance of current analysis algorithms. Developing new algorithms that are cost-effective in terms of complexity, scalability, and accuracy has therefore attracted significant interest. In this paper, a modified k-means based algorithm is developed and tested. The new algorithm aims to reduce the computational load without significantly affecting the quality of the clusterings. It uses the City Block distance and a new stop criterion to guarantee convergence. Experiments on a real data set show its high performance compared with the original k-means version.
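As a rough illustration, the sketch below runs a k-means style loop with the City Block (Manhattan) distance and stops when the assignments no longer change. The abstract does not describe the paper's stop criterion or its center update, so both are assumptions here: the coordinate-wise median update in particular is the usual k-medians choice for this distance.

```python
from statistics import median

def city_block(p, q):
    """City Block (Manhattan) distance between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))

def kmeans_city_block(points, initial_centers, max_iter=100):
    """Cluster points with City Block distance; stop when assignments settle."""
    centers = list(initial_centers)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(centers)), key=lambda j: city_block(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:  # assumed stop criterion
            break
        assignment = new_assignment
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(median(dim) for dim in zip(*members))
    return centers, assignment
```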
Clustering is a well-known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can contain either categorical or numeric data, and each type of data has its own clustering algorithm. In this context, two algorithms have been proposed: k-means for clustering numeric datasets and k-modes for categorical datasets. A main problem encountered in data mining applications is clustering categorical data, which are common in real datasets. One way to cluster categorical values is to transform the categorical attributes into numeric measures and apply the k-means algorithm directly instead of k-modes. In this paper, we test an approach of this kind that transforms the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previous method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are evaluated. The results show that our proposed method outperforms the binary method in all cases.
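The relative-frequency transformation can be sketched as follows: each categorical value is replaced by the fraction of rows in which that value appears in its attribute. This is a minimal reading of the description above; the function name `relative_frequency_encode` and the row-of-tuples data layout are assumptions of the example.

```python
from collections import Counter

def relative_frequency_encode(rows):
    """Replace each categorical value by its relative frequency in its column."""
    columns = list(zip(*rows))
    encoded_columns = []
    for col in columns:
        counts = Counter(col)
        encoded_columns.append([counts[v] / len(col) for v in col])
    # Transpose back so the result has one numeric row per input row.
    return [list(r) for r in zip(*encoded_columns)]
```

The numeric rows can then be passed to an ordinary k-means implementation. Note that two different modalities with the same frequency map to the same number, which is a limitation of this simple reading.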
The use of decision support systems in agriculture may help to monitor large fields of crops by automatically detecting the symptoms of foliage diseases. In our work, we designed and implemented a decision support system for small tomato producers. This work investigates ways to recognize late blight disease from the analysis of digital images of tomatoes, using a pair of multilayer perceptron neural networks. The network outputs are used to generate repainted tomato images in which the injuries on the plant are highlighted, and to calculate the damage level of each plant. Those levels are then used to construct a situation map of a farm, where a cellular automaton simulates the evolution of the outbreak over the fields. The simulator can test the actions of different pesticides, helping to decide when to start spraying and to analyze the losses and gains of each choice of action.
Foliage diseases in plants can reduce both the quality and the quantity of agricultural production. Intelligent detection of plant diseases is an essential research topic, as it may help to monitor large fields of crops by automatically detecting the symptoms of foliage diseases. This work investigates ways to recognize late blight disease from the analysis of digital images of tomatoes collected directly from the field. A pair of multilayer perceptron neural networks analyzes the digital images, using data from both the RGB and HSL color models, and classifies each image pixel. One neural network is responsible for identifying healthy regions of the tomato leaf, while the other identifies the injured regions. The outputs of both networks are combined to generate the final classification of each pixel, and the pixel classes are used to repaint the original tomato images with a color representation that highlights the injuries on the plant. The new images have only green, red or black pixels, according to whether they came from healthy portions of the leaf, injured portions of the leaf, or the background of the image, respectively. The system achieved an accuracy of 97% in detecting and estimating the level of damage caused by late blight on the tomato leaves.
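The repainting step can be illustrated with a small sketch. It assumes each network emits a score between 0 and 1 per pixel and that a pixel is treated as background when neither score reaches a threshold. The abstract does not give the combination rule, so the rule, the `threshold` value and the function names are all assumptions.

```python
GREEN, RED, BLACK = (0, 255, 0), (255, 0, 0), (0, 0, 0)

def repaint(healthy_scores, injured_scores, threshold=0.5):
    """Combine per-pixel scores of the two networks into green/red/black pixels."""
    painted = []
    for healthy, injured in zip(healthy_scores, injured_scores):
        if max(healthy, injured) < threshold:
            painted.append(BLACK)   # neither network responds: background
        elif injured >= healthy:
            painted.append(RED)     # injured region
        else:
            painted.append(GREEN)   # healthy region
    return painted

def damage_level(painted):
    """Fraction of leaf pixels (non-background) that are painted as injured."""
    leaf = [p for p in painted if p != BLACK]
    return sum(p == RED for p in leaf) / len(leaf) if leaf else 0.0
```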
Electroencephalography (EEG) is the study of the electrical signals produced by the neural activity of the human brain. In this paper, we propose an automatic and efficient EEG signal classification approach that classifies the EEG signal into two classes: epileptic seizure or not. We start by extracting features with the Discrete Wavelet Transform (DWT), which decomposes the EEG signals into sub-bands. These features, extracted from the detail and approximation coefficients of the DWT sub-bands, are used as input to Principal Component Analysis (PCA). Classification is based on reducing the feature dimension using PCA and deriving the support vectors using a Support Vector Machine (SVM). The experiments are performed on a real, standard dataset. A very high level of classification accuracy is obtained.
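The first stage of this pipeline, sub-band feature extraction, can be sketched without external libraries. The example assumes the Haar wavelet and uses the mean absolute coefficient of each sub-band as the feature; the abstract specifies neither choice. The PCA and SVM stages are omitted.

```python
import math

def haar_dwt(signal):
    """One level of the Haar DWT; the signal length must be even."""
    s = math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def subband_features(signal, levels=2):
    """Mean absolute value of each detail band, then of the final approximation.

    The signal length must be divisible by 2 ** levels.
    """
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        features.append(sum(abs(d) for d in detail) / len(detail))
    features.append(sum(abs(a) for a in approx) / len(approx))
    return features
```

A constant signal yields zero detail features, while a rapidly alternating one concentrates its energy in the first detail band.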
This work concerns online Arabic character recognition; the principal motivation is to study Arabic handwriting with online technology.
The system is a Markovian system, which can be viewed as a Dynamic Bayesian Network (DBN). One of the major attractions of such systems is that the complete model (topology and parameters) can be learned from training data.
Our approach is based on the dynamic Bayesian network formalism. DBN theory generalizes Bayesian networks to dynamic processes. One of our objectives is to find better parameters, which represent the links (dependencies) between the variables of the dynamic network.
In pattern recognition applications, the structure is usually fixed in advance, which obliges us to accept some strong assumptions (for example, independence between some variables). Our application concerns the online recognition of isolated Arabic characters using our laboratory database, NOUN. A neural tester is proposed for external optimization of the DBN.
The DBN scores and DBN mixed recognition rates are 70.24% and 62.50% respectively, which points to their further development; other approaches that take time into account were considered and implemented until a significant recognition rate of 94.79% was obtained.
As big data analysis becomes more important, yield prediction using data from the semiconductor process is essential. In general, yield prediction and analysis of the causes of failure are closely related. The purpose of this study is to analyze patterns that affect the final test results using die map based clustering. Much research has been conducted using die data from the semiconductor test process. However, such analysis is limited because the test data is less directly related to the final test results. Therefore, this study proposes a framework for analysis through clustering that uses more detailed data than existing die data. The study consists of three phases. In the first phase, a die map is created from the fail bit data in each sub-area of the die. In the second phase, clustering is performed on the map data. In the third phase, patterns that affect the final test result are identified. Finally, the three phases are applied to actual industrial data, and the experimental results show the potential for field application.
In pattern clustering, nearest neighbor computation is a challenging issue for many applications in research areas such as Remote Sensing, Computer Vision, Pattern Recognition and Statistical Imaging. Nearest neighbor computation is essential for classifying the volume of pixels (voxels) well enough to localize the active regions of interest (AROI). It is also needed to compute spatial metric relationships between diverse imaging areas in pattern recognition applications. In this paper, we propose a new methodology for finding the nearest neighbor point, which builds a virtual grid of hexagonal cells and then locates every point within them. An algorithm is suggested for minimizing the computation and improving the turnaround time of the process. The nearest neighbors of a query point Φ are fetched by searching hexagon by hexagon, and the search is repeated until a candidate in the AROI of Φ is found. If a point Υ is located, searching continues in the nearest hexagons in a circular way. The first hexagon is considered to be level 0 (L0), and the surrounding hexagons are level 1 (L1). If Υ is located in L1, the search continues in the next level (L2) to ensure that Υ is the nearest neighbor of Φ. Based on the experimental results, we found that the proposed method has an advantage over traditional methods in reducing the time complexity of searching for neighbors, which in turn improves the efficiency of classification.
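The level-by-level search can be sketched as follows. The example buckets points into a pointy-top hexagonal grid using axial coordinates, searches rings of cells outward from the query cell, and examines one further ring after the first hit, as the abstract describes. The coordinate convention, the cell size and all function names are assumptions; the one-extra-ring rule is a heuristic taken from the description, and this sketch does not prove that it always returns the exact nearest neighbor.

```python
import math

def hex_cell(point, size):
    """Axial (q, r) coordinates of the pointy-top hexagon containing point."""
    x, y = point
    cx = (math.sqrt(3) / 3 * x - y / 3) / size
    cz = (2 / 3 * y) / size
    cy = -cx - cz
    # Cube rounding: round each coordinate, then recompute the one with the
    # largest rounding error so that the three still sum to zero.
    rx, ry, rz = round(cx), round(cy), round(cz)
    dx, dy, dz = abs(rx - cx), abs(ry - cy), abs(rz - cz)
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return rx, rz

def nearest_neighbor(query, points, size=1.0, max_level=50):
    """Search hexagon rings L0, L1, ... and stop one ring after the first hit."""
    grid = {}
    for p in points:
        grid.setdefault(hex_cell(p, size), []).append(p)
    q0, r0 = hex_cell(query, size)
    best, found_level = None, None
    for level in range(max_level + 1):
        if found_level is not None and level > found_level + 1:
            break
        for dq in range(-level, level + 1):
            for dr in range(-level, level + 1):
                # Keep only cells whose hexagonal distance equals this level.
                if (abs(dq) + abs(dr) + abs(dq + dr)) // 2 != level:
                    continue
                for p in grid.get((q0 + dq, r0 + dr), []):
                    if best is None or math.dist(query, p) < math.dist(query, best):
                        best = p
                    if found_level is None:
                        found_level = level
    return best
```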
Many approaches to pattern recognition are founded on probability theory and can be broadly characterized as either generative or discriminative, according to whether or not the distribution of the image features is modeled. Generative and discriminative models have very different characteristics, as well as complementary strengths and weaknesses. In this paper, we study these models for recognizing patterns of alphabet characters (A-Z) and numbers (0-9). To handle isolated patterns, a generative model, the Hidden Markov Model (HMM), and discriminative models, namely the Conditional Random Field (CRF), the Hidden Conditional Random Field (HCRF) and the Latent-Dynamic Conditional Random Field (LDCRF), are applied to the extracted pattern features with different window sizes. The gesture recognition rate improves initially as the window size increases, but degrades as the window size increases further. Experimental results show that LDCRF gives better results than CRF, HCRF and HMM at a window size of 4. Additionally, our results show overall recognition rates of 91.52%, 95.28%, 96.94% and 98.05% for CRF, HCRF, HMM and LDCRF respectively.
Pattern discovery from time series is of fundamental importance. Particularly, when information about the structure of a pattern is not complete, an algorithm to discover specific patterns or shapes automatically from the time series data is necessary. The dynamic time warping is a technique that allows local flexibility in aligning time series. Because of this, it is widely used in many fields such as science, medicine, industry, finance and others. However, a major problem of the dynamic time warping is that it is not able to work with structural changes of a pattern. This problem arises when the structure is influenced by noise, which is a common thing in practice for almost every application. This paper addresses this problem by means of developing a novel technique called adaptive dynamic time warping.
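For reference, the classic dynamic time warping distance that the abstract builds on can be computed with a short dynamic program. The sketch below shows only this baseline, with absolute difference as the local cost; it does not implement the adaptive variant, whose details the abstract does not give.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences."""
    inf = float("inf")
    # cost[i][j] is the best alignment cost of a[:i] with b[:j].
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also aligns with the previous b element
                cost[i][j - 1],      # b[j-1] also aligns with the previous a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[len(a)][len(b)]
```

The local flexibility is visible in a small case: `[1, 2, 3]` and `[1, 2, 2, 3]` align at zero cost because the repeated 2 is absorbed by the warping path.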
Control chart pattern recognition is one of the most important tools for identifying the process state in statistical process control. An abnormal process state can be classified by recognizing unnatural patterns that arise from assignable causes. In this study, a wavelet based neural network approach is proposed for recognizing control chart patterns that have various characteristics. The proposed control chart pattern recognizer comprises three stages. First, multi-resolution wavelet analysis is used to generate time-shape and time-frequency coefficients that carry detailed information about the patterns. Second, distance based features are extracted by a bi-directional Kohonen network to obtain reduced and robust information. Third, a back-propagation network classifier is trained on these features. The accuracy of the proposed method is shown by a performance evaluation with numerical results.
The use of neural networks for recognition applications is generally constrained by the inflexibility of their parameters after the training phase: no adaptation is made for input variations that influence the network parameters. This work attempts to design a neural network with an additional mechanism that adjusts the threshold values according to variations in the input pattern. The new approach splits the whole network into two subnets: a traditional main net and a supportive net. The first produces the required output for trained patterns with predefined settings, while the second generates output dynamically, with tuning capability for any newly applied input. This tuning takes the form of an adjustment to the threshold values. Two forms of supportive net were studied: one implements an additional extended layer with an adjustable neuronal threshold setting mechanism, while the other implements an auxiliary net with traditional architecture that dynamically adjusts the threshold value of the main net, which is built with a dual-layer architecture. Experimental results and analysis of the proposed designs were quite satisfactory. The supportive layer approach achieved a recognition rate of over 90%, while the multiple network technique showed a more effective and acceptable level of recognition. However, this is achieved at the price of network complexity and computation time. Recognition generalization may also be improved by accommodating capabilities involving all the innate structures in conjunction with intelligence abilities, which would require further advanced learning phases.
In this paper, a non-parametric statistical pattern recognition algorithm for the problem of credit scoring is presented. The proposed algorithm is based on the k-means clustering algorithm and allows subclasses of homogeneous elements in the data to be determined. The algorithm is tested on two benchmark datasets, and its performance is compared with that of other well-known pattern recognition algorithms for credit scoring.