Performance Analysis of Artificial Neural Network Based Land Cover Classification
Land cover classification of remotely sensed multispectral imagery using automated techniques is a promising area of research. Land conditions at different times are captured by satellite and monitored by applying different classification algorithms in a specific environment. In this paper, a SPOT-5 image provided by SUPARCO is studied and classified in the Environment for Visualizing Images (ENVI), a tool widely used in remote sensing. An Artificial Neural Network (ANN) classification technique is then used to detect land cover changes in Abbottabad district. The results are compared with those of a pixel-based Mahalanobis Distance classifier. They show that ANN gives the better overall accuracy, 99.20%, and a Kappa coefficient of 0.98.
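The two figures reported above, overall accuracy and the Kappa coefficient, are both derived from a confusion matrix. The sketch below shows the standard computation on a small made-up matrix; the function name and the example counts are illustrative assumptions and are not taken from the paper.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of true class i that were
    assigned to class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: computed from the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class matrix (not the paper's data).
example = [[45, 5],
           [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(example)
print(f"overall accuracy = {accuracy:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement that would be expected by chance, which is why it is usually lower than the overall accuracy for the same matrix.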
Job Shop Scheduling: Classification, Constraints and Objective Functions
The job-shop scheduling problem (JSSP) is an important decision problem in industry, economics and management. It is a combinatorial optimization problem known to be NP-hard. JSSPs deal with a set of machines and a set of jobs with various predetermined routes through the machines, where the objective is to assemble a schedule of jobs that minimizes criteria such as makespan, maximum lateness, and total weighted tardiness. Over the past several decades, interest in meta-heuristic approaches to JSSPs has increased because these approaches can generate better solutions than heuristics alone. This article reviews the classifications, constraints and objective functions of JSSPs available in the literature.
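To make the makespan objective concrete, the following minimal sketch evaluates one schedule for a tiny instance. It assumes an operation-based encoding in which each occurrence of a job index dispatches that job's next operation; the encoding, the function name and the instance are illustrative assumptions, not something prescribed by the article.

```python
def makespan(jobs, order):
    """Completion time of the last operation for a given dispatch order.

    jobs[j] is the route of job j: a list of (machine, duration) pairs that
    must be processed in sequence. `order` lists job indices; each occurrence
    schedules that job's next operation as early as possible.
    """
    next_op = [0] * len(jobs)      # index of the next unscheduled operation per job
    job_ready = [0] * len(jobs)    # time at which each job's previous operation ends
    machine_ready = {}             # time at which each machine becomes free
    for j in order:
        machine, duration = jobs[j][next_op[j]]
        start = max(job_ready[j], machine_ready.get(machine, 0))
        finish = start + duration
        job_ready[j] = finish
        machine_ready[machine] = finish
        next_op[j] += 1
    return max(job_ready)


# Hypothetical 2-job, 2-machine instance.
jobs = [
    [(0, 3), (1, 2)],  # job 0: machine 0 for 3 time units, then machine 1 for 2
    [(1, 4), (0, 1)],  # job 1: machine 1 for 4 time units, then machine 0 for 1
]
print(makespan(jobs, [0, 1, 0, 1]))  # interleaved dispatch
print(makespan(jobs, [0, 0, 1, 1]))  # job 0 completely before job 1
```

A meta-heuristic would search over such dispatch orders and use the returned value (or lateness or tardiness computed the same way) as its fitness.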
Lean Models Classification: Towards a Holistic View
The purpose of this paper is to present a classification of Lean models that aims to capture all the concepts related to this approach and thus facilitate its implementation. This classification allows the most relevant models to be identified along several dimensions. From this perspective, we review and analyze the literature on Lean models and propose dimensions for classifying the current proposals that respect, among others, the axes of the Lean approach, the maturity of the models, and their application domains. The classification led us to conclude that researchers essentially treat the Lean approach as a toolbox and design their models to solve problems specific to one environment. Since the Lean approach is no longer intended only for the automotive sector where it was invented, but for all fields (IT, hospitals, ...), we consider that it requires a generic model that can be implemented in all areas.
Pose Normalization Network for Object Classification
Convolutional Neural Networks (CNN) have
demonstrated their effectiveness in synthesizing 3D views of object
instances at various viewpoints. Given the problem where one
has limited viewpoints of a particular object for classification, we
present a pose normalization architecture to transform the object to
existing viewpoints in the training dataset before classification to
yield better classification performance. We have demonstrated that
this Pose Normalization Network (PNN) can capture the style of
the target object and is able to re-render it to a desired viewpoint.
Moreover, we have shown that the PNN improves the classification
result for the 3D chairs dataset and ShapeNet airplanes dataset
when given only images at limited viewpoint, as compared to a
Data Quality Enhancement with String Length Distribution
Recently, the amount of collectable manufacturing data has been
increasing rapidly. At the same time, mega recalls are becoming
a serious social problem. Under such circumstances, there is an
increasing need to prevent mega recalls through defect analysis, such
as root cause analysis and anomaly detection, using manufacturing
data. However, classifying strings in manufacturing data with
traditional methods takes too long to meet the requirement of quick
defect analysis. Therefore, we present the String Length Distribution
Classification method (SLDC) to correctly classify strings in a short
time. This method learns character features, especially the string
length distribution, from Product IDs and Machine IDs in the BOM and
asset list. By applying the proposal to strings in actual manufacturing
data, we verified that the classification time of strings can be
reduced by 80%. As a result, we expect that the requirement of quick
defect analysis can be fulfilled.
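The abstract does not describe SLDC in detail, so the following is only a rough sketch of the general idea of classifying a string by its length distribution: learn a length histogram per class from known examples and pick the class under which the new string's length is most likely. The smoothing scheme, the `max_length` bound and the sample IDs are all assumptions made for illustration.

```python
from collections import Counter


def learn_length_distributions(samples_by_class):
    """Count how often each string length occurs per class."""
    return {label: Counter(len(s) for s in strings)
            for label, strings in samples_by_class.items()}


def classify_by_length(value, distributions, smoothing=1.0, max_length=64):
    """Return the class whose smoothed length distribution best explains `value`.

    Laplace smoothing keeps unseen lengths from getting zero probability;
    max_length is an assumed upper bound on the number of distinct lengths.
    """
    best_label, best_score = None, -1.0
    for label, counts in distributions.items():
        total = sum(counts.values())
        score = (counts[len(value)] + smoothing) / (total + smoothing * max_length)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training strings; real BOM and asset-list formats will differ.
training = {
    "product_id": ["P-000123", "P-004567", "P-998877"],
    "machine_id": ["M01", "M17", "M42"],
}
model = learn_length_distributions(training)
print(classify_by_length("P-555555", model))
print(classify_by_length("M99", model))
```

Because only the length is inspected, classification is a dictionary lookup per class, which is consistent with the short classification time the abstract emphasizes.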
An Approach for Vocal Register Recognition Based on Spectral Analysis of Singing
Recognizing and controlling vocal registers during
singing is a difficult task for beginner vocalists. It requires, among
other things, identifying which part of the natural resonators is being
used when a sound propagates through the body. Thus, an application
has been designed allowing for sound recording, automatic vocal
register recognition (VRR), and a graphical user interface providing
real-time visualization of the signal and recognition results. Six
spectral features are determined for each time frame and passed to the
support vector machine classifier yielding a binary decision on the
head or chest register assignment of the segment. The classification
training and testing data have been recorded by ten professional
female singers (soprano, aged 19-29) performing sounds for both
chest and head register. The classification accuracy exceeded 93%
in each of various validation schemes. Apart from a hard two-class
decision, the support vector classifier also returns information on
the distance between a particular feature vector and the discrimination
hyperplane in the feature space. Such information reflects the level
of certainty of the vocal register classification in a fuzzy way. Thus,
the designed recognition and training application is able to assess and
visualize the continuous trend in singing in a user-friendly graphical
mode providing an easy way to control the vocal emission.
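The idea of using the distance to the discrimination hyperplane as a soft certainty can be sketched as follows for a linear decision function. The weights, the bias, the two-dimensional feature vectors (the paper uses six spectral features), the sign convention (positive side = head register) and the logistic squashing are all assumptions made for illustration; the abstract does not state how the distance is mapped to a certainty level.

```python
import math


def signed_distance(features, weights, bias):
    """Signed Euclidean distance from a feature vector to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(w * w for w in weights))
    return (sum(w * x for w, x in zip(weights, features)) + bias) / norm


def register_certainty(features, weights, bias, steepness=1.0):
    """Return (hard label, certainty in [0.5, 1)) for one time frame.

    The hard label is the side of the hyperplane; the certainty grows with
    the distance from it through a logistic function (an assumed mapping).
    """
    d = signed_distance(features, weights, bias)
    label = "head" if d >= 0 else "chest"
    certainty = 1.0 / (1.0 + math.exp(-steepness * abs(d)))
    return label, certainty


# Hypothetical hyperplane parameters, not trained on real singing data.
weights, bias = [3.0, 4.0], -5.0
for frame in ([3.0, 4.0], [1.0, 0.5], [0.0, 0.0]):
    print(frame, register_certainty(frame, weights, bias))
```

A frame lying on the hyperplane gets certainty 0.5, so plotting the certainty over time gives the kind of continuous trend that a training application can visualize.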
Diagnosis of Diabetes Using Computer Methods: Soft Computing Methods for Diabetes Detection Using Iris
Complementary and Alternative Medicine (CAM) techniques are quite popular and effective for chronic diseases. Iridology is a CAM technique more than 150 years old that analyzes the patterns, tissue weakness, color, shape, structure, etc. of the iris for disease diagnosis. The objective of this paper is to validate the use of iridology for the diagnosis of diabetes. The suggested model was applied to a systemic disease with ocular effects. Data from 200 subjects, 100 diabetic and 100 non-diabetic, were evaluated. The complete procedure was kept very simple and free from the involvement of any iridologist. From the normalized iris, the region of interest was cropped. In all, 63 features were extracted using statistical analysis, texture analysis, and the two-dimensional discrete wavelet transform. The accuracies of six different classifiers are compared. The results show 89.66% accuracy for the random forest classifier.
Neuro-Fuzzy Based Model for Phrase Level Emotion Understanding
The present approach deals with the identification of emotions and the classification of emotional patterns at phrase level with respect to positive and negative orientation. It considers emotion-triggered terms, their co-occurring terms, and the associated sentences for recognizing emotions, and uses Part of Speech Tagging and Emotion Actifiers for classification. Sentence patterns are broken into phrases, and a Neuro-Fuzzy model is used to classify them, which results in 16 patterns of emotional phrases. Suitable intensities are assigned to capture the degree of emotional content in the semantics of the patterns. The emotional phrases are assigned weights, which support the decision on the positive or negative orientation of emotions. The approach uses web documents for experiments; the proposed classification approach performs well and achieves good F-scores.
Air Classification of Dust from Steel Converter Secondary De-dusting for Zinc Enrichment
The off-gas from the basic oxygen furnace (BOF), where pig iron is converted into steel, is treated in the primary ventilation system. This system is in full operation only during oxygen blowing, when the BOF converter vessel is in a vertical position. When pig iron and scrap are charged into the BOF and when slag or steel is tapped, the vessel is tilted. The emissions generated during charging and tapping cannot be captured by the primary off-gas system. To capture these emissions, a secondary ventilation system is usually installed. The emissions are captured by a canopy hood installed just above the converter mouth in the tilted position. The aim of this study was to investigate the dependence of the content of Zn and other components on the particle size of BOF secondary ventilation dust. Because of the high temperature of the BOF process, Zn can be expected to be enriched in the fine dust fractions. If so, classification could be applied to split the dust into two size fractions with different Zn content. For this purpose, air classification experiments with dust from the secondary ventilation system of a BOF were performed. The results show that Zn and Pb are highly enriched in the finest dust fraction. For Cd, Cu and Sb the enrichment is less pronounced. In contrast, the non-volatile metals Al, Fe, Mn and Ti were depleted in the fine fractions. Thus, air classification could be considered for the treatment of dust from secondary BOF off-gas cleaning.
Application of Data Mining Techniques for Tourism Knowledge Discovery
Five implementations of three data mining classification techniques were applied to extract important insights from tourism data. The aim was to find the best-performing algorithm among them for tourism knowledge discovery. The knowledge discovery process from data was used as the process model, and 10-fold cross-validation was used for testing. Various data preprocessing activities were performed to obtain the final dataset for model building. Classification models of the selected algorithms were built under different scenarios on the preprocessed dataset. The best-performing algorithm on the tourism dataset was Random Forest (76%) before information gain based attribute selection, and J48 (C4.5) (75%) after selection of the attributes most relevant to the class (target) attribute. In terms of model-building time, attribute selection improved the efficiency of all algorithms; the Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented; they show intricate, non-trivial knowledge that would not be discovered by simple statistical analysis, even though the accuracy of the classification algorithms is mediocre.
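Information gain based attribute selection ranks each attribute by how much knowing its value reduces the entropy of the class attribute. The sketch below shows the standard computation for categorical attributes; the attribute names and the toy records are hypothetical and do not come from the tourism dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(attribute_values, labels):
    """Entropy of the class minus its expected entropy after splitting on the attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / total * entropy(group)
                    for group in groups.values())
    return entropy(labels) - remainder


# Hypothetical records: does the visitor return ("yes"/"no")?
returns = ["yes", "yes", "no", "no"]
attributes = {
    "season": ["summer", "summer", "winter", "winter"],  # separates the classes perfectly
    "transport": ["air", "road", "air", "road"],         # carries no class information
}
ranking = sorted(attributes,
                 key=lambda name: information_gain(attributes[name], returns),
                 reverse=True)
print(ranking)
```

Keeping only the top-ranked attributes shrinks the input to every classifier, which is the mechanism behind the reduction in model-building time mentioned above.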
Design and Implementation of a Counting and Differentiation System for Vehicles through Video Processing
This paper presents a self-sustaining mobile system for
counting and classification of vehicles through processing video. It
proposes a counting and classification algorithm divided in four steps
that can be executed multiple times in parallel in a SBC (Single
Board Computer), like the Raspberry Pi 2, in such a way that it
can be implemented in real time. The first step of the proposed
algorithm limits the zone of the image that will be processed.
The second step performs the detection of the mobile objects using
a BGS (Background Subtraction) algorithm based on the GMM
(Gaussian Mixture Model), as well as a shadow removal algorithm
using physical-based features, followed by morphological operations.
In the third step, vehicle detection is performed using
edge detection algorithms, and vehicle tracking using Kalman
filters. The last step of the proposed algorithm registers the vehicle
passing and performs their classification according to their areas.
An auto-sustainable system is proposed, powered by batteries and
photovoltaic solar panels, and the data transmission is done through
GPRS (General Packet Radio Service), eliminating the need for
external cables, which facilitates its deployment and relocation to
any location where it could operate. The self-sustaining trailer will
allow the counting and classification of vehicles in specific zones
with difficult access.
Predicting Groundwater Areas Using Data Mining Techniques: Groundwater in Jordan as Case Study
Data mining is the process of extracting useful or hidden information from a large database. The extracted information can be used to discover relationships among features, where data objects are grouped according to logical relationships, or to assign unseen objects to one of the predefined groups. In this paper, we investigate four well-known data mining algorithms for predicting groundwater areas in Jordan. These algorithms are Support Vector Machines (SVMs), Naïve Bayes (NB), K-Nearest Neighbor (kNN) and Classification Based on Association Rule (CBA). The experimental results indicate that the SVMs algorithm outperformed the other algorithms in terms of classification accuracy, precision and F1 evaluation measures on the datasets of groundwater areas collected from the Jordanian Ministry of Water and Irrigation.
Sparse Coding Based Classification of Electrocardiography Signals Using Data-Driven Complete Dictionary Learning
In this paper, a data-driven dictionary approach is proposed for the automatic detection and classification of cardiovascular abnormalities. The electrocardiography (ECG) signal is represented by trained complete dictionaries that contain prototypes, or atoms, to avoid the limitations of pre-defined dictionaries. The data-driven trained dictionaries simply take the ECG signal as input rather than extracting features to study the set of parameters that yield the most descriptive dictionary. The approach inherently learns the complicated morphological changes in the ECG waveform, which are then used to improve the classification. The classification performance was evaluated with ECG data under two different preprocessing settings. In the first, the QT database is corrected for baseline drift and a notch filter removes the 60 Hz power line noise. In the second, the data are further filtered using a fast moving average smoother. The experimental results on the QT database confirm that our proposed algorithm achieves a classification accuracy of 92%.
Satellite Imagery Classification Based on Deep Convolution Network
Satellite imagery classification is a challenging problem with many practical applications. In this paper, we design a deep convolutional neural network (DCNN) to classify satellite imagery. The contributions of this paper are twofold. First, to cope with the large-scale variance in satellite images, we introduce the inception module, which has multiple filters of different sizes at the same level, as the building block of our DCNN model. Second, we propose a genetic algorithm based method to efficiently search for the best hyper-parameters of the DCNN in a large search space. The proposed method is evaluated on the benchmark database. The results show that the proposed hyper-parameter search method guides the search towards better regions of the parameter space. Based on the hyper-parameters found, we built our DCNN models and evaluated their performance on satellite imagery classification; the results show that the classification accuracy of the proposed models outperforms the state-of-the-art method.
Multi-Objective Evolutionary Computation Based Feature Selection Applied to Behaviour Assessment of Children
Attribute or feature selection is one of the basic
strategies to improve the performance of data classification tasks
and, at the same time, to reduce the complexity of classifiers,
and it is a particularly fundamental one when the number
of attributes is relatively high. Its application to unsupervised
classification is restricted to a limited number of experiments in
the literature. Evolutionary computation has already proven itself
to be a very effective choice to consistently reduce the number
of attributes towards a better classification rate and a simpler
semantic interpretation of the inferred classifiers. We present a feature
selection wrapper model composed of a multi-objective evolutionary
algorithm, the clustering method Expectation-Maximization (EM),
and the classifier C4.5 for the unsupervised classification of data
extracted from a psychological test named BASC-II (Behavior
Assessment System for Children - II ed.) with two objectives:
Maximizing the likelihood of the clustering model and maximizing
the accuracy of the obtained classifier. We present a methodology
to integrate feature selection for unsupervised classification, model
evaluation, decision making (to choose the most satisfactory model
according to an a posteriori process in a multi-objective context), and
testing. We compare the performance of the classifier obtained by the
multi-objective evolutionary algorithms ENORA and NSGA-II, and
the best solution is then validated by the psychologists that collected
Case-Based Reasoning: A Hybrid Classification Model Improved with an Expert's Knowledge for High-Dimensional Problems
Data mining and classification of objects is a process of data analysis, using various machine learning techniques, that is used today in many fields of research. This paper presents the concept of a hybrid classification model improved with expert knowledge. The model's algorithm integrates several machine learning techniques (Information Gain, K-means, and Case-Based Reasoning) with the expert’s knowledge. The expert’s knowledge is used to determine the importance of features. The paper presents the model algorithm and the results of a case study in which the emphasis was on achieving the maximum classification accuracy without reducing the number of features.
Determination of the Bank's Customer Risk Profile: Data Mining Applications
In this study, clients who applied to a bank branch for a loan were analyzed through data mining. The data comprised information such as the amounts of loans received by personal and SME clients working with the bank branch, numbers of installments, numbers of delays in loan installments, payments at other banks, and the number of banks to which they were in debt between 2010 and 2013. The client risk profile was examined through Classification and Regression Tree (CART) analysis, one of the decision tree classification methods. At the end of the study, 5 different types of customers were identified on the decision tree. These customer types were rated according to the risk they pose to the bank branch, and the customers were classified according to the risk ratings.
Predication Model for Leukemia Diseases Based on Data Mining Classification Algorithms with Best Accuracy
In recent years, there has been an explosion in the use of technology that helps in discovering diseases. For example, DNA microarrays allow us for the first time to obtain a "global" view of the cell. They have great potential to provide accurate medical diagnosis and to help in finding the right treatment and cure for many diseases. Various classification algorithms can be applied to such microarray datasets to devise methods that can predict the occurrence of leukemia. In this study, we compared the classification accuracy and response time of eleven decision tree methods and six rule classifier methods using five performance criteria. The experimental results show that Random Tree produces the best result among the tree classifiers and also takes the lowest time to build a model. Among the classification rule algorithms, the nearest-neighbor-like algorithm (NNge) is the best, owing to its high accuracy and the lowest time to build a model.
Clustering-Based Detection of Alzheimer's Disease Using Brain MR Images
This paper presents a comprehensive survey of recent research studies on segmenting and classifying brain MR (magnetic resonance) images in order to detect significant changes to brain ventricles. The paper also presents a general framework for detecting regions that atrophy, which can help neurologists in detecting and staging Alzheimer's disease. Furthermore, a prototype was implemented to segment brain MR images in order to extract the region of interest (ROI), and a classifier was then employed to differentiate between normal and abnormal brain tissues. Experimental results show that the proposed scheme can provide a reliable second opinion from which neurologists can benefit.
Performance Comparison of ADTree and Naive Bayes Algorithms for Spam Filtering
Classification is an important data mining technique
and can be used for data filtering in artificial intelligence. Its
broad applicability to all kinds of data has led to its
use in nearly every field of modern life. Classification helps us
to group different items according to the features judged
interesting and useful. In this paper, we compare two
classification methods, Naïve Bayes and ADTree, used to detect spam
e-mail. This choice is motivated by the fact that the Naive Bayes
algorithm is based on probability calculus while ADTree algorithm is
based on decision tree. The parameter settings of the above
classifiers use the maximization of true positive rate and
minimization of false positive rate. The experiment results present
classification accuracy and cost analysis in view of optimal classifier
choice for spam detection. The number of attributes is pointed out to
obtain a tradeoff between their number and the classification
Statistical Feature Extraction Method for Wood Species Recognition System
Effective statistical feature extraction and classification are important in image-based automatic inspection and analysis. An automatic wood species recognition system is designed to perform wood inspection at customs checkpoints to avoid mislabeling of timber, which results in loss of income to the timber industry. The system focuses on analyzing the statistical properties of the pores in wood images. This paper proposes a fuzzy-based feature extractor that mimics the experts’ knowledge of wood texture to extract the properties of the pore distribution from the wood surface texture. The proposed feature extractor consists of two steps, namely pore extraction and fuzzy pore management. A total of 38 statistical features are extracted from each wood image. Then, a backpropagation neural network is used to classify the wood species based on the statistical features. A comprehensive set of experiments on a database of 5200 macroscopic images from 52 tropical wood species was used to evaluate the performance of the proposed feature extractor. The advantage of the proposed feature extraction technique is that it mimics the experts’ interpretation of wood texture, which allows human involvement when analyzing the wood texture. Experimental results show the efficiency of the proposed method.
A Hybrid Gene Selection Technique Using Improved Mutual Information and Fisher Score for Cancer Classification Using Microarrays
Feature selection is essential for effective classification in cancer diagnosis. However, in microarray gene expression datasets, the large number of features compared to the number of samples makes classification computationally very hard and prone to errors. In this paper, we present an innovative method for selecting highly informative gene subsets of gene expression data that effectively classify the cancer data into tumorous and non-tumorous. The hybrid gene selection technique combines Mutual Information and Fisher score to select informative genes. The gene selection is validated by classification using a Support Vector Machine (SVM), a supervised learning algorithm capable of solving complex classification problems. The improved Mutual Information and F-Score with SVM as a classifier produced efficient results.
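For reference, the Fisher score of a single gene can be computed as the ratio of between-class scatter to within-class scatter of its expression values. The sketch below uses this textbook definition; the paper's improved variant and its combination with Mutual Information are not specified in the abstract, and the expression values are made up.

```python
def fisher_score(values, labels):
    """Between-class scatter divided by within-class scatter for one feature."""
    overall_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for cls in set(labels):
        group = [v for v, y in zip(values, labels) if y == cls]
        mean = sum(group) / len(group)
        variance = sum((v - mean) ** 2 for v in group) / len(group)
        between += len(group) * (mean - overall_mean) ** 2
        within += len(group) * variance
    # A feature that is constant within every class separates them perfectly.
    return between / within if within > 0 else float("inf")


# Hypothetical expression levels for six samples (0 = non-tumorous, 1 = tumorous).
labels = [0, 0, 0, 1, 1, 1]
gene_a = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]  # well separated between the classes
gene_b = [1.0, 8.0, 3.0, 7.0, 2.0, 9.0]  # heavily overlapping
print(fisher_score(gene_a, labels), fisher_score(gene_b, labels))
```

Genes would then be ranked by this score and the top-ranked subset passed to the classifier; a higher score indicates a gene whose class means are far apart relative to its spread within each class.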
Fake Account Detection in Twitter Based on Minimum Weighted Feature set
Social networking sites such as Twitter and Facebook
attract over 500 million users across the world; for those users, their
social life, even their practical life, has become interrelated. Their
interaction with social networking has affected their lives forever.
Accordingly, social networking sites have become among the main
channels that are responsible for vast dissemination of different kinds
of information during real-time events. This popularity of social
networking has led to various problems, including the possibility of
exposing users to incorrect information through fake accounts,
which results in the spread of malicious content during life events.
This situation can cause huge damage in the real world to
society in general, including citizens, business entities, and others.
In this paper, we present a classification method for detecting the
fake accounts on Twitter. The study determines the minimized set of
the main factors that influence the detection of the fake accounts on
Twitter, and then the determined factors are applied using different
classification techniques. A comparison of the results of these
techniques has been performed and the most accurate algorithm is
selected according to the accuracy of the results. The study has been
compared with recent research in the same area; this
comparison has confirmed the accuracy of the proposed study. We claim
that this study can be continuously applied on Twitter social network
to automatically detect the fake accounts; moreover, the study can be
applied on different social network sites such as Facebook with minor
changes according to the nature of the social network which are
discussed in this paper.
Day/Night Detector for Vehicle Tracking in Traffic Monitoring Systems
Recently, traffic monitoring has attracted the attention
of computer vision researchers. Many algorithms have been
developed to detect and track moving vehicles. In fact, vehicle
tracking in daytime and in nighttime cannot be approached with the
same techniques, due to the extremely different illumination conditions.
Consequently, traffic-monitoring systems are in need of having a
component to differentiate between daytime and nighttime scenes. In
this paper, a HSV-based day/night detector is proposed for traffic
monitoring scenes. The detector employs the hue-histogram and the
value-histogram on the top half of the image frame. Experimental
results show that the extraction of the brightness features along with
the color features within the top region of the image is effective for
classifying traffic scenes. In addition, the detector achieves high
precision and recall rates and is feasible for real time
Image Segmentation Using 2-D Histogram in RGB Color Space in Digital Libraries
This paper presents an unsupervised color image segmentation method. It is based on a hierarchical analysis of 2-D histogram in RGB color space. This histogram minimizes storage space of images and thus facilitates the operations between them. The improved segmentation approach shows a better identification of objects in a color image and, at the same time, the system is fast.
Documents Emotions Classification Model Based on TF-IDF Weighting Measure
Emotion classification of text documents is used to reveal whether a document expresses a given emotion of its writer. Whereas different supervised methods have previously been used for emotion classification of documents, in this research we present a novel model that supports the classification algorithms in producing more accurate results by means of the TF-IDF measure. Experiments were carried out to assess the applicability of the proposed model. The model succeeds in raising accuracy according to the chosen metrics (precision, recall, and F-measure) by refining the lexicon, integrating lexicons from different perspectives, and applying the TF-IDF weighting measure to the classifying features. The proposed model has also been compared with other research to demonstrate its competence in raising the accuracy of the results.
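As a reminder of what the TF-IDF weighting measure does to the classifying features, the sketch below computes one common variant: term frequency normalized by document length, multiplied by the natural logarithm of the inverse document frequency. The exact variant used in the paper is not stated in the abstract, and the tokenized toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dictionary per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical tokenized documents.
docs = [
    ["joy", "great", "day"],
    ["sad", "bad", "day"],
]
for doc_weights in tf_idf(docs):
    print(doc_weights)
```

A term such as "day" that occurs in every document receives weight zero, while emotion-bearing terms that are specific to one document keep a positive weight, which is why the measure is useful for sharpening lexicon-based features.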
Segmentation of Korean Words on Korean Road Signs
This paper introduces an effective method of
segmenting Korean text (place names in Korean) from a Korean road
sign image. A Korean advanced directional road sign is composed of
several types of visual information such as arrows, place names in
Korean and English, and route numbers. Automatic classification of
the visual information and extraction of Korean place names from the
road sign images make it possible to avoid a lot of manual inputs to a
database system for management of road signs nationwide. We
propose a series of problem-specific heuristics that correctly segment
Korean place names, which are the most crucial information, from the
other information by leaving out non-text information effectively. The
experimental results with a dataset of 368 road sign images show a
detection rate of 96% per Korean place name and 84% per road sign
Multinomial Dirichlet Gaussian Process Model for Classification of Multidimensional Data
We present a probabilistic multinomial Dirichlet
classification model for multidimensional data with Gaussian process
priors. Here, we consider an efficient computational method that
can be used to obtain the approximate posteriors for latent variables
and parameters needed to define the multiclass Gaussian process
classification model. We first investigate the process of inducing a
posterior distribution for the various parameters and the latent function
by using variational Bayesian approximations and the importance sampling
method, and next we derive a predictive distribution of the latent
function needed to classify new samples. The proposed model is
applied to a synthetic multivariate dataset in order to verify
its performance. Experimental results show that our model
is more accurate than the other approximation methods.
A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data
Advances in spatial and spectral resolution of satellite
images have led to tremendous growth in large image databases. The
data we acquire through satellites, radars, and sensors consists of
important geographical information that can be used for remote
sensing applications such as regional planning and disaster management.
Spatial data classification and object recognition are important tasks
for many applications. However, classifying objects and identifying
them manually from images is a difficult task. Object recognition is
often considered a classification problem, and this task can be
performed using machine-learning techniques. Although many
machine-learning algorithms exist, the classification is done using
supervised classifiers such as Support Vector Machines (SVM), as the
area of interest is known. We propose a classification method,
which considers neighboring pixels in a region for feature extraction
and it evaluates classifications precisely according to neighboring
classes for semantic interpretation of region of interest (ROI). A
dataset has been created for training and testing purpose; we
generated the attributes by considering pixel intensity values and
mean values of reflectance. We demonstrate the benefits of using
knowledge discovery and data-mining techniques, which can be applied to
image data for accurate information extraction and classification from
high spatial resolution remote sensing imagery.
Human Action Recognition System Based on Silhouette
Human actions are recognized directly from video sequences. The objective of this work is to recognize various human actions such as running, jumping, and walking. Human action recognition requires some prior knowledge about actions, namely motion estimation and foreground and background estimation. The region of interest (ROI) is extracted to identify the human in the frame. Then, an optical flow technique is used to extract the motion vectors. Using the extracted features, similarity-measure-based classification is performed to recognize the action. Experiments on the Weizmann database show that the proposed method offers high accuracy.