%0 Journal Article
%A Kittisak Kerdprasop and Nittaya Kerdprasop and Pairote Sattayatham
%D 2008
%J International Journal of Computer, Electrical, Automation, Control and Information Engineering
%B World Academy of Science, Engineering and Technology
%I International Science Index 20, 2008
%T A Monte Carlo Method to Data Stream Analysis
%V 20
%X Data stream analysis is the process of computing
various summaries and derived values from large amounts of data
which are continuously generated at a rapid rate. The nature of a
stream does not allow each data element to be revisited. Furthermore,
data processing must be fast to produce timely analysis results. These
requirements impose constraints on the design of the algorithms to
balance correctness against timely responses. Several techniques
have been proposed over the past few years to address these
challenges. These techniques can be categorized as either data-oriented
or task-oriented. The data-oriented approach analyzes a
subset of data or a smaller transformed representation, whereas the
task-oriented scheme solves the problem directly via approximation
techniques. We propose a hybrid approach to tackle the data stream
analysis problem. The data stream is statistically transformed
to a smaller size, and its characteristics are computationally
approximated. We adopt a Monte Carlo method in the approximation
step. Data reduction is performed both horizontally and
vertically through our EMR sampling method. The proposed method
is analyzed through a series of experiments. We apply our algorithm to
clustering and classification tasks to evaluate the utility of our
%P 2758 - 2763