Open Science Index
A Monte Carlo Method to Data Stream Analysis
Abstract:
Data stream analysis is the process of computing summaries and derived values from large volumes of data generated continuously at a rapid rate. The nature of a stream does not allow revisiting each data element, and processing must be fast enough to produce timely results. These requirements constrain algorithm design, forcing a balance between correctness and response time. Several techniques have been proposed in recent years to address these challenges; they can be categorized as either data-oriented or task-oriented. The data-oriented approach analyzes a subset of the data or a smaller transformed representation, whereas the task-oriented approach solves the problem directly via approximation techniques. We propose a hybrid approach to the data stream analysis problem: the stream is both statistically transformed to a smaller size and its characteristics computationally approximated. We adopt a Monte Carlo method in the approximation step. Data reduction is performed horizontally and vertically through our EMR sampling method. The proposed method is evaluated by a series of experiments, applying our algorithm to clustering and classification tasks to assess the utility of the approach.
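The hybrid strategy above pairs a size-reducing transformation with Monte Carlo approximation. As a minimal illustrative sketch only (the paper's EMR sampling method is not specified in this abstract), the following Python combines one-pass reservoir sampling for the reduction step with Monte Carlo resampling to approximate a summary statistic of the stream:

```python
import random

def reservoir_sample(stream, k, rng):
    """One-pass reservoir sampling: maintain a uniform random sample of k
    items from a stream without revisiting any element (horizontal reduction)."""
    sample = []
    for i, x in enumerate(stream):
        if i < k:
            sample.append(x)
        else:
            # Replace an existing sample slot with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = x
    return sample

def monte_carlo_mean(sample, n_draws, rng):
    """Monte Carlo approximation of a summary statistic (here, the mean)
    by repeatedly resampling from the reduced representation."""
    draws = [rng.choice(sample) for _ in range(n_draws)]
    return sum(draws) / n_draws

rng = random.Random(42)
# Simulated rapid stream of one million values cycling through 0..99.
stream = (float(i % 100) for i in range(1_000_000))
sample = reservoir_sample(stream, k=1000, rng=rng)
est = monte_carlo_mean(sample, n_draws=5000, rng=rng)
```

The reservoir keeps memory bounded regardless of stream length, and the Monte Carlo step trades exactness for a fast approximate answer, mirroring the correctness-versus-timeliness trade-off discussed above.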