An Empirical Evaluation of Performance of Machine Learning Techniques on Imbalanced Software Quality Data
 R. Malhotra and M. Khanna, “Investigation of Relationship between Object-oriented Metrics and Change Proneness,” Int. J. Mach. Learn. & Cyber., vol. 4, 2013, pp. 273-286.
 K.K. Aggarwal, Y. Singh, A. Kaur and R. Malhotra, “Empirical Analysis for Investigating the Effect of Object-Oriented Metrics on Fault Proneness: A Replicated Case Study,” Software Process: Improvement and Practice, vol. 16, 2009, no. 1, pp. 39-62.
 A.G. Koru and J. Tian, “Comparing High-Change Modules and Modules with the Highest Measurement Values in two Large-Scale Open-Source Products,” IEEE Trans. Softw. Eng., vol. 31, 2005, no. 8, pp. 625-642.
 M. Kubat and S. Matwin, “Addressing the Curse of Imbalanced Training Sets: One-Sided Selection,” in Proc. of 14th International Conf. on Machine Learning, Nashville, 1997, pp. 179-186.
 S. Visa and A. Ralescu, "Issues in Mining Imbalanced Data Sets - A Review Paper," in Proc. of the 16th Midwest Artificial Intelligence and Cognitive Science Conf., Ohio, 2005, pp. 67-73.
 H. He and E.A. Garcia, “Learning from Imbalanced Data,” IEEE Trans. Knowl. Data Eng., vol. 21, 2009, pp. 1263–1284.
 G.M. Weiss, "Mining with Rarity: A Unifying Framework," ACM SIGKDD Explorations Newsletter, vol. 6, no. 1, pp. 7-19, 2004.
 X. Zhang and Y. Li, "An Empirical Study of Learning from Imbalanced Data," in Proc. of the 22nd Australasian Database Conf., Perth, 2011, pp. 85-94.
 R. Shatnawi, "Improving Software Fault-Prediction for Imbalanced Data," in Proc. of International Conf. on Innovations in Information Technology, Al Ain, 2012, pp. 54-59.
 C. Seiffert, T. M. Khoshgoftaar, J. V. Hulse, and A. Folleco, "An Empirical Study of the Classification Performance of Learners on Imbalanced and Noisy Software Quality Data," Information Sciences, vol. 259, 2014, pp. 571-595.
 Y. Liu, A. An, and X. Huang, "Boosting Prediction Accuracy on Imbalanced Datasets with SVM Ensembles," in Advances in Knowledge Discovery and Data Mining, pp. 107-118, 2006.
 S. Wang and X. Yao, "Using Class Imbalance Learning for Software Defect Prediction," IEEE Transactions on Reliability, vol. 62, 2013, pp. 434-443.
 N. Seliya and T. M. Khoshgoftaar, "The Use of Decision Trees for Cost-sensitive Classification: An Empirical Study in Software Quality Prediction," Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 1, 2011, pp. 448-459.
 G.M. Weiss, K. McCarthy, and B. Zabar, "Cost-sensitive Learning vs. Sampling: Which is Best for Handling Unbalanced Classes with Unequal Error Costs?" in Proc. of International Conf. on Data Mining, Omaha NE, 2007, pp. 35-41.
 D. Rodriguez, I. Herraiz, R. Harrison, J. Dolado, and J. C. Riquelme, "Preliminary Comparison of Techniques for Dealing with Imbalance in Software Defect Prediction," in Proc. of the 18th International Conf. on Evaluation and Assessment in Software Engineering, London, 2014, p. 43.
 J.V. Hulse, T. M. Khoshgoftaar, A. Napolitano, and R. Wald, "Feature Selection with High-dimensional Imbalanced Data," in Proc. of International Conf. on Data Mining Workshops, Florida, 2009, pp. 507-514.
 K. Gao, T. M. Khoshgoftaar, and A. Napolitano, "Combining Feature Subset Selection and Data Sampling for Coping with Highly Imbalanced Software Data," in Proc. of 27th International Conf. on Software Engineering and Knowledge Engineering, Pittsburgh, 2015.
 L. Jeni, J. F. Cohn, and F. De La Torre, "Facing Imbalanced Data--Recommendations for the Use of Performance Metrics," in Proc. of Humaine Association Conf. on Affective Computing and Intelligent Interaction, Geneva, 2013, pp. 245-251.
 M. Tan, L. Tan, S. Dara, and C. Mayeux, "Online Defect Prediction for Imbalanced Data," in Proc. of 37th International Conf. on Software Engineering, Florence, 2015.
 N.V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, "SMOTE: Synthetic Minority Over-Sampling Technique," Journal of Artificial Intelligence Research, vol. 16, 2002, pp. 321-357.
 I.H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, 2nd ed., 2005.
 P. Domingos, “MetaCost: A General Method for Making Classifiers Cost-sensitive,” in Proc. of the 5th ACM SIGKDD International Conf. on Knowledge Discovery and Data Mining, CA, 1999, pp. 155-164.
 S. Chidamber and C. Kemerer, “A Metrics Suite for Object-Oriented Design,” IEEE Transactions on Software Engineering, vol. 20, 1994, pp. 476-493.
 R. Malhotra, K. Nagpal, P. Upmanyu and N. Pritam, “Defect Collection and Reporting System for Git Based Open Source Software,” in Proc. of International Conf. on Data Mining and Intelligent Computing, Delhi, 2014, pp. 1-7.
 M.A. Hall, “Correlation-based Feature Selection for Discrete and Numeric Class Machine Learning,” in Proc. of the Seventeenth International Conf. on Machine Learning, CA, 2000, pp. 359-366.
 C.G. Weng and J. Poon, “A New Evaluation Measure for Imbalanced Datasets,” in Proc. of the 7th Australasian Data Mining Conf., Sydney, 2008, pp. 27-32.