Excellence in Research and Innovation for Humanity

International Science Index

Commenced in January 1999 Frequency: Monthly Edition: International Abstract Count: 51925

Computer and Systems Engineering

105
89335
The Intersection/Union Region Computation for Drosophila Brain Images Using Encoding Schemes Based on Multi-Core CPUs
Abstract:
With the growing number of Drosophila Driver and Neuron images, finding the similarity relationships among them is important for functional inference. A general problem is how to find a Drosophila Driver image that covers a set of Drosophila Driver/Neuron images. To solve this problem, the intersection/union region for a set of images is computed first, and a comparison step then calculates the similarities between the region and other images. In this paper, three encoding schemes, namely Integer, Boolean, and Decimal, are proposed to encode each image as a one-dimensional structure. The intersection/union region of these images can then be computed using compare operations, Boolean operators, and a lookup table method. Finally, the comparison is carried out in the same way as the union region computation, and the similarity score is calculated according to the definition of the Tanimoto coefficient. These region computation methods are also implemented in a multi-core CPU environment with OpenMP. The experimental results show that, in the encoding phase, the Boolean scheme performs best; in the region computation phase, the Decimal scheme performs best when the number of images is large. The speedup ratio reaches 12 on 16 CPUs. This work was supported by the Ministry of Science and Technology under the grant MOST 106-2221-E-182-070.
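As a rough illustration of the Boolean idea described in this abstract, the sketch below packs each image into a single Python integer used as a bitmask, computes union and intersection regions with bitwise operators, and scores a candidate image with the Tanimoto coefficient (shared bits divided by bits set in either image). The function names and the tiny six-voxel images are assumptions made for illustration; the paper's actual encoding schemes and OpenMP implementation are not reproduced here.

```python
from functools import reduce


def encode_boolean(voxels):
    """Pack a flat sequence of 0/1 voxel flags into one int (bit i = voxel i)."""
    bits = 0
    for i, value in enumerate(voxels):
        if value:
            bits |= 1 << i
    return bits


def union_region(images):
    """Bitwise OR of all encoded images."""
    return reduce(lambda a, b: a | b, images, 0)


def intersection_region(images):
    """Bitwise AND of all encoded images (expects at least one image)."""
    return reduce(lambda a, b: a & b, images)


def tanimoto(a, b):
    """Tanimoto coefficient: |A AND B| / |A OR B| over set bits."""
    union_bits = bin(a | b).count("1")
    if union_bits == 0:
        return 0.0  # convention chosen here for two empty images
    return bin(a & b).count("1") / union_bits


# Hypothetical six-voxel images, for illustration only.
neurons = [encode_boolean([1, 1, 0, 0, 0, 0]),
           encode_boolean([0, 1, 1, 0, 0, 0])]
driver = encode_boolean([1, 1, 1, 1, 0, 0])

region = union_region(neurons)
score = tanimoto(region, driver)
```

Here the union region covers three voxels and the driver covers four, all three shared, so the score is 3/4.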
104
89046
Comparison of Web Development Using Framework over Library
Abstract:
In recent years, web development has changed significantly. Driven largely by trends such as mobile, the world of development is evolving rapidly. The rise of the Internet has made web applications crucial. The web has become an interface for a company and one of the ways it presents its portfolio to clients. The web has also become part of file management systems, taking over the role of paper. Because of the high demand for web applications, developers are required to build web applications that are cost-effective, secure, and well coded. Frameworks have been proposed as an alternative to library-style development. A framework helps the developer by creating the structure of a web application automatically. This paper compares the advantages and disadvantages of web development using a framework against library-style development. The comparison is based on previous research papers and focuses on three main indicators, among them the impact on management and the impact on the developer.
103
88971
Comparison of Authentication Methods in Internet of Things (IoT) Technology
Abstract:
Internet of Things (IoT) is a powerful industry system in which end-devices are interconnected and automated, allowing the devices to analyze data and execute actions based on the analysis. IoT leverages Radio-Frequency Identification (RFID) and Wireless Sensor Network (WSN) technologies, including mobile and sensor devices. These technologies contribute to the evolution of IoT. However, because more devices are connected to each other over the Internet and data from various sources are exchanged between things, confidentiality of the data becomes a major concern. This paper focuses on one of the major challenges in IoT, authentication, in order to ensure that data integrity and confidentiality are in place. A few solutions are reviewed based on papers from the last few years. One of the proposed solutions secures the communication between IoT devices and cloud servers with an Elliptic Curve Cryptography (ECC) based mutual authentication protocol. This solution focuses on Hyper Text Transfer Protocol (HTTP) cookies as a security parameter. The next proposed solution uses a keyed-hash scheme protocol to enable IoT devices to authenticate each other without the presence of a central control server. Another proposed solution uses a Physical Unclonable Function (PUF) based mutual authentication protocol. It emphasizes tamper-resistant and resource-efficient technology, which equals a 3-way handshake security protocol.
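To make the keyed-hash idea concrete, the following is a minimal sketch of challenge-response mutual authentication between two devices that share a secret key, with no central server involved. It uses HMAC-SHA256 from the Python standard library as the keyed hash. The message layout, the device identities, and the function names are assumptions for illustration; this is not the protocol from any of the reviewed papers.

```python
import hashlib
import hmac
import secrets


def respond(key: bytes, challenge: bytes, identity: bytes) -> bytes:
    """Answer a challenge by keying a hash over the responder identity and nonce."""
    return hmac.new(key, identity + challenge, hashlib.sha256).digest()


def verify(key: bytes, challenge: bytes, identity: bytes, tag: bytes) -> bool:
    """Recompute the expected tag and compare it in constant time."""
    return hmac.compare_digest(respond(key, challenge, identity), tag)


def mutual_authenticate(key_a: bytes, key_b: bytes) -> bool:
    """Simulate both devices challenging each other with fresh random nonces."""
    nonce_a = secrets.token_bytes(16)  # challenge issued by device A
    nonce_b = secrets.token_bytes(16)  # challenge issued by device B
    tag_b = respond(key_b, nonce_a, b"device-B")  # B answers A's challenge
    tag_a = respond(key_a, nonce_b, b"device-A")  # A answers B's challenge
    a_accepts_b = verify(key_a, nonce_a, b"device-B", tag_b)
    b_accepts_a = verify(key_b, nonce_b, b"device-A", tag_a)
    return a_accepts_b and b_accepts_a


shared_key = secrets.token_bytes(32)
ok = mutual_authenticate(shared_key, shared_key)
rejected = not mutual_authenticate(shared_key, secrets.token_bytes(32))
```

Fresh nonces guard against replay of old responses, and including the responder identity in the hashed message keeps a device from reflecting its own answer back to the challenger.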
102
88567
Context-Aware Recommender Systems Using User's Emotional State
Abstract:
Product recommendation is a field of research that has received much attention amid the recent information overload phenomenon. With the proliferation of the mobile environment and social media, recommendation results inevitably depend on how the factors of the user's situation are reflected in the recommendation process. Recently, research attention has turned to the context-aware recommender system, which reflects the user's contextual information in the recommendation process. However, most context-aware recommender system research so far has been limited in that it reflects only the passive context of users. Users are expected to be able to express their contextual information through active behavior, and the importance of context-aware recommender systems that reflect this information is likely to increase. The purpose of this study is to propose a context-aware recommender system that reflects the user's emotional state as active context information in the recommendation process. A context-aware recommender system can make more sophisticated recommendations by utilizing the user's contextual information, and has the advantage over existing recommender systems that the user's emotional factor can be considered. In this study, we propose a method to infer the user's emotional state, which is one kind of user context information, from the user's facial expression data and to reflect it in the recommendation process. This study collects the facial expression data of a user who is looking at a specific product, together with the user's product preference score. We then classify the facial expression data into several categories according to previous research and construct a model that can predict them. Next, the predicted results are applied to existing collaborative filtering with contextual information.
The results show that the context-aware recommender system including facial expression information achieves improved recommendation performance. Based on these results, future research is expected on recommender systems reflecting various contextual information.
101
88377
Segmentation of Arabic Handwritten Numeral Strings Based on Watershed Approach
Abstract:
Arabic offline handwriting recognition is considered one of the most challenging topics. Arabic handwritten numeral strings are used to automate systems that deal with numbers, such as postal codes, bank account numbers, and numbers on car plates. Segmentation of connected numerals is the main bottleneck in a handwritten numeral recognition system. Solving it can, in turn, increase the speed and efficiency of the recognition system. In this paper, we propose powerful algorithms for automatic segmentation and feature extraction of Arabic handwritten numeral strings based on the Watershed approach. The proposed algorithms have been designed and implemented to achieve the main goal of segmenting and extracting strings of handwritten numeral digits, especially in the courtesy amount of bank checks. The segmentation algorithm partitions the string into multiple regions that can be associated with the properties of one or more criteria. The numeral extraction algorithm separates the numeral string into individual digits. Both algorithms for segmentation and feature extraction have been tested successfully and efficiently for all types of numerals.
100
86538
Classification of Red, Green and Blue Values from Face Images Using k-NN Classifier to Predict the Skin or Non-Skin
Abstract:
In this study, the presence of skin is predicted using RGB values obtained from a camera and the k-nearest neighbor (k-NN) classifier. The dataset used in this study has an unbalanced distribution and a linearly non-separable structure. This problem can also be called a big data problem. The Skin dataset was taken from the UCI machine learning repository. We used the k-NN method as the classifier to handle this big data problem, with k set to 1. To train and test the k-NN classifier, a 50-50% training-testing partition was used. TP rate, FP rate, precision, recall, F-measure, and AUC were used as the performance metrics to evaluate the k-NN classifier. The obtained results are, respectively, 0.999, 0.001, 0.999, 0.999, 0.999, and 1.00. As these results show, the proposed method could be used to predict whether an image is skin or not.
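The classification and evaluation steps described here can be sketched in a few lines. The example below implements a 1-nearest-neighbor classifier over RGB triples with squared Euclidean distance and derives TP rate (recall), FP rate, precision, and F-measure from the confusion counts. The eight RGB samples are invented toy values, not the UCI Skin dataset, and AUC is left out for brevity; the function names are assumptions.

```python
def nearest_label(train, point):
    """Return the label of the training sample closest to point (k = 1)."""
    def squared_distance(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return min(train, key=lambda item: squared_distance(item[0], point))[1]


def evaluate(train, test, positive="skin"):
    """Classify every test sample and compute metrics for the positive class."""
    tp = fp = tn = fn = 0
    for features, label in test:
        predicted = nearest_label(train, features)
        if predicted == positive:
            if label == positive:
                tp += 1
            else:
                fp += 1
        else:
            if label == positive:
                fn += 1
            else:
                tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fp_rate = fp / (fp + tn) if fp + tn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"tp_rate": recall, "fp_rate": fp_rate,
            "precision": precision, "recall": recall, "f_measure": f_measure}


# Toy (R, G, B) samples invented for illustration; half train, half test.
train = [((220, 170, 150), "skin"), ((200, 150, 130), "skin"),
         ((30, 60, 200), "non-skin"), ((40, 200, 60), "non-skin")]
test = [((210, 160, 140), "skin"), ((35, 70, 190), "non-skin"),
        ((190, 140, 120), "skin"), ((50, 190, 70), "non-skin")]

metrics = evaluate(train, test)
```

On a dataset of realistic size, a brute-force scan per query becomes slow, so a spatial index or a library implementation would normally replace `nearest_label`.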
99
86007
Transient Voltage Distribution on the Single Phase Transmission Line under Short Circuit Fault Effect
Abstract:
Single phase transmission lines are used to transfer data or energy between two users. Transient conditions such as switching operations and short circuit faults cause fluctuations in the transmitted waveform. The spatial voltage distribution on the single phase transmission line may change depending on the position and duration of the short circuit fault in the system. In this paper, the state space representation of the single phase transmission line for a short circuit fault and for various types of terminations is given. Since the transmission line is modeled in the time domain using distributed parametric elements, the mathematical representation of the event is given in state space (time domain) differential equation form. This also makes the problem easier to solve, because of the time- and space-dependent characteristics of the voltage variations on the distributed parametrically modeled transmission line.
98
85662
A Unified Approach to Support the Coordination of Usability Work in Agile Software Development
Abstract:
Usability evaluation is essential for developing usable software systems, yet its integration within agile software development remains a challenging interdisciplinary endeavour. In this paper, the authors present a study investigating the obstacles to such integration from the management perspective. The study incorporates two methods, namely an online questionnaire survey and a series of interviews with participants who answered the questionnaire. Based on the obtained results, a unified approach is proposed for coordinating the efforts of agile developers and usability engineers to produce usable software systems.
97
84868
Reconstruction of Performance-Based Budgeting in Indonesian Local Government: Application of Soft Systems Methodology in Producing Guideline for Policy Implementation
Abstract:
Effective public policy creation requires a strong budget system, both in terms of design and implementation. Performance-based budgeting is an evolutionary approach with two substantial characteristics: first, the strong integration between budgeting and planning, and second, its existence as guidance so that all activities and expenditures refer to measurable performance targets. There are four processes in the government that should be followed in order to make the budget performance-based. These four processes consist of the preparation of a vision according to the bold aspiration, the formulation of outcomes, the determination of outputs based on the analysis of organizational resources, and the formulation of a Value Creation Map that contains a series of programs and activities. This is consistent with the concept of the logic model, which holds that budget performance should be placed within a relational framework of resources, activities, outputs, outcomes, and impacts. Through the issuance of Law 17/2003 regarding State Finance, local governments in Indonesia have to implement performance-based budgeting. The central government then issued Government Regulation 58/2005, which contains detailed guidelines on how to prepare local government budgets. After a decade, the implementation of performance budgeting in local government still does not fully meet expectations, even though the guidance is complete, socialization is routinely performed, and training has been carried out at all levels. Accordingly, this study views the practice of performance-based budgeting at local governments as a problematic situation. This condition must be approached with a systems approach that allows solutions from many points of view. Based on the fact that the budgeting infrastructure is already settled, the study considers the situation to be one of complexity. Therefore, the intervention needs to be made in the area of the human activity system.
Using Soft Systems Methodology, this research will reconstruct the process of performance-based budgeting at local governments in the area of the human activity system. Through conceptual models, this study will invite all actors (central government, local government, and the parliament) to engage in dialogue and formulate interventions in human activity systems that are systematically desirable and culturally feasible. The result will guide the central government in revising the guidance for the local government budgeting process and will serve as a reference for building a capacity-building strategy.
96
83545
Seawater Changes' Estimation at Tidal Flat in Korean Peninsula Using Drone Stereo Images
Abstract:
The tidal flat of the Korean peninsula is one of the largest biodiverse tidal flats in the world. Therefore, digital elevation models (DEMs) are continuously in demand for monitoring the tidal flat. In this study, DEMs of the tidal flat at different times were produced using a drone and commercial software in order to measure seawater change during high tide at a water channel in the tidal flat. To correct the produced DEMs of the tidal flat, where the site cannot be accessed to collect control points, the DEM matching method was applied using a reference DEM instead of a survey. After the ortho-image was made from the corrected DEM, a land cover classified image was produced. The changes in the amount of seawater over time were analyzed using the classified images and DEMs. As a result, it was confirmed that the amount of water increased rapidly as time passed during high tide.
95
83453
The Design Process of an Interactive Seat for Improving Workplace Productivity
Abstract:
Creative industries’ workers are becoming more prominent as countries move towards intellectual-based economies. Consequently, the nature and essence of the workplace needs to be reconfigured so that creativity and productivity can be better promoted at these spaces. Using a multidisciplinary approach and a user-centered methodology, combining product design, electronic engineering, software and human-computer interaction, we have designed and developed a new seat that uses embedded sensors and actuators to increase the overall well-being of its users, their productivity and their creativity. Our contribution focuses on the parameters that most affect the user’s work on these kinds of spaces, which are, according to our study, noise and temperature. We describe the design process for a new interactive seat targeted at improving workspace productivity.
94
82893
Implementation of an IoT Sensor Data Collection and Analysis Library
Abstract:
Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, a wider variety of information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it possible to test various sensors easily and to collect sensor data directly using database tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, there are many difficulties in using a board to collect data, particularly when the user is not a computer programmer or is using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.
93
82653
A POX Controller Module to Prepare a List of Flow Header Information Extracted from SDN Traffic
Abstract:
Software Defined Networking (SDN) is a paradigm designed to make it easier to control the network dynamically and with more agility. Network traffic is a set of flows, each of which contains a set of packets. In SDN, a matching process is performed in the SDN switch on every packet entering the network. Only the headers of new packets are forwarded to the SDN controller. The flow header fields are called tuples. Typically, these form a 5-tuple: the source and destination IP addresses, the source and destination ports, and the protocol number. This flow information is used to provide an overview of the network traffic. Our module is meant to extract this 5-tuple, together with the packet and flow counts, and show them as a list. This list can be used as a first step toward detecting DDoS attacks. Thus, this module can be considered the beginning stage of any flow-based DDoS detection method.
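The flow list described in this abstract can be sketched independently of any controller. The example below groups packet headers by their 5-tuple and counts packets per flow using only the Python standard library. It does not use the POX API; the `FlowKey` type, the `build_flow_list` function, and the sample headers are assumptions made for illustration.

```python
from collections import Counter
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The 5-tuple that identifies a flow."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # IP protocol number, e.g. 6 = TCP, 17 = UDP


def build_flow_list(packets):
    """Group packet headers by 5-tuple and count the packets in each flow.

    packets is an iterable of (src_ip, dst_ip, src_port, dst_port, protocol).
    Returns a list of (FlowKey, packet_count) in first-seen order.
    """
    counts = Counter(FlowKey(*header) for header in packets)
    return list(counts.items())


# Hypothetical packet headers, for illustration only.
packets = [
    ("10.0.0.1", "10.0.0.2", 40000, 80, 6),
    ("10.0.0.3", "10.0.0.2", 40001, 53, 17),
    ("10.0.0.1", "10.0.0.2", 40000, 80, 6),
]

flows = build_flow_list(packets)
flow_count = len(flows)
packet_count = sum(count for _, count in flows)
```

A flow-based detector could then look at such a list for unusual patterns, for example a very large number of flows that each carry only one or two packets toward the same destination.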
92
82574
CompleX-Machine: An Automated Testing Tool Using X-Machine Theory
Abstract:
This paper is aimed at creating an Automatic Java X-Machine testing tool for software development. The nature of software development is changing; thus, the type of software testing tools required is also changing. Software is growing increasingly complex and, in part due to commercial impetus for faster software releases with new features and value, increasingly in danger of containing faults. These faults can incur huge cost for software development organisations and users; Cambridge Judge Business School’s research estimated the cost of software bugs to the global economy is $312 billion. Beyond the cost, faster software development methodologies and increasing expectations on developers to become testers is driving demand for faster, automated, and effective tools to prevent potential faults as early as possible in the software development lifecycle. Using X-Machine theory, this paper will explore a new tool to address software complexity, changing expectations on developers, faster development pressures and methodologies, with a view to reducing the huge cost of fixing software bugs.
91
82341
Automated Java Testing: JUnit versus AspectJ
Abstract:
The growing dependency of mankind on software technology increases the need for thorough testing of software applications and for automated testing techniques that support testing activities. We have outlined our testing strategy for performing various types of automated testing of Java applications using AspectJ, which has become the de facto standard for Aspect Oriented Programming (AOP). Likewise, JUnit, a unit testing framework, is the most popular Java testing tool. In this paper, we evaluate our proposed AOP approach for automated testing and JUnit on various parameters. First, we describe the similarities between the two approaches, and then we compare the two testing techniques in detail on factors such as lines of testing code, learning curve, and testing of private members. We establish that our AOP testing approach using AspectJ has several advantages and is thus more effective than JUnit.
90
81960
Developing Co-Creation Monitoring Technique for Civic Tech
Abstract:
The main task of the paper is to offer a scientific, evidence-based Co-creation Monitoring Technique for Civic Tech (civic technology platforms) as a tool to evaluate, effectively manage, and standardize digitally supported co-creation processes, and to multiply successful models of collective decision making and transparent management in other sectors. The co-creation concept differs fundamentally from the traditional public engagement approach in that it focuses on the collective influence and responsibility of all stakeholders in creating the public good. While traditional approaches to public engagement and governmental reforms remain relevant, this paper focuses on the growing potential of networked society to solve its social problems. It expands the co-creation field to citizen co-initiated, heavily technology-supported, and systems-oriented co-creation approaches. Around the world, civic organizations, individual citizens, and even businesses experiment with ICT tools and available open resources to collaborate with each other and with the government to find innovative solutions to societal problems. To support this, the international scientific community publishes research results about the creative power of networked systems and their potential to grow 'collective intelligence.' The current research project relates the concepts of co-creation and collective intelligence and supplements knowledge in the field of collective intelligence with new aspects. Co-creation is defined as a new form of collective intelligence, which influences the internal and external motivation of the platforms' users to act for the public good. Both concepts were strongly influenced by technological progress but were developed in parallel in science.
Civic Tech platforms were investigated as collective intelligence systems, which integrate all criteria inherent in such systems (openness, dynamism, decentralisation, critical mass for 'swarm effect', etc.). The challenging task for the proposed methodology was to correlate different factors and to find realizable possibilities for the system performance in these causal relationships. The Co-Creation Monitoring Technique evaluates the basic characteristics, functionality, and technological design of civic tech using a set of integral socio-technological indicators.
89
81920
Point-of-Interest Recommender Systems for Location-Based Social Network Services
Abstract:
Location Based Social Network services (LBSNs) is a new term that combines location-based services and social network services (SNS). Unlike traditional SNS, LBSNs emphasize empirical elements in the user's actual physical location. Point-of-Interest (POI) is the most important factor in implementing an LBSN recommendation system. POI information describes the most popular spots in an area. In this study, we recommend POIs to users in a specific area through a recommendation system using collaborative filtering. The process is as follows: first, we use different data sets based on Seoul and New York to find interesting results on human behavior. Second, based on the location-based activity information obtained from the personalized LBSNs, we devise a new rating that defines the user's preference for the area. Finally, we develop an automated rating algorithm from massive raw data using distributed systems to reduce the advertising costs of LBSNs.
88
81797
A Case Study on Evaluating and Selecting Soil/Pipeline Interaction Analysis Software for the Oil and Gas Industry
Abstract:
The evaluation and selection of appropriate software solutions to meet with an organisation’s inherent business requirements can be a problematic software engineering process that if done incorrectly can have a significant, costly and adverse effect on the business and its processes. The aim of this paper is to show the process and evaluation criteria followed to select the right engineering solution for the identified business requirement. The research adopted an action research method within an organisation in the oil and gas industry, which required a solution suitable for conducting stress analysis for soil-pipeline interaction analysis (SPIA). Through the use of the presented software selection and evaluation approach, to capture and measure key requirements, it was possible to determine a suitable software for the organisation. This paper investigates methodologies for selecting software packages, software evaluation techniques, and software evaluation criteria in evaluating software packages before providing an explanation of the developed methodology adopted. The key findings of the study are: (1) that there is a need to create a framework for software selection methodologies, (2) there are no universal selection criteria in the engineering industry, and (3) there is a need to validate the findings by creating an application based on the evaluation technique and evaluation criteria for selecting software packages for the engineering industry. The findings of the study are offered to support organisations in the oil and gas sector improve software selection methodologies for SPIA.
87
80276
Topographic Mapping of Farmland by Integration of Multiple Sensors on Board Low-Altitude Unmanned Aerial System
Abstract:
This paper introduced a topographic mapping system with time-saving and simplicity advantages based on integration of Light Detection and Ranging (LiDAR) data and Post Processing Kinematic Global Positioning System (PPK GPS) data. This topographic mapping system used a low-altitude Unmanned Aerial Vehicle (UAV) as a platform to conduct land survey in a low-cost, efficient, and totally autonomous manner. An experiment in a small-scale sugarcane farmland was conducted in Queensland, Australia. Subsequently, we synchronized LiDAR distance measurements that were corrected by using attitude information from gyroscope with PPK GPS coordinates for generation of precision topographic maps, which could be further utilized for such applications like precise land leveling and drainage management. The results indicated that LiDAR distance measurements and PPK GPS altitude reached good accuracy of less than 0.015 m.
86
80166
Impact of Extended Enterprise Resource Planning in the Context of Cloud Computing on Industries and Organizations
Abstract:
The Extended Enterprise Resource Planning (ERPII) system usually requires massive amounts of storage space, powerful servers, and large upfront and ongoing investments to purchase and manage the software and the related hardware, which are not affordable for organizations. In recent decades, organizations have preferred to adapt their business structures to new technologies in order to remain competitive in the world economy. Cloud computing, one of the tools of information technology (IT), is a modern system that reveals the next-generation application architecture. Cloud computing also has advantages that reduce costs in many ways, such as lower upfront costs for all computing infrastructure and lower maintenance and support costs. On the other hand, traditional ERPII does not cope with huge amounts of data and relations between organizations. In this study, based on a literature study, ERPII is investigated in the context of cloud computing, where organizations operate more efficiently. In this context, ERPII can also respond to organizations' needs regarding large amounts of data and relations between organizations.
85
80157
From E-Government to Cloud-Government Challenges of Jordanian Citizens' Acceptance for Public Services
Abstract:
At the inception of the third millennium, there is much evidence that cloud technologies have become the strategic trend for many governments, not only in developed countries (e.g., UK, Japan, and USA) but also in developing countries (e.g., Malaysia and the Middle East region), which have launched cloud computing movements for enhanced standardization of IT resources, cost reduction, and more efficient public services. Cloud-based e-government services are therefore considered one of the high priorities for government agencies in Jordan. Despite their phenomenal evolution, government cloud services still suffer from the adoption challenges of e-government initiatives (e.g., technological, human-aspect, social, and financial), which need to be considered carefully by governments contemplating their implementation. This paper presents a pilot study investigating citizens' perception of the extent to which these challenges affect the acceptance and use of cloud computing in the Jordanian public sector. Based on analysis of data collected using an online survey, some important challenges were identified. The results can help to guide successful acceptance of cloud-based e-government services in Jordan.
84
79820
Application of Wireless Sensor Networks: A Survey in Thailand
Abstract:
Today, wireless sensor networks are an important technology that works with the Internet of Things. They receive various data from many sensors, which are then sent for processing or storage over a wireless network or through the Internet. The devices around us are intelligent: they can receive, transmit, and process data and communicate through the system. There are many applications of wireless sensor networks, such as smart cities, smart farms, environmental management, and weather. This article explores the use of wireless sensor networks in Thailand, collecting data from the Thai Thesis database for 2012-2017, and examines how wireless sensor network technology is implemented. The benefit of this study is an understanding of how wireless technology is used in many fields, which will be useful for future research. The study found that wireless sensor networks are most widely used in the agriculture field, especially for smart farms. The second most common use is in the environmental field, such as weather stations and water inspection.
83
78014
Multilayer Neural Network and Fuzzy Logic Based Software Quality Prediction
Abstract:
In the software development lifecycle, the quality prediction techniques hold a prime importance in order to minimize future design errors and expensive maintenance. There are many techniques proposed by various researchers, but with the increasing complexity of the software lifecycle model, it is crucial to develop a flexible system which can cater for the factors which in result have an impact on the quality of the end product. These factors include properties of the software development process and the product along with its operation conditions. In this paper, a neural network (perceptron) based software quality prediction technique is proposed. Using this technique, the stakeholders can predict the quality of the resulting software during the early phases of the lifecycle saving time and resources on future elimination of design errors and costly maintenance. This technique can be brought into practical use using successful training.
Digital Article Identifier (DAI):
82
75889
D-Care: Diabetes Care Application to Enhance Diabetic Awareness to Diabetes in Indonesia
Abstract:
Diabetes is a common disease in Indonesia. One of its risk factors is an unhealthy diet, that is, consuming food that contains too much glucose; one source of glucose is food containing carbohydrates. The purpose of this study is to identify the amount of glucose in the food consumed. The authors use a literature study as the research method. As a result of this study, the authors expect diabetics to become more aware of diabetes by applying daily dietary regulation through D-Care. D-Care is an application that can enhance people's awareness of diabetes in Indonesia. D-Care provides two menus: nutrition calculation and healthy food. The nutrition calculation menu estimates glucose intake by calculating the food consumed each day, whereas the healthy food menu provides combinations of healthy food for diabetics. The conclusion is that D-Care is useful for reducing diabetes prevalence in Indonesia.
Digital Article Identifier (DAI):
81
75514
Water End-Use Classification with Contemporaneous Water-Energy Data and Deep Learning Network
Abstract:
‘Water-related energy’ is energy use which is directly or indirectly influenced by changes to water use. Informatics applying a range of mathematical, statistical and rule-based approaches can be used to reveal important information on demand from the available data provided at second, minute or hourly intervals. This study aims to combine these two concepts to improve on the current water end-use disaggregation problem by applying a wide range of advanced pattern recognition techniques to analyse concurrent high-resolution water-energy consumption data. The results show that recognition accuracies of all end-uses increased significantly, especially for mechanised categories, including clothes washer, dishwasher and evaporative air cooler, where over 95% of events were correctly classified.
Digital Article Identifier (DAI):
80
74866
Gradient Boosted Trees on Spark Platform for Supervised Learning in Health Care Big Data
Abstract:
Health care is one of the prominent industries that generate voluminous data, creating a need for machine learning techniques combined with big data solutions for efficient processing and prediction. Missing data, incomplete data, real-time streaming data, sensitive data, privacy and heterogeneity are a few of the common challenges to be addressed for efficient processing and mining of health care data. In comparison with other applications, accuracy and fast processing are of higher importance for health care applications, as they relate directly to human life. Though many machine learning techniques and big data solutions are used for efficient processing and prediction in health care data, different techniques and frameworks have proved effective for different applications, depending largely on the characteristics of the datasets. In this paper, we present a framework that uses the ensemble machine learning technique of gradient boosted trees for data classification in health care big data. The framework is built on the Spark platform, which is fast in comparison with other traditional frameworks. Unlike other works that focus on a single technique, our work presents a comparison of six different machine learning techniques along with gradient boosted trees on datasets of different characteristics. Five benchmark health care datasets are considered for experimentation, and the results of the different machine learning techniques are discussed in comparison with gradient boosted trees. The metrics chosen for comparison are the misclassification error rate and the run time of the algorithms. The goals of this paper are to i) compare the performance of gradient boosted trees with other machine learning techniques on the Spark platform, specifically for health care big data, and ii) discuss the results of the experiments conducted on datasets of different characteristics, thereby drawing inferences and conclusions.
The experimental results show that, for the other machine learning techniques, accuracy is largely dependent on the characteristics of the datasets, whereas gradient boosted trees yield reasonably stable accuracy without depending heavily on the dataset characteristics.
Digital Article Identifier (DAI):
79
74464
Programming without Code: An Approach and Environment to Conditions-On-Data Programming
Abstract:
This paper presents the concept of an object-based programming language where tests (if... then... else) and control structures (while, repeat, for...) disappear and are replaced by conditions on data. In keeping with the object paradigm, data are still embedded inside objects as variable-value couples, but object methods are expressed in the form of logical propositions (‘conditions on data’ or COD). For instance: variable1 = value1 AND variable2 > value2 => variable3 = value3. To implement this approach, a central inference engine cycles through the objects one after another, collecting all the CODs of each object. CODs are treated as rules in a rule-based system: the left part of each proposition (left of the ‘=>’ sign) is the premise and the right part is the conclusion. Premises are evaluated and the conclusions of those that hold are fired. Conclusions modify the variable-value couples of the object, and the engine moves on to examine the next object. The paper develops the principles of writing CODs instead of complex algorithms. Through samples, the paper also presents several hints for implementing a simple mechanism able to process this ‘COD language’. The proposed approach can be used in simulation, process control, industrial systems validation, etc. By writing simple and rigorous conditions on data, instead of using classical and long-to-learn languages, engineers and specialists can easily simulate and validate the functioning of complex systems.
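As an illustration only, the engine loop described in this abstract might be sketched in Python as follows. The `Rule`, `CodObject` and `run_engine` names, the dictionary representation of variable-value couples and the tank sample are assumptions made for this sketch; they are not taken from the paper's COD language.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]  # variable-value couples of one object


@dataclass
class Rule:
    """One condition on data: premise => conclusion (hypothetical structure)."""
    premise: Callable[[State], bool]
    conclusion: Dict[str, object]  # assignments applied when the premise holds


@dataclass
class CodObject:
    """An object holding its data and its CODs (hypothetical structure)."""
    name: str
    state: State
    rules: List[Rule]


def run_engine(objects: List[CodObject], cycles: int = 1) -> None:
    """Visit objects one after another; evaluate premises, then fire conclusions."""
    for _ in range(cycles):
        for obj in objects:
            # Evaluate every premise against the state as it is before firing.
            fired = [rule for rule in obj.rules if rule.premise(obj.state)]
            for rule in fired:
                obj.state.update(rule.conclusion)


# Shape of the sample rule: variable1 = value1 AND variable2 > value2 => variable3 = value3
tank = CodObject(
    name="tank",
    state={"valve": "open", "level": 12, "alarm": "off"},
    rules=[
        Rule(
            premise=lambda s: s["valve"] == "open" and s["level"] > 10,
            conclusion={"alarm": "on"},
        )
    ],
)

run_engine([tank])
print(tank.state["alarm"])
```

In this sketch all premises of an object are evaluated before any conclusion is applied, so the order of rules within one object does not affect which rules fire in a cycle. Whether the paper's engine makes the same choice is not stated in the abstract.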
Digital Article Identifier (DAI):
78
74462
Generic Early Warning Signals for Program Student Withdrawals: A Complexity Perspective Based on Critical Transitions and Fractals
Abstract:
Complex systems exhibit universal characteristics as they near a tipping point. Among them are common generic early warning signals which precede critical transitions. These signals include: critical slowing down, in which the rate of recovery from perturbations decreases over time; an increase in the variance of the state variable; an increase in the skewness of the state variable; an increase in the autocorrelations of the state variable; flickering between different states; and an increase in spatial correlations over time. The presence of the signals has management implications, as identifying the signals near the tipping point could allow management to identify intervention points. Despite the applications of the generic early warning signals in various scientific fields, such as fisheries, ecology and finance, a review of the literature did not identify any applications that address the program student withdrawal problem at undergraduate distance universities. This area could benefit from the application of generic early warning signals, as the program withdrawal rate amongst distance students is higher than the program withdrawal rate at face-to-face conventional universities. This research specifically assessed the generic early warning signals through an intensive case study of undergraduate program student withdrawal at a Canadian distance university. The university is non-cohort based due to its system of continuous course enrollment, where students can enroll in a course at the beginning of every month. The signals were assessed by comparing the incidences of generic early warning signals among students who withdrew or simply became inactive in their undergraduate program of study, the true positives, to the incidences of the generic early warning signals among graduates, the false positives. This was achieved through significance testing.
Research findings showed support for the signal pertaining to the rise in flickering, which is represented by the increase in the student’s non-pass rates prior to withdrawing from a program; moderate support for the signal of critical slowing down, as reflected in the increase in the time a student spends in a course; and moderate support for the signals of increase in autocorrelation and increase in variance in the grade variable. The findings did not support the signal of increase in skewness of the grade variable. The research also proposes a new signal based on the fractal-like characteristic of student behavior. The research also sought to extend knowledge by investigating whether the emergence of a program withdrawal status is self-similar or fractal-like at multiple levels of observation, specifically the program level and the course level. In other words, whether the act of withdrawal at the program level is also present at the course level. The findings moderately supported self-similarity as a potential signal. Overall, the assessment suggests that the signals, with the exception of the increase in skewness, could be utilized as a predictive management tool, with the fractal-like characteristic of withdrawal potentially serving as an additional signal in addressing the student program withdrawal problem.
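For illustration, three of the statistical signals listed in this abstract (variance, skewness and lag-1 autocorrelation of a state variable) can be computed over a sliding window as sketched below. The window length, the function names and the sample grade series are assumptions for this sketch; the abstract does not specify how the study computed the signals.

```python
from statistics import fmean
from typing import Dict, List, Sequence


def variance(xs: Sequence[float]) -> float:
    """Population variance of a window."""
    m = fmean(xs)
    return fmean([(x - m) ** 2 for x in xs])


def skewness(xs: Sequence[float]) -> float:
    """Population skewness; 0.0 when the window has no spread."""
    m = fmean(xs)
    var = variance(xs)
    if var == 0:
        return 0.0
    return fmean([(x - m) ** 3 for x in xs]) / var ** 1.5


def lag1_autocorrelation(xs: Sequence[float]) -> float:
    """Lag-1 autocorrelation; 0.0 when the window has no spread."""
    m = fmean(xs)
    denom = sum((x - m) ** 2 for x in xs)
    if denom == 0:
        return 0.0
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    return num / denom


def rolling_signals(series: Sequence[float], window: int) -> List[Dict[str, float]]:
    """Compute the three indicators for each full sliding window."""
    out = []
    for end in range(window, len(series) + 1):
        w = series[end - window:end]
        out.append({
            "variance": variance(w),
            "skewness": skewness(w),
            "autocorrelation": lag1_autocorrelation(w),
        })
    return out


# Hypothetical grade series for one student, oldest course first.
grades = [80, 82, 79, 81, 70, 88, 60, 85, 52, 40]
signals = rolling_signals(grades, window=5)
# A rising indicator across windows is what an early warning signal looks for.
rising = signals[-1]["variance"] > signals[0]["variance"]
print(len(signals), rising)
```

A trend test on each indicator across windows would be one way to turn these numbers into a signal; the choice of test is left open here.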
Digital Article Identifier (DAI):
77
73667
Foslip Loaded and CEA-Affimer Functionalised Silica Nanoparticles for Fluorescent Imaging of Colorectal Cancer Cells
Abstract:
Introduction: There is a need for real-time imaging of colorectal cancer (CRC) to allow surgery to be tailored to the disease stage. Fluorescence-guided laparoscopic imaging of primary colorectal cancer and the draining lymphatics would potentially bring stratified surgery into clinical practice and realign future CRC management to the needs of patients. Fluorescent nanoparticles can offer many advantages in terms of intra-operative imaging and therapy (theranostic) in comparison with traditional soluble reagents. Nanoparticles can be functionalised with diverse reagents and then targeted to the correct tissue using an antibody or Affimer (artificial binding protein). We aimed to develop and test fluorescent silica nanoparticles targeted against CRC using an anti-carcinoembryonic antigen (CEA) Affimer (Aff). Methods: Anti-CEA and control Myoglobin Affimer binders were subcloned into the expression vector pET11, followed by transformation into BL21 Star™ (DE3) E. coli. The expression of Affimer binders was induced using 0.1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG). Cells were harvested and lysed, and the binders were purified using nickel chelating affinity chromatography. The photosensitiser Foslip (soluble analogue of 5,10,15,20-Tetra(m-hydroxyphenyl) chlorin) was incorporated into the core of silica nanoparticles using a water-in-oil microemulsion technique. Anti-CEA or control Affs were conjugated to the silica nanoparticle surface using the sulfosuccinimidyl-4-(N-maleimidomethyl) cyclohexane-1-carboxylate (sulfo SMCC) chemical linker. Binding of CEA-Aff or control nanoparticles to colorectal cancer cells (LoVo, LS174T and HCT116) was quantified in vitro using confocal microscopy. Results: The molecular weight of the obtained Affimer bands was ~12.5 kDa, while the diameter of the functionalised silica nanoparticles was ~80 nm.
CEA-Affimer targeted nanoparticles demonstrated 9.4, 5.8 and 2.5 fold greater fluorescence than control in LoVo, LS174T and HCT116 cells respectively (p < 0.002) for the single slice analysis. A similar pattern of successful CEA-targeted fluorescence was observed in the maximum image projection analysis, with CEA-targeted nanoparticles demonstrating 4.1, 2.9 and 2.4 fold greater fluorescence than control particles in LoVo, LS174T and HCT116 cells respectively (p < 0.0002). There was no significant difference in fluorescence for CEA-Affimer vs. CEA-Antibody targeted nanoparticles. Conclusion: We are the first to demonstrate that Foslip-doped silica nanoparticles conjugated to anti-CEA Affimers via SMCC allow tumour cell-specific fluorescent targeting in vitro, and they showed sufficient promise to justify testing in an animal model of colorectal cancer. CEA-Affimer appears to be a suitable targeting molecule to replace CEA-Antibody. Targeted silica nanoparticles loaded with Foslip photosensitiser are now being optimised to drive photodynamic killing via reactive oxygen generation.
Digital Article Identifier (DAI):
76
73588
Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency
Abstract:
Clustering is a well-known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can contain either categorical or numeric data, and each type of data has its own specific clustering algorithm. In this context, two algorithms have been proposed: k-means for clustering numeric datasets and k-modes for categorical datasets. The main problem encountered in data mining applications is clustering the categorical data that is so prevalent in these datasets. One way to cluster categorical values is to transform the categorical attributes into numeric measures and apply the k-means algorithm directly instead of k-modes. In this paper, we experiment with an approach of this kind, transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previous method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are evaluated experimentally. The results show that the proposed method outperforms the binary method in all cases.
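The transformation step can be sketched as below: each categorical value is replaced by the relative frequency of that value (modality) within its attribute, after which the rows are numeric and can be passed to a k-means implementation. The function name and the toy dataset are assumptions for this sketch; the abstract does not give the authors' exact procedure.

```python
from collections import Counter
from typing import List, Sequence


def relative_frequency_encode(rows: Sequence[Sequence[str]]) -> List[List[float]]:
    """Replace each categorical value by its relative frequency in its column."""
    n = len(rows)
    if n == 0:
        return []
    columns = list(zip(*rows))  # one tuple of values per attribute
    counts = [Counter(col) for col in columns]
    return [
        [counts[j][value] / n for j, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical toy dataset with two attributes: (colour, size)
data = [
    ("red", "small"),
    ("red", "large"),
    ("blue", "small"),
    ("red", "small"),
]
encoded = relative_frequency_encode(data)
for row in encoded:
    print(row)
```

With this encoding, two modalities that occur equally often in an attribute map to the same number, so they become indistinguishable to k-means on that attribute; this is worth checking on real data before clustering.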
Digital Article Identifier (DAI):