In the big data era, much real-world data can be naturally represented as graphs. Consequently, many application problems can be modelled as graph processing tasks. Graph processing, especially the processing of large-scale graphs with billions or even hundreds of billions of vertices and edges, has attracted much attention in both industry and academia. Processing such large-scale graphs remains a great challenge, and researchers have been seeking new solutions. Because of the massive degree of parallelism and the high memory access bandwidth of GPUs, utilizing GPUs to accelerate graph processing has proved to be a promising approach. This paper surveys the key issues of graph processing on GPUs, including data layout, memory access pattern, workload mapping and GPU-specific programming. We summarize the state-of-the-art research on GPU-based graph processing, analyze the existing challenges in detail, and explore future research opportunities.
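Data layout is central to GPU graph processing: most frameworks store the graph in compressed sparse row (CSR) form, so that each vertex's neighbours occupy one contiguous run of memory and adjacent threads can read them with coalesced accesses. As a minimal sketch (not tied to any particular framework discussed in the survey), CSR can be built from an edge list like this:

```python
# Build a CSR (compressed sparse row) adjacency from a directed edge list.
# row_ptr[u]..row_ptr[u+1] delimits the contiguous neighbour run of vertex u.
def to_csr(num_vertices, edges):
    row_ptr = [0] * (num_vertices + 1)
    for u, _ in edges:                 # count out-degree of each vertex
        row_ptr[u + 1] += 1
    for i in range(num_vertices):      # prefix-sum into row offsets
        row_ptr[i + 1] += row_ptr[i]
    col_idx = [0] * len(edges)
    nxt = list(row_ptr[:-1])           # next free slot per row
    for u, v in edges:
        col_idx[nxt[u]] = v
        nxt[u] += 1
    return row_ptr, col_idx

edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
row_ptr, col_idx = to_csr(3, edges)
print(row_ptr, col_idx)   # [0, 2, 3, 4] [1, 2, 2, 0]
```

The neighbours of vertex `u` are then `col_idx[row_ptr[u]:row_ptr[u+1]]`, a layout that maps naturally onto GPU thread blocks.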
Organizations are exposed to threats that increase the risk to their ICT systems, and assuring their protection is crucial, as reliance on information technology is a continuing challenge for both security experts and chief executives. To tackle these threats, decision makers should be provided with the information needed to understand and mitigate them. Risk assessment is a means of providing such information and facilitates the development of a security strategy. This paper addresses the problem of selecting an appropriate risk assessment method for assessing and managing information security risks by proposing a set of 17 criteria, grouped into 4 categories, for comparing such methods, and by providing a comparison of the 10 most popular methods based upon them. Finally, the comparison presented in the paper can be used by organizations to determine which method is most suitable for their needs.
Smartphone applications to support healthcare are proliferating. A growing and important subset of these apps supports emergency medical intervention, addressing a wide range of illness-related emergencies in order to speed the arrival of relevant treatment. The emergency response characteristics and strategies employed by these apps are the focus of this study, which results in an mHealth Emergency Strategy Index (MESI). While a growing body of knowledge focuses on the usability, safety and privacy aspects that characterize such apps, studies that map the various emergency intervention strategies and suggest criteria to evaluate their role as emergency agents are limited. We survey an extensive range of mHealth apps designed for emergency response, along with the related assessment literature, and present an index for mobile-based medical emergency intervention apps that can address the assessment needs of future mHealth apps.
The main achievements of spatio-temporal modelling in the field of Geographic Information Science over the past three decades are surveyed. This article offers an overview of: (i) the origins and history of Temporal Geographic Information Systems (T-GIS); (ii) relevant spatio-temporal data models proposed; (iii) the evolution of spatio-temporal modelling trends; and (iv) an analysis of the future trends and developments in T-GIS. It also presents some current theories and concepts that have emerged from the research performed, as well as a summary of the current progress and the upcoming challenges and potential research directions for T-GIS. One relevant result of this survey is the proposed taxonomy of spatio-temporal modelling trends, which classifies 186 modelling proposals surveyed from more than 1400 articles.
Online social networks (OSNs) are structures that help users interact, exchange, and propagate new ideas. Identifying the most influential users in OSNs is a significant task, both for accelerating information propagation, as in marketing applications, and for hindering the dissemination of unwanted content such as viruses, negative online behaviors, and rumors. This paper presents a detailed survey of influential user identification algorithms and their performance evaluation approaches in OSNs. The survey covers recent techniques, applications, and open research issues in influential user identification in OSNs.
The rapid development of cloud computing has promoted a wide deployment of data and computation outsourcing by resource-limited entities to cloud providers. On a pay-per-use basis, clients without enough computational power can easily outsource large-scale computational tasks to the cloud. Nonetheless, security and privacy are a major concern when customers' confidential or sensitive data is processed and output is generated in not fully trusted cloud environments. Recently, a number of publications have investigated and designed secure outsourcing schemes for different computational tasks. The aim of this survey is to systematize and present the cutting-edge technologies in this area. It starts by presenting security threats and requirements, followed by other factors that should be considered in secure computation outsourcing constructions. We then examine, in an organized way, the existing secure computation outsourcing solutions for different computational tasks, such as matrix computations and mathematical optimization, addressing both the confidentiality of data and the integrity of results. Finally, we offer a discussion of the literature and provide a list of open challenges in the area.
The continuously increasing cost of the US healthcare system has received significant attention. Central to the ideas aimed at curbing this trend is the use of technology, in the form of the mandate to implement electronic health records (EHRs). EHRs consist of patient information such as demographics, medications, laboratory test results, diagnosis codes and procedures. Mining EHRs could lead to improvement in patient health management as EHRs contain detailed information related to disease prognosis for large patient populations. In this manuscript, we provide a structured and comprehensive overview of data mining techniques for modeling EHR data. We first provide a detailed understanding of the major application areas to which EHR mining has been applied and then discuss the nature of EHR data and its accompanying challenges. Next, we describe major approaches used for EHR mining, the metrics associated with EHRs, and the various study designs. With this foundation, we then provide a systematic and methodological organization of existing data mining techniques used to model EHRs and discuss ideas for future research.
It is unlikely that a hacker can compromise sensitive data that is stored in encrypted form. However, when data is to be processed, it has to be decrypted, becoming vulnerable to attacks. Homomorphic encryption fixes this vulnerability by allowing one to compute directly on encrypted data. In this survey, both previous and current Somewhat Homomorphic Encryption (SHE) schemes are reviewed, and the more powerful and recent Fully Homomorphic Encryption (FHE) schemes are comprehensively studied. The concepts that support these schemes are presented, and their performance and security are analyzed from an engineering standpoint.
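As a concrete illustration of computing directly on encrypted data, the following toy Paillier cryptosystem (an additively homomorphic scheme, offered here only as an accessible example and not one of the SHE/FHE schemes the survey studies) shows multiplying two ciphertexts decrypting to the sum of the plaintexts. The primes are deliberately tiny; the parameters are illustrative, not secure:

```python
import math, random

# Toy Paillier cryptosystem (additively homomorphic) with tiny primes.
# These parameters are far too small to be secure -- illustration only.
p, q = 17, 19
n = p * q                                      # public modulus
n2 = n * n
g = n + 1                                      # standard generator choice
lam = math.lcm(p - 1, q - 1)                   # private key lambda
mu = pow((pow(g, lam, n2) - 1) // n, -1, n)    # private key mu

def encrypt(m):
    r = random.randrange(2, n)                 # fresh randomness per message
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return ((pow(c, lam, n2) - 1) // n) * mu % n

c1, c2 = encrypt(5), encrypt(7)
c_sum = (c1 * c2) % n2        # multiplying ciphertexts adds plaintexts
print(decrypt(c_sum))         # 12
```

The homomorphic property is exactly what lets an untrusted party add encrypted values without ever seeing them; SHE and FHE schemes extend this idea to richer (and eventually arbitrary) computations.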
The Experience Sampling Method (ESM) is used by scientists from various disciplines to gather insights into the intrapsychic elements of human life. Researchers have used the ESM in a wide variety of studies, with the method seeing increased popularity. Mobile technologies have enabled new possibilities for the use of the ESM, while simultaneously leading to new conceptual, methodological, and technological challenges. In this survey, we provide an overview of the history of the ESM, usage of this methodology in the computer science discipline, as well as its evolution over time. Next, we identify and discuss important considerations for ESM studies on mobile devices, and analyse the particular methodological parameters scientists should consider in their study design. We reflect on the existing tools that support the ESM methodology and discuss the future development of such tools. Finally, we discuss the effect of future technological developments on the use of the ESM and identify areas requiring further investigation.
Despite the rapid growth of hardware capacity and the popularity of mobile devices, limited battery and processing resources still fail to meet increasing mobile users' demands. Both conventional techniques and emerging approaches have been brought together to fill this gap between user demand and the mobile device's limited capacity. In recent years, cloud computing has risen in both business and academia as a way to eliminate this gap. Augmentation techniques such as computation outsourcing and service-oriented architectures have been proposed, and new challenges regarding these techniques, energy efficiency, etc., need to be studied. In this paper, we aim to provide a comprehensive taxonomy and survey of the existing techniques and frameworks for mobile cloud augmentation in terms of both computation and storage. Different from the existing taxonomies in this field, we focus on the techniques, following the idea of realizing a complete mobile cloud computing system. The objective of this survey is to provide a guide to the available augmentation techniques that can be adopted in mobile cloud computing systems, as well as to supporting mechanisms such as decision making and fault tolerance policies for realizing reliable mobile cloud services.
Firewalls are network security components that handle incoming and outgoing network traffic based on a set of rules. The process of correctly configuring a firewall is complicated and error-prone, and it worsens as the network complexity grows. A poorly configured firewall may result in major security threats: in the case of a network firewall, an organization's security could be endangered, and in the case of a personal firewall, an individual computer's security is threatened. A major reason for poorly configured firewalls, as pointed out in the literature, is usability issues. Our aim is to identify existing solutions that help professional and non-professional users to create and manage firewall configuration files, and to analyze these proposals with respect to usability. A systematic literature review with a focus on the usability of firewall configuration is presented in the paper. Its main goal is to explore what has already been done in this field. In the primary selection procedure, 1,202 papers were retrieved and then screened. The secondary selection led us to 35 papers carefully chosen for further investigation, of which 14 papers were selected and summarized.
Large volumes of spatio-temporal data are increasingly collected and studied in diverse domains, including climate science, social sciences, neuroscience, epidemiology, transportation, mobile health, and Earth sciences. Spatio-temporal data differs from the relational data for which the data mining community has developed computational approaches over multiple decades, in that both spatial and temporal attributes are available in addition to the actual measurements/attributes. The presence of these attributes introduces additional challenges that need to be dealt with. Approaches for mining spatio-temporal data have been studied for over a decade in the data mining community. In this article we present a broad survey of this relatively young field of spatio-temporal data mining. We discuss different types of spatio-temporal data and the relevant data mining questions that arise in the context of analyzing each of these datasets. Based on the nature of the data mining problem studied, we classify the literature on spatio-temporal data mining into six major categories: clustering, predictive learning, change detection, frequent pattern mining, anomaly detection, and relationship mining. We discuss the various forms of spatio-temporal data mining problems in each of these categories.
Storage as a Service (StaaS) forms a critical component of cloud computing by offering the vision of a virtually infinite pool of storage resources. It supports a variety of cloud-based data store classes in terms of availability, scalability, ACID (Atomicity, Consistency, Isolation, Durability) properties, data models, and price options. Despite many open challenges within cloud-based data stores, application providers deploy Geo-replicated data stores in order to obtain higher availability, lower response time, and more cost efficiency. The deployment of Geo-replicated data stores is in its infancy and poses vital challenges for researchers. In this paper, we first discuss the key advantages and challenges of data-intensive applications deployed within and across cloud-based data stores. Then, we provide a comprehensive taxonomy that covers key aspects of cloud-based data stores: data model, data dispersion, data consistency, data transaction service, and data cost optimization. Finally, we map various cloud-based data store projects to our proposed taxonomy, not only to validate the taxonomy but also to identify areas for future research.
Optical on-chip data transmission enabled by silicon photonics is widely considered a key technology to overcome the bandwidth and energy limitations of electrical interconnects. The possibility of utilizing optical links in the on-chip communication fabric has paved the way to a fascinating new research field - Optical Networks-on-Chip (ONoCs) - which has been gaining large interest in the community. Nanophotonic devices and materials, however, are still evolving, and dealing with optical data transmission on chip makes designers and researchers face a whole new set of obstacles and challenges. Designing efficient ONoCs is a challenging task and requires a detailed knowledge from on-chip traffic demands and patterns down to the physical layout and implications of integrating both electronic and photonic devices. In this paper, we provide an exhaustive review of recent ONoC proposals, discuss their strengths and weaknesses, and outline outstanding research questions. Moreover, we discuss recent research efforts in key enabling technologies, such as on-chip and adaptive laser sources, automatic synthesis tools, and ring heating techniques, which are essential to enable a widespread commercial adoption of ONoCs in the future.
The need to handle (process and store) massive amounts of data (Big Data) is a reality. In areas such as scientific experiments, social networks, credit card fraud detection, and financial analysis, massive amounts of information are generated and processed daily to extract valuable, summarized information. Due to their fast development cycle (i.e., lower development cost), owing mainly to automatic memory management and rich community resources, managed object-oriented programming languages (such as Java) are the first choice for developing Big Data platforms (e.g., Cassandra, Spark) on which such Big Data applications are executed. However, automatic memory management comes at a cost. This cost is introduced by the Garbage Collector, which is responsible for collecting objects that are no longer being used. In this work, we study current Big Data platforms and their memory profiles to understand why classic algorithms (which are still the most common) are not appropriate, and we also analyze recently proposed and relevant memory management algorithms targeted at Big Data environments. We characterize the scalability of recent memory management algorithms in terms of throughput (improving application throughput) and pause time (reducing application latency) when compared to classic algorithms. We conclude our study by presenting a taxonomy of the described works.
Underwater wireless sensor networks (UWSNs) --- formed by underwater sensor nodes with sensing, processing, storage and underwater wireless communication capabilities --- will pave the way for a new era of underwater monitoring and actuation applications. UWSNs have become a fast-growing field. The envisioned landscape of applications enabled by UWSNs has tremendous potential to change the current reality, in which no more than 5% of the volume of the oceans has been explored. However, to enable large deployments of UWSNs, networking solutions for efficient underwater data collection need to be investigated and proposed. The suitable, autonomous and on-the-fly organization of the UWSN topology, through topology control algorithms, might mitigate undesired effects of underwater wireless communication and, consequently, improve networking services and protocols. In this paper, therefore, we highlight the potential of topology control for underwater sensor networks. We propose to classify topology control algorithms, based on the principal methodology used to change the network topology, into three major groups: power control, wireless interface mode management and mobility-assisted techniques. On the basis of the proposed classification, we survey the current state-of-the-art and present an in-depth discussion of topology control solutions designed for UWSNs.
As applications and operating systems become more complex, the last decade has seen the rise of many tracing tools across the software stack. This paper presents a hands-on comparison of modern tracers on Linux systems, both in user space and kernel space. The authors implement microbenchmarks that not only quantify the overhead of different tracers, but also sample fine-grained metrics that unveil insights into the tracers' internals and show the cause of each tracer's overhead. Internal design choices and implementation particularities are discussed, which helps in understanding the challenges of developing tracers. Furthermore, this analysis aims to help users choose and configure their tracers based on their specific requirements in order to reduce overhead and get the most out of them.
Feature selection has proven to be effective and efficient in preparing high-dimensional data for data mining and machine learning problems. Its objectives include building simpler and more comprehensible models, improving data mining performance, and preparing clean, understandable data. The recent proliferation of big data has presented substantial challenges and opportunities for feature selection algorithms. In this survey, we provide a comprehensive and structured overview of recent advances in feature selection research. In particular, we revisit feature selection research from a data perspective and review representative feature selection algorithms for generic data, structured data, heterogeneous data and streaming data. Methodologically, to emphasize the differences and similarities among most existing feature selection algorithms for generic data, we categorize them into four groups: similarity-based, information-theoretical, sparse-learning-based and statistical methods. To facilitate and promote research in this community, we also present an open-source feature selection repository that contains most of the popular feature selection algorithms (http://featureselection.asu.edu/), and we use it as an example to show how to evaluate feature selection algorithms. Finally, we discuss open problems and challenges that deserve more attention in future research.
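As a minimal sketch of one of these groups, an information-theoretical filter ranks each feature by its mutual information with the label and keeps the top-scoring ones. The toy data below is hypothetical, and the code is a from-scratch illustration rather than anything drawn from the cited repository:

```python
from collections import Counter
from math import log2

def mutual_information(x, y):
    """Mutual information between two discrete sequences, in bits."""
    n = len(x)
    px, py = Counter(x), Counter(y)
    pxy = Counter(zip(x, y))
    return sum(c / n * log2((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())

# Toy data: three binary features and a binary label.
y = [0, 0, 1, 1, 0, 1, 0, 1]
X = [
    [0, 0, 1, 1, 0, 1, 0, 1],   # copies the label -> maximal MI
    [0, 1, 0, 1, 0, 1, 0, 1],   # weakly related to the label
    [1, 1, 1, 1, 1, 1, 1, 1],   # constant -> zero MI
]

scores = [mutual_information(f, y) for f in X]
ranked = sorted(range(len(X)), key=lambda i: -scores[i])
print(ranked)   # [0, 1, 2]
```

Filters like this score each feature independently of any learning model; the sparse-learning and statistical groups instead embed selection into model fitting or hypothesis testing.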
Networks are used to represent relationships between entities in many complex systems, spanning from online social networks to biological cell development and brain activity. These networks model relationships that present various challenges. In some cases, relationships between entities are unambiguously known: are two users friends in a social network? Do two researchers collaborate on a published paper? Do two road segments in a transportation system intersect? These relationships are unambiguous and directly observable in the system in question. In other cases, relationships between nodes are not directly observable and must be inferred: does one gene regulate the expression of another? Do two animals who physically co-locate have a social bond? Who infected whom in a disease outbreak? Existing approaches use specialized knowledge from their home domains to infer networks and measure the goodness of the inferred network for a specific task. However, current research lacks a rigorous validation framework that employs standard statistical validation. In this survey, we examine how network representations are learned from non-network data, the variety of questions and tasks posed on these data across several domains, and validation strategies for measuring the inferred network's capability to answer questions about the original system of interest.
Approximate computing has recently gained research attention as a way to increase energy efficiency and/or performance by exploiting some applications' intrinsic error resiliency. However, little attention has been given to its potential for tackling the communication bottleneck, which remains one of the looming challenges for efficient parallelism. This paper explores the potential benefits of approximate computing for communication reduction by surveying four promising techniques for approximate communication: compression, relaxed synchronization, value prediction, and accelerators. The techniques are compared based on an evaluation framework composed of communication cost reduction, performance, energy reduction, application domain, overheads, and output degradation. The comparison shows that lossy link compression and approximate value prediction are good choices for reducing the communication bottleneck in bandwidth-constrained applications, while relaxed synchronization and approximate accelerators can achieve greater speedups on applications amenable to these techniques. Finally, this paper also offers several suggestions for future research on approximate communication techniques.
Web application providers have been migrating their applications to cloud data centers, attracted by the emerging cloud computing paradigm. One of the appealing features of the cloud is elasticity. It allows cloud users to acquire or release computing resources on demand, which enables web application providers to auto-scale the resources provisioned to their applications under dynamic workload in order to minimize resource cost while satisfying Quality of Service (QoS) requirements. In this paper, we comprehensively analyze the challenges that remain in auto-scaling web applications in clouds and review the developments in this field. We present a taxonomy of auto-scaling systems according to the identified challenges and key properties. We analyze the surveyed works and map them to the taxonomy to identify weaknesses in this field. Moreover, based on this analysis, we propose new future directions.
The recent diversity of storage demands has revealed various shortcomings of traditional RDBMSs, which in turn has led to the emergence of a new trend of complementary non-relational data management solutions, termed NoSQL (Not only SQL). This survey presents the work that has been conducted with regard to four closely related concepts of NoSQL stores: data model, consistency model, data partitioning and replication. For each concept, its different protocols, and for each protocol, the corresponding features, strengths and drawbacks are explained. Furthermore, various implementations of each protocol are exemplified and crystallized through a collection of representative academic and industrial NoSQL technologies. The rationale behind each design decision, along with corresponding extensions and improvements, is discussed. Finally, we disclose some existing challenges in developing effective NoSQL stores, which need attention from the research community, application designers and architects.
Network-enabled sensing and actuation devices are key enablers in connecting real-world objects to the cyber world. The Internet of Things (IoT) uses these network-enabled devices and communication technologies to allow the connectivity and integration of physical objects (Things) from the real world into the data-driven digital world (the Internet). Enormous amounts of dynamic IoT data are collected from Internet-connected devices. IoT data, however, often consists of multi-variate streams that are heterogeneous, sporadic, multi-modal and spatio-temporal. IoT data can be disseminated at different granularities and have diverse structures, types and qualities. Dealing with the data deluge from heterogeneous IoT resources and services imposes challenges on the indexing, discovery and ranking mechanisms used to build applications that require on-line access and retrieval of IoT data. However, existing IoT data indexing and discovery approaches are either complex (usually based on formal and logical methods) or centralised, which hinders their scalability. The primary objective of this paper is to provide a holistic overview of the state-of-the-art in indexing, discovering and ranking IoT data. We discuss on-line analysis and fast responses to complex queries. The paper aims to pave the way for researchers to design, develop, implement and evaluate techniques and approaches for future on-line large-scale distributed IoT applications and platforms.
Authenticated encryption (AE) has long been a vital operation in cryptography due to its ability to provide confidentiality, integrity and authenticity at the same time. Its use has soared in parallel with the widespread use of the Internet and has led to several new schemes. There have already been studies investigating the software performance of various schemes. However, the same is yet to be done for hardware. In this paper, we present a comprehensive survey of the hardware performance of the most commonly used authenticated encryption schemes in the literature. These schemes include the encrypt-then-MAC combination, block-cipher-based AE modes, relatively new authenticated encryption ciphers and the recently introduced permutation-based AE scheme. For completeness, we implemented each scheme with various standardized block ciphers and/or hash algorithms, and their lightweight versions. In our evaluation, we targeted minimizing the time-area product while maximizing throughput on ASIC platforms. The 45nm NANGATE Open Cell Library was used for syntheses. We present area, speed, time-area product, throughput, and power figures for both the standard and lightweight versions of each scheme. Finally, we provide an unbiased discussion of the impact of the structure and complexity of each scheme on hardware implementation, together with recommendations on hardware-friendly authenticated encryption scheme design.
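The structure of the encrypt-then-MAC combination can be sketched as follows. Here a SHA-256 counter-mode keystream stands in for a real block cipher purely to keep the example self-contained; this is a structural illustration, not a secure or recommended construction:

```python
import hmac, hashlib, os

def keystream(key, nonce, length):
    """Toy keystream from SHA-256 in counter mode (NOT a vetted cipher)."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt_then_mac(enc_key, mac_key, nonce, plaintext):
    # Step 1: encrypt; step 2: MAC the ciphertext (plus nonce).
    ct = bytes(a ^ b for a, b in zip(plaintext, keystream(enc_key, nonce, len(plaintext))))
    tag = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return ct, tag

def decrypt(enc_key, mac_key, nonce, ct, tag):
    # Verify the tag BEFORE decrypting -- the point of encrypt-then-MAC.
    expected = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return bytes(a ^ b for a, b in zip(ct, keystream(enc_key, nonce, len(ct))))

ek, mk, nonce = os.urandom(32), os.urandom(32), os.urandom(12)
ct, tag = encrypt_then_mac(ek, mk, nonce, b"attack at dawn")
print(decrypt(ek, mk, nonce, ct, tag))   # b'attack at dawn'
```

Rejecting forged ciphertexts before any decryption work is what makes this composition generically secure, and in hardware it also lets the MAC and cipher datapaths be pipelined independently.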
Shape-changing interfaces are physically tangible, interactive devices, surfaces or spaces. Over the last fifteen years, research has produced functional prototypes across many application areas, and reviews have identified themes and possible future directions but have not yet examined design- or application-oriented research. Here we gather this information together to provide a reference for designers and researchers wishing to build upon existing prototyping work, synthesizing and discussing existing shape-changing interface reviews and comprehensively analyzing and classifying 78 shape-changing interfaces. Eight categories of prototype are identified, alongside recommendations for the field.
Crowd-centric research is receiving increasingly more attention as data sets on crowd behavior become readily available. We have come to a point where many of the models on pedestrian analytics introduced in the last decade, most of which have not been validated, can now be tested using real-world data sets. In this survey we concentrate exclusively on automatically gathering such data sets, which we refer to as sensing the behavior of pedestrians. We roughly distinguish two approaches: one that requires users to explicitly use local applications and wearables, and one that scans for the presence of handheld devices such as smartphones. We come to the conclusion that despite the numerous reports in popular media, relatively few groups have been looking into practical solutions for sensing pedestrian behavior. Moreover, we find that much work is still needed, in particular when it comes to combining privacy, transparency, scalability, and ease of deployment. We report on over 90 relevant articles and discuss and compare in detail 30 reports on sensing pedestrian behavior.
Geomagnetism has recently attracted considerable attention for indoor localization due to its pervasiveness and independence from extra infrastructure. Its location signature has been observed to be temporally stable and spatially discernible for localization purposes. This survey investigates the recent challenges and advances in geomagnetism-based indoor localization using smartphones. We first study smartphone-based geomagnetism measurements. We then review recent efforts in database construction and computation reduction, followed by state-of-the-art schemes for localizing the target. For each category, we identify practical deployment challenges and compare related studies. Finally, we summarize future directions and provide guidelines for new researchers in this field.
The last decades have seen a growing interest in and demand for collaborative systems and platforms. These systems and platforms aim to provide an environment in which users can collaboratively create, share and manage resources. While offering attractive opportunities for online collaboration and information sharing, they also open several security and privacy issues. This has attracted several research efforts towards the design and implementation of novel access control solutions that can handle the complexity introduced by collaboration. Despite these efforts, transition to practice has been hindered by the lack of maturity of the proposed solutions. The access control solutions typically adopted by commercial collaborative systems, such as online social network websites and collaborative editing platforms, are still rather rudimentary and do not provide users with sufficient control over their resources. This survey examines the growing literature on access control for collaborative systems centered on communities, and identifies the main challenges to be addressed in order to facilitate the adoption of collaborative access control solutions in real-life settings. Based on the literature study, we delineate a roadmap for future research in the area of access control for community-centered collaborative systems.
Owing to the widespread adoption of GPS-enabled devices, such as smartphones and GPS navigation devices, more and more location information is being collected. Compared with traditional recommender systems (e.g., those of Amazon, Taobao and Dangdang), recommender systems built on location-based social networks (LBSNs) have received much attention. The former mine users' preferences through the relationship between users and items, e.g., online commodities, movies and music; based on these preferences, items in which users may be interested are recommended to help them find items they may like. The latter add location as a new dimension, resulting in a three-dimensional relationship among users, locations and activities; based on this relationship, locations, activities and friends can be recommended to users. For example, users can check in at different locations on Facebook and Foursquare using their GPS-enabled devices, and these check-ins can be further used to analyze their preferences. In this paper, we review the objectives and state-of-the-art of LBSN recommender systems and indicate potential research directions.
This article presents an annotated bibliography on automatic software repair. Automatic software repair consists of automatically finding a solution to software bugs, without human intervention. The uniqueness of this article is that it spans the research communities that contribute to this body of knowledge: software engineering, dependability, operating systems, programming languages and security. Furthermore, it provides a novel and structured overview of the diversity of bug oracles and repair operators used in the literature.
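As a concrete (and deliberately simplified) illustration of the generate-and-validate style of repair covered by this literature, the sketch below pairs a test suite acting as the bug oracle with a few string-level repair operators; the buggy function, the mutation space and all names are hypothetical, not drawn from any surveyed system.

```python
# Minimal generate-and-validate program repair sketch (illustrative only).
# The bug oracle is a test suite; the repair operators are simple
# string mutations over the candidate source.

# Hypothetical buggy implementation: should return the max of a and b,
# but uses the wrong comparison operator.
BUGGY_SOURCE = "lambda a, b: a if a < b else b"

# Tiny mutation space: swap comparison operators.
REPAIR_OPERATORS = [("<", ">"), (">", "<"), ("<", "<="), ("<", ">=")]

# Test suite acting as the bug oracle.
TESTS = [((1, 2), 2), ((5, 3), 5), ((4, 4), 4)]

def passes_tests(source):
    candidate = eval(source)  # acceptable for this toy sketch only
    return all(candidate(*args) == expected for args, expected in TESTS)

def repair(source):
    """Apply each repair operator in turn; return the first patched
    variant that passes the whole test suite, or None."""
    for old, new in REPAIR_OPERATORS:
        patched = source.replace(old, new, 1)
        if patched != source and passes_tests(patched):
            return patched
    return None

fixed = repair(BUGGY_SOURCE)
print(fixed)  # a patched source that passes every test
```

Real repair systems operate on abstract syntax trees with far richer operator sets and use the test suite (or another oracle) to validate candidate patches, but the validate-against-oracle loop is the same shape.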
High Performance Computing (HPC) clouds are becoming an alternative to on-premise clusters for executing traditional scientific applications and analytics business services. Most research efforts in HPC cloud aim to understand the cost-benefit of moving resource-intensive applications from dedicated on-premise environments to shared public cloud platforms. Industry trends show hybrid environments are the natural path to get the best of on-premise and cloud resources---steady (and sensitive) workloads can run on on-premise resources and peak demand can leverage remote resources in a pay-as-you-go manner. Nevertheless, there are plenty of questions to be answered in HPC cloud, ranging from how to extract the best performance from an unknown underlying platform to what services are essential to ease its usage. Moreover, the discussion on the right pricing and contractual models that will fit both small and large users is relevant for the sustainability of HPC clouds. This paper presents a survey and taxonomy of efforts in HPC cloud and a vision of what we believe lies ahead, including a set of research challenges that, once tackled, can help advance businesses and scientific discoveries. This becomes particularly relevant due to the fast-growing wave of new HPC applications coming from big data and artificial intelligence.
Automatic machine-based Facial Expression Analysis (FEA) has witnessed substantial progress in the past few decades, motivated by its importance in psychology, security, health, entertainment and human-computer interaction. However, the vast majority of current studies are based on non-occluded faces collected in controlled laboratory environments, and automatic expression recognition from partially occluded faces remains a largely unresolved problem, particularly in real-world scenarios. In recent years, increasing efforts have been directed at investigating techniques to handle partial occlusion for FEA. This survey provides a comprehensive review of the recent advances in dataset creation, algorithm development, and investigations of the effects of occlusion, which are crucial in system design and evaluation. It also outlines existing challenges in overcoming partial occlusion and discusses possible opportunities for advancing the technology. To the best of our knowledge, it is the first FEA survey dedicated to occlusion and intended to serve as a starting point for promoting future work.
Vehicular networks and their associated technologies enable an extremely varied plethora of applications and therefore attract increasing attention from a wide audience. However, vehicular networks also face many challenges that arise mainly from their dynamic and complex environment. Fuzzy Logic, known for its ability to deal with complexity and imprecision and to model non-deterministic problems, is a very promising technology for use in such a dynamic and complex context. This paper presents the first comprehensive survey of research on Fuzzy Logic approaches in the context of vehicular networks, and provides fundamental information that enables readers to design their own Fuzzy Logic systems in this context. As such, the paper describes Fuzzy Logic concepts with an emphasis on their implementation in vehicular networks, includes a classification and thorough analysis of Fuzzy Logic-based solutions in vehicular networks, and discusses how Fuzzy Logic could empower the key research directions in 5G-enabled vehicular networks, the next generation of vehicular communications.
Metamorphic testing is an approach to both test case generation and test result verification. Its central element is a set of metamorphic relations, which are necessary properties of the target function or algorithm relating multiple inputs to their expected outputs. Since its first publication, we have witnessed a rapidly growing body of work examining metamorphic testing from various perspectives, including metamorphic relation identification, test case generation, integration with other software engineering techniques, and the validation and evaluation of software systems. In this paper, we review current research on metamorphic testing and discuss the challenges yet to be addressed. We also present visions for further improving metamorphic testing and highlight opportunities for new research.
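The idea of a metamorphic relation can be made concrete with the classic sine example (our own illustration, not drawn from any particular surveyed work): rather than requiring an exact oracle, the test checks a necessary property relating the output of a source test case to that of a follow-up test case.

```python
import math

def metamorphic_test_sin(f, inputs, tol=1e-9):
    """Check the metamorphic relation f(x) == f(pi - x).
    No exact oracle for f is needed: only the relation between the
    outputs of the source and follow-up test cases is verified.
    Returns the inputs for which the relation is violated."""
    failures = []
    for x in inputs:
        source_output = f(x)               # source test case
        followup_output = f(math.pi - x)   # follow-up test case
        if abs(source_output - followup_output) > tol:
            failures.append(x)
    return failures

# A correct implementation satisfies the relation on all inputs...
print(metamorphic_test_sin(math.sin, [0.1, 0.5, 1.0, 2.0]))  # []
# ...while a faulty one is caught without knowing any exact sine value.
print(metamorphic_test_sin(lambda x: math.sin(x) + 0.01 * x, [0.5]))  # [0.5]
```

The same pattern generalizes to programs with no practical oracle (search engines, compilers, numerical solvers), which is precisely the setting metamorphic testing targets.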
This article presents a comprehensive survey on parallel I/O, an important field for High Performance Computing because of the historic gap between processing power and storage latencies, which causes application performance to be impaired when accessing or generating large amounts of data. As the available processing power and amount of data increase, I/O remains a central issue for the scientific community. In this survey, we present background concepts from which any reader can benefit. Moreover, through a comprehensive study of publications from the most important conferences and journals in a five-year time window, we discuss the state of the art of I/O optimization approaches, access pattern extraction techniques, and performance modeling, in addition to general aspects of parallel I/O research. Through this approach, we aim at identifying the general characteristics of the field and the main current and future research topics.
Robots are sophisticated machines that are susceptible to different types of faults, which have to be detected and diagnosed in time to allow recovery and continuous operation. The field of Fault Detection and Diagnosis (FDD) has been studied for many years. Yet, the study of FDD for robotics is relatively new, and only a few surveys have been presented. These surveys have focused on traditional FDD approaches and how they may broadly apply to a generic type of robot. However, robotic systems are identified by fundamental characteristics that pose different constraints and requirements on FDD. In this paper, we aim to provide the reader with useful insights regarding the FDD approaches that best suit the different characteristics of robotic systems. We elaborate on the advantages of these approaches and the challenges they must face, from two perspectives: (1) FDD from the perspective of the different characteristics of robotic systems, and (2) FDD from the perspective of the different approaches. Finally, we describe research opportunities. With these contributions, readers from both the FDD and the robotics research communities are introduced to this subject.
Designing an optimal distributed database is an extremely complex process due to many factors, such as the large number of relations, data transmission costs, the number of network sites, communication costs between sites, and query response time. For the sake of achieving an optimal design, fragmentation, replication and data allocation techniques are the key factors for delivering high performance and supporting data access and sharing at different sites. It is worth noting, however, that these techniques are often treated separately and rarely processed together. Some studies sought only optimal allocation methods, regardless of how the fragmentation technique is performed or the replication process is adopted. In contrast, others attempted to find the best fragmentation solution without considering how allocation would be performed. In this paper, most of the different fragmentation, replication and allocation techniques in the contemporary literature, for both centralized and distributed databases, are extensively and precisely scrutinized. Furthermore, some of these techniques are presented as case studies of well-analyzed fragmentation and allocation models. These cases are cited as evidence that a well-designed distributed database can yield a significant reduction in communication costs and response time, and a substantial performance boost over centralized systems for geographically distributed sites.
While cloud computing has brought paradigm shifts to computing services, researchers and developers have also found problems inherent to its nature, such as bandwidth bottlenecks, communication overhead, and location blindness. The concept of fog/edge computing has therefore been coined to extend services from the core in cloud data centers to the edge of the network. In recent years, many systems have been proposed to better serve ubiquitous smart devices closer to the user. This paper provides a complete and up-to-date review of edge-oriented computing systems by encapsulating relevant proposals on their architecture features, management approaches, and design objectives.
Despite the increasing use of social media for information and news gathering, its unmoderated nature often leads to the emergence and spread of rumours, i.e. unverified pieces of information. At the same time, the openness of social media provides opportunities to study how users share and discuss rumours, and to explore how natural language processing and data mining techniques may be used to find ways of determining their veracity. In this survey we introduce and discuss two types of rumours that circulate on social media; long-standing rumours that circulate for long periods of time, and newly-emerging rumours spawned during fast-paced events such as breaking news, where unverified reports are often released piecemeal. We provide an overview of research into social media rumours with the ultimate goal of developing a rumour classification system that consists of four components: rumour detection, rumour tracking, rumour stance classification and rumour veracity classification. We delve into the approaches presented in the scientific literature for the development of each of these components. We summarise the efforts and achievements so far towards the development of rumour classification systems and conclude with suggestions for avenues for future research in social media mining for detection and resolution of rumours.
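The four-component architecture described above (detection, tracking, stance classification, veracity classification) can be sketched as a toy pipeline; every heuristic below is an illustrative placeholder for what would, in practice, be a trained natural language processing model, and all cue words and labels are invented for the example.

```python
# Toy sketch of a four-component rumour classification pipeline:
# detection -> tracking -> stance classification -> veracity.
# The keyword heuristics are illustrative placeholders, not real models.

RUMOUR_CUES = {"reportedly", "unconfirmed", "claims"}
STANCE_CUES = {"true": "support", "false": "deny", "really?": "query"}

def detect(post):
    """Rumour detection: flag posts containing hedging cues."""
    return any(cue in post.lower() for cue in RUMOUR_CUES)

def track(posts, keyword):
    """Rumour tracking: collect posts discussing a detected rumour."""
    return [p for p in posts if keyword in p.lower()]

def classify_stance(post):
    """Stance classification: support / deny / query / comment."""
    for cue, stance in STANCE_CUES.items():
        if cue in post.lower():
            return stance
    return "comment"

def classify_veracity(stances):
    """Veracity classification: aggregate stances into a verdict."""
    support, deny = stances.count("support"), stances.count("deny")
    if support > deny:
        return "likely-true"
    if deny > support:
        return "likely-false"
    return "unverified"

posts = [
    "Reportedly the bridge has collapsed",
    "the bridge collapse is false, I just drove over it",
    "bridge collapse? really?",
    "saw photos, the bridge story is true",
]
rumours = [p for p in posts if detect(p)]            # detection
thread = track(posts, "bridge")                      # tracking
stances = [classify_stance(p) for p in thread]       # stance
print(classify_veracity(stances))  # "unverified" (1 support vs 1 deny)
```

The interesting research questions lie in replacing each placeholder with a learned model and in how uncertainty propagates from one stage to the next.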
Activity recognition aims to provide accurate and timely information on people's activities by leveraging the sensory data available in today's sensor-rich environments. Activity recognition has become an emerging field in the areas of pervasive and ubiquitous computing. A typical activity recognition technique processes data streams that originate from sensing platforms such as mobile sensors, on-body sensors, and/or ambient sensors. This paper surveys the two overlapping research areas of activity recognition and data stream mining. The perspective of this paper is to review the adaptation capabilities of activity recognition techniques in streaming environments. Broad categories of techniques are identified based on the different features of both data streams and activity recognition. The pros and cons of the algorithms in each category are analysed, and possible directions for future research are indicated.
In recent years, eye-tracking has been used by researchers in the field of programming education to analyse and understand tasks such as code comprehension, debugging, collaborative programming, traceability and the comprehension of non-code programming representations. Eye-trackers are used to gain more insight into the cognitive processes of programmers and programming techniques. In this paper, we perform a systematic literature review (SLR) of existing research using eye-tracking in computer programming. We identify, evaluate, and report on 65 studies published between 1990 and 2015. Participants in these studies were mainly students and faculty members, with Java and UML being the most commonly used programming language and representation. We also report on the range of eye-trackers and attention-tracking tools utilized in these studies and find that Tobii eye-trackers are the most preferred among researchers. In this SLR, we report the findings based on the materials, participant samples, and eye-tracking devices used in each experiment.
Online judges are systems designed for the reliable evaluation of algorithm source code submitted by users, which is then compiled and tested in a homogeneous environment. Online judges are becoming popular in various applications, and thus we review the state of the art of these systems. We classify them according to their principal objectives into systems that support the organization of competitive programming contests, enhance education and recruitment processes, facilitate the solving of data mining challenges, or serve as online compilers and development platforms integrated as components of other custom systems. Moreover, we present the Optil.io platform, which has been proposed for the solving of complex optimization problems. We also demonstrate the advantages of our system by analyzing the results of a competition conducted using the proposed platform. The competition proved that this platform, strengthened by crowdsourcing concepts, can be successfully applied to accurately and efficiently solve complex industrial- and science-driven challenges.
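The evaluation core of such a judge can be sketched as follows; a real system compiles and runs submissions in an isolated sandbox with time and memory limits, whereas this minimal model (all names and the example problem are hypothetical) only captures the verdict logic for Python submissions checked against hidden test cases.

```python
# Minimal sketch of an online judge's evaluation core (illustrative).
# Verdicts follow the usual convention:
#   AC = accepted, WA = wrong answer, RE = runtime error.

def judge(submission_source, test_cases, func_name="solve"):
    """Run a submitted solution against hidden (input, expected) pairs
    and return one verdict per test case."""
    namespace = {}
    exec(submission_source, namespace)   # stand-in for the compile step
    solution = namespace[func_name]
    verdicts = []
    for args, expected in test_cases:
        try:
            verdicts.append("AC" if solution(*args) == expected else "WA")
        except Exception:
            verdicts.append("RE")
    return verdicts

# Hidden tests for a hypothetical "sum of two numbers" problem.
tests = [((1, 2), 3), ((-5, 5), 0), ((10, 0), 10)]
print(judge("def solve(a, b):\n    return a + b", tests))  # ['AC', 'AC', 'AC']
print(judge("def solve(a, b):\n    return a - b", tests))  # ['WA', 'WA', 'AC']
```

A production judge would additionally enforce resource limits (yielding TLE/MLE verdicts) and isolate submissions from the host, which is exactly why the homogeneous, sandboxed environment mentioned above matters.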
With the proliferation of online services and mobile technologies, the world has stepped into a multimedia big data era. Much research has been done in the multimedia area, targeting different aspects of big data analytics, such as the capture, storage, indexing, mining, and retrieval of multimedia big data. However, very little of this research provides a complete survey of the whole pipeline of multimedia big data analytics, including the management and analysis of large amounts of data, the challenges and opportunities, and the promising research directions. To serve this purpose, we present a survey that provides a comprehensive overview of the state-of-the-art research on multimedia big data analytics. It also aims to bridge the gap between multimedia challenges and big data solutions by covering the current big data frameworks, their applications in multimedia analyses, the strengths and limitations of the existing methods, and the potential future directions in multimedia big data analytics. To the best of our knowledge, this is the first survey that targets the most recent multimedia management techniques for very large-scale data and also presents the research studies and technologies advancing multimedia analyses in this big data era.
We present a survey of multi-robot assembly applications and methods, and describe trends and general insights into the multi-robot assembly problem for industrial applications. We focus on fixtureless assembly strategies featuring two or more robotic systems. Such robotic systems include industrial robot arms, dexterous robotic hands, and autonomous mobile platforms, such as automated guided vehicles. In this survey, we identify the types of assemblies that are enabled by utilizing multiple robots, the algorithms that synchronize the motions of the robots to complete the assembly operations, and the metrics used to assess the quality and performance of the assemblies.
Crowdsourcing enables one to leverage the intelligence and wisdom of potentially large groups of individuals toward solving problems. Common problems approached with crowdsourcing are labeling images, translating or transcribing text, providing opinions or ideas, and similar tasks that computers are not good at or where they may even fail altogether. The introduction of humans into computations and/or everyday work, however, also poses critical, novel challenges in terms of quality control, as the crowd is typically composed of people with unknown and very diverse abilities, skills, interests, personal objectives and technological resources. This survey studies quality in the context of crowdsourcing along several dimensions, so as to define and characterize it and to understand the current state of the art. Specifically, this survey derives a quality model for crowdsourcing tasks, identifies the methods and techniques that can be used to assess the attributes of the model, and describes the actions and strategies that help prevent and mitigate quality problems. An analysis of how these features are supported by the state of the art further identifies open issues and informs an outlook on hot future research directions.
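Two of the quality-assurance strategies commonly discussed in this literature, gold questions for screening workers and majority voting for aggregating labels, can be sketched as follows; all task names, labels and thresholds are illustrative, not taken from any specific platform.

```python
# Sketch of two common crowdsourcing quality-control strategies:
# gold questions to screen workers, then majority voting to
# aggregate labels from the workers who passed the screen.
from collections import Counter

GOLD = {"task_gold": "cat"}  # tasks with known answers (hypothetical)

def reliable_workers(answers, min_accuracy=1.0):
    """Keep workers whose accuracy on gold tasks meets the threshold."""
    reliable = set()
    for worker, labels in answers.items():
        gold_hits = [labels[t] == truth
                     for t, truth in GOLD.items() if t in labels]
        accuracy = sum(gold_hits) / len(gold_hits) if gold_hits else 0.0
        if accuracy >= min_accuracy:
            reliable.add(worker)
    return reliable

def majority_vote(answers, task, workers):
    """Aggregate labels from the selected workers for one task."""
    votes = [answers[w][task] for w in workers if task in answers[w]]
    return Counter(votes).most_common(1)[0][0]

# Three workers label a gold task and a real task; w3 fails the screen.
answers = {
    "w1": {"task_gold": "cat", "task_1": "dog"},
    "w2": {"task_gold": "cat", "task_1": "dog"},
    "w3": {"task_gold": "bird", "task_1": "cat"},
}
good = reliable_workers(answers)
print(majority_vote(answers, "task_1", good))  # dog
```

More sophisticated aggregation schemes weight votes by estimated worker ability, but the screen-then-aggregate structure is the common baseline against which they are compared.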
Network management and maintenance are time-consuming and often challenging tasks. With the emergent Software-Defined Networking (SDN) paradigm, most of the focus is directed to the evolution of control protocols and platforms, or to deployment problems. Although researchers and network operators consider network management a primary requirement, its development in SDN has apparently been set aside. This paper reports on the SDN architecture, introduces the concept of SDN tools and surveys the state of the art in different aspects of network management with an emphasis on SDN. Because the SDN ecosystem lacks a standardized management framework, initiatives are disparate and scattered.
Context: Recent years have seen growing interest in open-ended interactive tools such as games. One of the most crucial factors in developing games is to model and predict individual behavior. Although model-based approaches have been considered the standard way for this purpose, their application is often extremely difficult due to the huge space of actions that can be created by games. For this reason, data-driven approaches have shown promise, in part because they are not completely reliant on expert knowledge. Objective: This study seeks to systematically review the existing research on the use of data-driven approaches in game player modeling. Method: We carefully surveyed a nine-year sample (2008-2016) of experimental studies conducted on data-driven approaches in game player modeling and thereby found 36 studies that addressed four primary research questions. We analyzed and classified the questions, methods, and findings of these published works, which we evaluated and drew conclusions from based on non-statistical methods. Results: We found that there are three primary avenues in which data-driven approaches have been studied in games research. In conclusion, we highlight critical future challenges in the area and offer directions for future study.
Modern cloud environments support a relatively high degree of automation in service provisioning, which allows cloud users to dynamically acquire the services required for deploying cloud applications. Cloud modeling languages (CMLs) have been proposed to address the diversity of features provided by today's cloud environments and to support different application scenarios, e.g., migrating existing applications to the cloud, developing new cloud applications, or optimizing them. There is, however, still much debate on what a CML is and what aspects of a cloud application and the target cloud environment should be modeled by a CML. Furthermore, the distinction between CMLs on a fine-grained level exposing their modeling concepts is rarely made. In this article, we investigate the diverse features currently provided by existing CMLs. We classify and compare them according to a common framework with the goal of supporting cloud users in selecting the CML which fits the needs of their application scenario and setting. As a result, we point out not only the features for which existing CMLs already provide extensive support, but also those in which existing CMLs are deficient, thereby suggesting a research agenda for the future.