- Siqi Ji; Baochun Li;
Big data analytics, the process of organizing and analyzing data to extract useful information, is one of the primary uses of cloud services today. Traditionally, collections of data are stored and processed in a single datacenter. As the volume of data grows at a tremendous rate, it becomes inefficient, from a performance point of view, for a single datacenter to handle such large volumes of data. Large cloud service providers are therefore deploying datacenters around the world for better performance and availability. A widely used approach to analyzing geo-distributed data is the centralized approach, which aggregates all the raw data from local datacenters into a central datacenter. However, it has been observed that this approach consumes a significant amount of bandwidth, which degrades performance. A number of mechanisms have been proposed to achieve optimal performance when data analytics is performed over geo-distributed datacenters. In this paper, we present a survey of representative mechanisms proposed in the literature for wide-area analytics. We discuss basic ideas, present proposed architectures and mechanisms, and discuss several examples of existing work. We point out the limitations of these mechanisms, give comparisons, and conclude with our thoughts on future research directions.
2016, Vol. 21, No. 2, pp. 125-135
- Yinjun Wu; Zhen Chen; Yuhao Wen; Wenxun Zheng; Junwei Cao;
Bitmap indexing has been widely used in various applications due to its speed in bitwise operations. However, it can consume large amounts of memory. To solve this problem, various bitmap coding algorithms have been proposed. In this paper, we present COMbining Binary And Ternary encoding (COMBAT), a new bitmap index coding algorithm. Typical algorithms derived from Word Aligned Hybrid (WAH) are COMPressed Adaptive indeX (COMPAX) and Compressed 'n' Composable Integer Set (CONCISE), which can combine either two or three continuous words after WAH encoding. COMBAT combines both mechanisms and produces more compact bitmap indexes. Moreover, the querying time of COMBAT can be shorter than that of COMPAX and CONCISE, since the bitmap indexes are smaller and take less time to load into memory. To prove the advantages of COMBAT, we extend a theoretical analysis model proposed by our group, which covers the analysis of various possible bitmap indexes. Experimental results based on real data are also provided, showing COMBAT's superiority in both storage and speed. Our results demonstrate the advantages of COMBAT, and codeword statistics are provided to solidify the proof.
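The word-aligned encoding family this abstract builds on can be pictured with a small sketch. The following is a simplified model of WAH-style compression only (not COMBAT, COMPAX, or CONCISE themselves, and it uses Python lists of bits rather than packed 32-bit words): homogeneous 31-bit groups collapse into run-length "fill" tokens, while mixed groups remain "literal" tokens.

```python
GROUP = 31  # WAH keeps 31 payload bits per 32-bit word

def wah_encode(bits):
    """Compress a 0/1 bit list (length assumed a multiple of GROUP)
    into a list of ("fill", (value, run_length)) or ("literal", group)
    tokens, merging adjacent fills of the same bit value."""
    tokens = []
    for i in range(0, len(bits), GROUP):
        group = tuple(bits[i:i + GROUP])
        if len(set(group)) == 1:  # homogeneous group -> fill token
            if tokens and tokens[-1][0] == "fill" and tokens[-1][1][0] == group[0]:
                value, runs = tokens[-1][1]
                tokens[-1] = ("fill", (value, runs + 1))  # extend the run
            else:
                tokens.append(("fill", (group[0], 1)))
        else:  # mixed group -> stored verbatim
            tokens.append(("literal", group))
    return tokens

def wah_decode(tokens):
    """Inverse of wah_encode: expand tokens back into the bit list."""
    bits = []
    for kind, payload in tokens:
        if kind == "fill":
            value, runs = payload
            bits.extend([value] * (GROUP * runs))
        else:
            bits.extend(payload)
    return bits
```

The compactness gain comes entirely from the fill tokens: three all-zero groups cost one token instead of three words, which is the effect COMPAX, CONCISE, and COMBAT push further by combining multiple words per codeword.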
2016, Vol. 21, No. 2, pp. 136-145
- Wei Dai; Yufeng Wang; Qun Jin; Jianhua Ma;
Currently, mobile devices (e.g., smartphones) are equipped with multiple wireless interfaces and rich built-in functional sensors that possess powerful computation and communication capabilities, enabling numerous Mobile Crowdsourced Sensing (MCS) applications. Generally, an MCS system is composed of three components: a publisher of sensing tasks, crowd participants who complete the crowdsourced tasks for some kind of reward, and the crowdsourcing platform that facilitates the interaction between publishers and crowd participants. Incentives are a fundamental issue in MCS. This paper proposes an integrated incentive framework for MCS, which appropriately combines three widely used incentive methods: reverse auction, gamification, and reputation updating. Firstly, a reverse-auction-based two-round participant selection mechanism is proposed to incentivize crowds to participate actively and provide high-quality sensing data. Secondly, in order to avoid untruthful publisher feedback about sensing-data quality, a gamification-based verification mechanism is designed to evaluate the truthfulness of the publisher's feedback. Finally, the platform updates the reputations of both participants and publishers based on their corresponding behaviors. This integrated incentive mechanism can motivate participants to provide high-quality sensed content, stimulate publishers to give truthful feedback, and make the platform profitable.
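As a toy illustration of the reverse-auction idea only (not the paper's two-round mechanism, which also accounts for sensing-data quality), a platform can greedily select the cheapest asking prices that fit its budget. The function name and bid values below are hypothetical.

```python
def select_participants(bids, budget):
    """Greedy reverse auction: pick the cheapest asks that fit the budget.

    bids: {participant_name: asking_price}; returns (winners, total_cost).
    """
    winners, spent = [], 0
    # In a reverse auction the *sellers* (participants) bid, so the
    # buyer (platform) prefers the lowest asks first.
    for name, price in sorted(bids.items(), key=lambda kv: kv[1]):
        if spent + price <= budget:
            winners.append(name)
            spent += price
    return winners, spent

# Hypothetical asks from four would-be participants, with a budget of 8.
winners, cost = select_participants(
    {"alice": 3, "bob": 5, "carol": 2, "dave": 9}, budget=8)
```

A production mechanism would additionally need a payment rule (e.g., paying winners a losing bid rather than their own ask) to keep bidding truthful, which is the kind of property the paper's auction design addresses.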
2016, Vol. 21, No. 2, pp. 146-156
- Zhiyao Hu; Xiaoqiang Teng; Deke Guo; Bangbang Ren; Pin Lv; Zhong Liu;
Set reconciliation between two nodes is widely used in network applications. The basic idea is that each member of a node pair has an object set and seeks to deliver its unique objects to the other member. The Standard Bloom Filter (SBF) and its variants, such as the Invertible Bloom Filter (IBF), are effective approaches to solving the set reconciliation problem. The SBF-based method requires each node to represent its objects using an SBF, which is exchanged with the other node. A receiving node queries the received SBF against its local objects to identify its unique objects. Finally, each node exchanges its unique objects with the other node in the pair. In the IBF-based method, each node represents its objects using an IBF, which is then exchanged. A receiving node subtracts the received IBF from its local IBF so as to decode the differences between the two sets. Intuitively, it would seem that the IBF-based method, with only one round of communication, entails less communication overhead than the SBF-based method, which incurs two rounds of communication. Our research results, however, indicate that neither method has an absolute advantage over the other. In this paper, we aim to provide an in-depth understanding of the two methods by evaluating and comparing their communication overhead. We find that the best method depends on the parameter settings. We demonstrate that the SBF-based method outperforms the IBF-based method in most cases, but when the number of different objects in the two sets is below a certain threshold, the IBF-based method outperforms the SBF-based method.
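The SBF-based round described above can be sketched as follows. This is a minimal illustrative Bloom filter (the filter size m and hash count k are chosen arbitrarily), not the exact construction the paper evaluates. Note the asymmetry of the errors: a Bloom filter never produces false negatives, so shared objects are never mistaken for unique ones, but a false positive can hide a truly unique object.

```python
import hashlib

class BloomFilter:
    """Minimal Standard Bloom Filter over hashable items."""

    def __init__(self, m=256, k=3):
        self.m, self.k = m, k
        self.bits = [0] * m

    def _positions(self, item):
        # Derive k bit positions from salted SHA-256 digests.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def might_contain(self, item):
        return all(self.bits[p] for p in self._positions(item))

def unique_objects(local_set, remote_filter):
    """Objects the remote filter does not report are certainly unique;
    false positives may hide some truly unique objects."""
    return {x for x in local_set if not remote_filter.might_contain(x)}

# Node B summarizes its set and sends the filter to node A.
bf_b = BloomFilter()
for obj in ("b", "c", "d"):
    bf_b.add(obj)
# Node A queries its own objects against B's filter.
a_unique = unique_objects({"a", "b", "c"}, bf_b)
```

After this first round, A sends its identified unique objects (here, "a") to B, and B does the symmetric query against A's filter, giving the two-round exchange the abstract compares against the one-round IBF subtraction.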
2016, Vol. 21, No. 2, pp. 157-167
- Xiang Zhang; Wenyao Cheng;
Link patterns are consensus practices characterizing how different types of objects are typically interlinked in linked data. Mining link patterns in large-scale linked data has been inefficient due to the computational complexity of mining algorithms and memory limitations. To improve scalability, partitioning strategies for pattern mining have been proposed, but the efficiency and completeness of their mining results are still under discussion. In this paper, we propose a novel partitioning strategy for mining link patterns in large-scale linked data, in which the data is partitioned according to edge-labeling rules: edges are grouped into a primary multi-partition according to their labels, and a feedback mechanism produces a secondary bi-partition based on a quick mining pass. Link patterns discovered locally in each partition are then merged into global patterns. Experiments show that our partitioning strategy is feasible and efficient.
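The primary multi-partition step, grouping edges by their labels, can be sketched as below. The triple representation and function name are illustrative assumptions, not the paper's implementation; the secondary bi-partition and the local-to-global pattern merge are omitted.

```python
from collections import defaultdict

def primary_partition(edges):
    """Group labeled edges (subject, label, object) by their edge label,
    so each partition can be mined for link patterns independently."""
    parts = defaultdict(list)
    for s, label, o in edges:
        parts[label].append((s, label, o))
    return dict(parts)

# A tiny hypothetical linked-data graph.
edges = [("a", "knows", "b"), ("a", "worksAt", "c"), ("b", "knows", "d")]
parts = primary_partition(edges)
```

Because every edge lands in exactly one partition, no edge is lost before local mining, which is why the completeness of the merged global patterns hinges on the later merge step rather than on this grouping.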
2016, Vol. 21, No. 2, pp. 168-175
- Benika Hall; Andrew Quitadamo; Xinghua Shi;
Integrative network analysis is powerful in helping to understand the underlying mechanisms of genetic and epigenetic perturbations in disease studies. Although it has become clear that microRNAs, one type of epigenetic factor, have a direct effect on their target genes, it is unclear how microRNAs perturb their downstream genetic neighborhoods. Hence, we propose a network community approach that integrates microRNA and gene expression profiles to construct an integrative genetic network perturbed by microRNAs. We apply this approach to an ovarian cancer dataset from The Cancer Genome Atlas project to identify fluctuations in microRNA expression and their effects on gene expression. First, we perform expression quantitative trait loci analysis between the microRNA and gene expression profiles via both a classical regression framework and a sparse learning model. Then, we apply the spin-glass community detection algorithm to find the genetic neighborhoods of the microRNAs and their associated genes. Finally, we construct an integrated network between microRNA and gene expression based on their community structure. Various disease-related microRNAs and genes, particularly those related to ovarian cancer, are identified in this network. Such an integrative network allows us to investigate the genetic neighborhood affected by microRNA expression that may lead to disease manifestation and progression.
2016, Vol. 21, No. 2, pp. 176-195
- Chang Chen; Xiaohe Hu; Kai Zheng; Xiang Wang; Yang Xiang; Jun Li;
Most types of Software-Defined Networking (SDN) architectures employ reactive rule dispatching to enhance real-time network control. The rule dispatcher, one of the key components of the network controller, generates and dispatches cache rules in response to the packet-in messages from the forwarding devices. It is important not only for ensuring semantic integrity between the control plane and the data plane, but also for preserving the performance and efficiency of the forwarding devices. In theory, generating the optimal cache rules on demand is a knotty problem due to its high theoretical complexity. In practice, however, the characteristics of real-life traffic and rule sets demonstrate that temporal and spatial localities can be leveraged by the rule dispatcher to significantly reduce the computational overhead. In this paper, we take a deep dive into the reactive rule dispatching problem through modeling and complexity analysis, and then propose a set of algorithms named Hierarchy-Based Dispatching (HBD), which exploits the nesting hierarchy of rules to simplify the theoretical model of the problem, trading strict coverage optimality for a more practical but still superior rule generation result. Experimental results show that HBD achieves performance gains in terms of rule cache capability and rule storage efficiency over existing approaches.
2016, Vol. 21, No. 2, pp. 196-209
- Yanting Ren; Liji Wu; Hexin Li; Xiangyu Li; Xiangmin Zhang; An Wang; Hongyi Chen;
The security of CPU smart cards, which are widely used throughout China, is currently being threatened by side-channel analysis. Typical countermeasures against side-channel analysis involve adding noise to and filtering the power consumption signal. In this paper, we integrate appropriate preprocessing methods with an improved attack strategy to generate a key recovery solution that exploits the shortcomings of these countermeasures. Our proposed attack strategy improves the attack result by combining information leaked from two adjacent clock cycles. Using our laboratory-based power analysis system, we verified the proposed key recovery solution by performing a successful correlation power analysis on a Triple Data Encryption Standard (3DES) hardware module in a real-life 32-bit CPU smart card. All 112 key bits of the 3DES were recovered with about 80 000 power traces.
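Correlation power analysis in general ranks key guesses by how strongly a predicted leakage correlates with the measured power traces. The sketch below is a toy single-byte demonstration under a hypothetical Hamming-weight XOR leakage model with simulated traces; the paper's actual attack targets a 3DES hardware module and combines leakage from two adjacent clock cycles, which this sketch does not model.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def hamming_weight(v):
    return bin(v).count("1")

# Simulated side channel: power at the attacked cycle is modeled as the
# Hamming weight of (plaintext XOR key byte) plus Gaussian noise.
rng = random.Random(0)
SECRET_KEY = 0x3A
plaintexts = [rng.randrange(256) for _ in range(500)]
traces = [hamming_weight(p ^ SECRET_KEY) + rng.gauss(0, 0.3)
          for p in plaintexts]

def cpa_recover(plaintexts, traces):
    """Return the key-byte guess whose predicted leakage correlates
    best with the measured traces."""
    return max(
        range(256),
        key=lambda k: pearson([hamming_weight(p ^ k) for p in plaintexts],
                              traces))
```

Countermeasures that add noise to the power signal force the attacker to collect more traces to average the noise out, which is why the preprocessing and the two-cycle combination in the paper matter for a real card.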
2016, Vol. 21, No. 2, pp. 210-220
- Jinghuan Wen; Huimin Ma; Xiaoqin Zhang;
Interference and anti-interference are two opposite and important issues in visual tracking. Occlusion interference can disguise the features of a target, and can also be used as an effective benchmark to determine whether a tracking algorithm is reliable. In this paper, we propose an inner Particle Swarm Optimization (PSO) algorithm to locate the optimal occlusion strategy under different tracking conditions, identifying the most effective occlusion positions and direction of movement that allow a target to evade tracking. This algorithm improves the standard PSO process in three ways. First, it introduces a death process, which greatly reduces the time cost of optimization. Second, it uses statistical data to determine the fitness value of the particles, so that the fitness describes the tracking more accurately. Third, the algorithm avoids being trapped in local optima, as the fitness changes with time. Experimental results show that this algorithm is able to identify a global optimal occlusion strategy that can disturb the tracking machine with 86.8% probability over more than 10 000 tracking processes. In addition, it reduces the time cost by approximately 80% compared with conventional PSO algorithms.
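For reference, the standard PSO loop that the proposed inner PSO improves upon looks roughly like this. The inertia and acceleration coefficients below are common textbook defaults, not the paper's settings, and the paper's additions (the death process and the time-varying, statistics-based fitness) are deliberately omitted; the toy fitness here is just a function to minimize, standing in for the tracking-disturbance objective.

```python
import random

def pso_minimize(fitness, dim=2, swarm=20, iters=100, seed=1):
    """Plain PSO: each particle tracks its personal best, and the swarm
    shares a global best; velocities blend inertia with pulls toward both."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(swarm)]
    vel = [[0.0] * dim for _ in range(swarm)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = min(range(swarm), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive, social coefficients
    for _ in range(iters):
        for i in range(swarm):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

Because every particle is evaluated at every iteration, the per-run cost scales with swarm size times iteration count; culling unpromising particles early, as the paper's death process does, attacks exactly this term.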
2016, Vol. 21, No. 2, pp. 221-230
- Jiangzhou He; Wenguang Chen; Zhizhong Tang;
Structure Data Layout Optimization (SDLO) is a prevailing compiler optimization technique for improving cache efficiency, and structure transformation is a critical step in SDLO. The diversity of transformation methods and the existence of complex data types are the major challenges for structure transformation. We have designed and implemented STrans, a well-defined system that provides controllable and comprehensive functionality for structure transformation. Compared to known systems, it places fewer limitations on the data types that can be transformed. In this paper, we give a formal definition of the approach STrans uses to transform data types. We have also designed the Transformation Specification Language, a mini-language for configuring how structures are transformed, which can be either manually tuned or generated by a compiler. STrans supports three kinds of transformation methods, i.e., splitting, peeling, and pool-splitting, and works well on different combinations of compound data types. STrans is the transformation system used in ASLOP and has been thoroughly tested on all ASLOP benchmarks.
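The splitting idea can be pictured in miniature: an array-of-structures becomes one array per field, so a traversal that reads a single field stops dragging the other fields through the cache. The Python sketch below is only an analogy under that assumption; STrans itself transforms C-like structure types at compile time, and the record layout and function name here are hypothetical.

```python
def split_fields(records):
    """Structure splitting, array-of-structures style: turn a list of
    (x, y, payload) records into three parallel field arrays."""
    if not records:
        return [], [], []
    xs, ys, payloads = map(list, zip(*records))
    return xs, ys, payloads

# Original interleaved layout: every record carries all three fields.
records = [(0, 0, "p0"), (1, 2, "p1"), (2, 4, "p2"), (3, 6, "p3")]

xs, ys, payloads = split_fields(records)
# A scan over xs alone now touches a dense array of just that field,
# which is the cache-locality win SDLO is after.
total_x = sum(xs)
```

Peeling and pool-splitting generalize this choice: rather than separating every field, selected hot fields are peeled out while the rest stay grouped, trading layout complexity against locality.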
2016, Vol. 21, No. 2, pp. 231-240