Tsinghua Science and Technology

SPECIAL SECTION ON CLOUD COMPUTING AND BIG DATA

  • Efficient Location-Aware Data Placement for Data-Intensive Applications in Geo-distributed Scientific Data Centers

    Jinghui Zhang;Jian Chen;Junzhou Luo;Aibo Song;

    Recent developments in cloud computing and big data have spurred the emergence of data-intensive applications for which massive scientific datasets are stored in globally distributed scientific data centers and accessed frequently by scientists worldwide. Multiple associated data items distributed across different scientific data centers may be requested for one data processing task, and data placement decisions must respect the storage capacity limits of the scientific data centers. Therefore, optimizing the data access cost when placing data items in globally distributed scientific data centers has become an increasingly important goal. Existing data placement approaches for geo-distributed data items are insufficient because they either cannot cope with the cost incurred by associated data access or overlook storage capacity limitations, which are a very practical constraint of scientific data centers. In this paper, inspired by applications in the field of high energy physics, we propose an integer-programming-based data placement model that addresses the above challenges as a Non-deterministic Polynomial-time (NP)-hard problem. In addition, we use a Lagrangian-relaxation-based heuristic algorithm to obtain ideal data placement solutions. Our simulation results demonstrate that our algorithm is effective and significantly reduces overall data access cost.

    2016, Issue 05, v.21, pp. 471-481
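
The capacity-constrained placement problem described in the abstract above can be illustrated with a toy instance. The sketch below is not the paper's integer program or its Lagrangian relaxation heuristic: it is an exhaustive search over a tiny instance, and it assumes a simplified cost model (request frequency × item size × distance) that ignores the cost of associated data access. All names (`placement_cost`, `feasible`, `best_placement`) and the cost model are assumptions made for illustration.

```python
from itertools import product


def placement_cost(assign, sizes, freq, dist):
    """Total access cost when item i is stored at data center assign[i].

    Assumed cost model: freq[s][i] requests from site s for item i,
    each costing sizes[i] * dist[s][assign[i]].
    """
    return sum(
        freq[s][i] * sizes[i] * dist[s][assign[i]]
        for s in range(len(freq))
        for i in range(len(sizes))
    )


def feasible(assign, sizes, capacity):
    """Check that no data center stores more than its capacity."""
    used = [0] * len(capacity)
    for item, center in enumerate(assign):
        used[center] += sizes[item]
    return all(u <= cap for u, cap in zip(used, capacity))


def best_placement(sizes, capacity, freq, dist):
    """Exhaustively search all placements; only viable for tiny instances."""
    best = None
    for assign in product(range(len(capacity)), repeat=len(sizes)):
        if not feasible(assign, sizes, capacity):
            continue
        cost = placement_cost(assign, sizes, freq, dist)
        if best is None or cost < best[0]:
            best = (cost, assign)
    return best


# Hypothetical instance: 3 items, 2 data centers that also act as request sites.
sizes = [2, 2, 1]
capacity = [3, 3]
freq = [[5, 1, 2], [0, 4, 1]]   # freq[site][item]
dist = [[0, 10], [10, 0]]       # dist[site][center]
print(best_placement(sizes, capacity, freq, dist))
```

Exhaustive search grows exponentially with the number of items, which is why a heuristic is needed for realistic instances of an NP-hard problem like this one.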
  • A Pricing Model for Big Personal Data

    Yuncheng Shen;Bing Guo;Yan Shen;Xuliang Duan;Xiangqian Dong;Hong Zhang;

    Big Personal Data is growing explosively. Consequently, an increasing number of internet users are drowning in a sea of data. Big Personal Data has enormous commercial value; it is a new kind of data asset. An urgent problem has thus arisen in the data market: how to price Big Personal Data fairly and reasonably. This paper proposes a pricing model for Big Personal Data based on tuple granularity, informed by a comparative analysis of existing data pricing models and strategies. The model implements positive rating and reverse pricing for Big Personal Data by investigating the data attributes that affect data value and analyzing how the value of data tuples varies with information entropy, weight value, data reference index, cost, and other factors. The model can be adjusted dynamically according to these parameters. With increases in data scale, reductions in its cost, and improvements in its quality, Big Personal Data users can thereby obtain greater benefits.

    2016, Issue 05, v.21, pp. 482-490
  • SED: An SDN-Based Explicit-Deadline-Aware TCP for Cloud Data Center Networks

    Yifei Lu;

    Cloud data centers now provide a plethora of rich online applications such as web search, social networking, and cloud computing. A key challenge for such applications, however, is to meet soft real-time constraints. Because congestion control in the Transmission Control Protocol (TCP) is deadline-agnostic, many deadline-sensitive flows cannot finish transmission before their deadlines. In this paper, we propose an SDN-based Explicit-Deadline-aware TCP (SED) for cloud Data Center Networks (DCN). SED first assigns a base rate to non-deadline flows and then gives as much spare bandwidth as possible to the deadline flows. Subsequently, a Retransmission-enhanced SED (RSED) is introduced to solve the packet-loss timeout problem. Through our experiments, we show that SED can make flows meet deadlines effectively, and that it significantly outperforms previous protocols in the cloud data center environment.

    2016, Issue 05, v.21, pp. 491-499
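
The allocation order described in the SED abstract above (base rate for non-deadline flows first, spare bandwidth to deadline flows) can be sketched as follows. This is a minimal illustration, not the SED protocol itself: the function name `allocate_rates`, the earliest-deadline-first ordering, and the "remaining bytes divided by time to deadline" demand are all assumptions that the abstract does not specify.

```python
def allocate_rates(capacity, non_deadline_flows, deadline_flows, base_rate):
    """Split link capacity between non-deadline and deadline flows.

    capacity           -- link capacity (bytes per second)
    non_deadline_flows -- list of flow ids
    deadline_flows     -- list of (flow id, remaining bytes, seconds to deadline)
    base_rate          -- rate reserved for each non-deadline flow
    """
    # Step 1: every non-deadline flow gets the base rate first.
    rates = {fid: base_rate for fid in non_deadline_flows}
    spare = capacity - base_rate * len(non_deadline_flows)
    if spare < 0:
        raise ValueError("base rates exceed link capacity")

    # Step 2: hand the spare bandwidth to deadline flows.
    # Assumption: serve the most urgent deadline first.
    for fid, remaining, time_left in sorted(deadline_flows, key=lambda f: f[2]):
        needed = remaining / time_left      # rate required to finish on time
        granted = min(needed, spare)
        rates[fid] = granted
        spare -= granted
    return rates


rates = allocate_rates(
    capacity=100,
    non_deadline_flows=["a", "b"],
    deadline_flows=[("d1", 120, 2.0), ("d2", 90, 3.0)],
    base_rate=10,
)
print(rates)
```

In this hypothetical example the second deadline flow receives less than it needs, which is the kind of case where a real controller would have to decide whether to admit, delay, or drop the flow.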
  • PCA-Based Network Traffic Anomaly Detection

    Meimei Ding;Hui Tian;

    The use of a Traffic Matrix (TM) to describe the characteristics of a global network has attracted significant interest in network performance research. Due to the high dimensionality and sparsity of network traffic, Principal Component Analysis (PCA) has been successfully applied to TM analysis. PCA is one of the most common methods used in the analysis of high-dimensional objects. This paper shows how to apply PCA to TM analysis and anomaly detection. The experimental results demonstrate that the PCA-based method can detect anomalies for both single and multiple nodes with high accuracy and efficiency.

    2016, Issue 05, v.21, pp. 500-509
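
A common way to apply PCA to a traffic matrix for anomaly detection is the residual-subspace approach: project each time bin onto the top principal components and flag bins whose residual energy is unusually large. The sketch below illustrates that general idea with NumPy; it is not necessarily the procedure used in the paper above, and the function names, the number of components `k`, and the mean-plus-three-sigma threshold are assumptions.

```python
import numpy as np


def spe_scores(tm, k):
    """Squared prediction error of each time bin after removing k components.

    tm -- array of shape (time bins, flows), one row per measurement interval
    """
    centered = tm - tm.mean(axis=0)
    # Rows of vt are the principal directions in flow space.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[:k].T                               # (flows, k) basis
    residual = centered - centered @ normal @ normal.T
    return np.sum(residual ** 2, axis=1)


def detect(tm, k, n_sigma=3.0):
    """Return indices of time bins whose residual energy exceeds the threshold."""
    spe = spe_scores(tm, k)
    threshold = spe.mean() + n_sigma * spe.std()    # assumed simple threshold
    return np.flatnonzero(spe > threshold)


# Synthetic traffic: one shared periodic pattern plus noise and one injected spike.
rng = np.random.default_rng(0)
t = np.arange(200)
pattern = 10.0 * np.sin(2 * np.pi * t / 50.0)
tm = np.outer(pattern, np.linspace(1.0, 2.0, 10)) + 0.1 * rng.standard_normal((200, 10))
tm[120, 3] += 5.0
print(detect(tm, k=1))
```

The choice of `k` matters: too few components leave normal traffic in the residual, while too many can absorb the anomaly into the "normal" subspace.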
  • Multiple Routes Recommendation System on Massive Taxi Trajectories

    Yaobin He;Fan Zhang;Ye Li;Jun Huang;Ling Yin;Chengzhong Xu;

    This paper presents a cloud-based multiple-route recommendation system, xGo, that enables smartphone users to choose suitable routes based on knowledge discovered in real taxi trajectories. In modern cities, GPS-equipped taxicabs report their locations regularly, which generates a huge volume of trajectory data every day. The optimized routes can be learned by mining these massive repositories of spatio-temporal information. We propose a system that can store and manage GPS log files in a cloud-based platform, probe traffic conditions, take advantage of taxi driver route-selection intelligence, and recommend an optimal path or multiple candidates to meet customized requirements. Specifically, we leverage a Hadoop-based distributed route clustering algorithm to distinguish different routes and predict traffic conditions through the latent traffic rhythm. We evaluate our system using a real-world dataset (>100 GB) generated by about 20 000 taxis over a 2-month period in Shenzhen, China. Our experiments reveal that our service can provide appropriate routes in real time and estimate traffic conditions accurately.

    2016, Issue 05, v.21, pp. 510-520

REGULAR ARTICLES

  • Accurate Indoor Navigation System Using Human-Item Spatial Relation

    Qiongzheng Lin;Yi Guo;

    Indoor navigation has received much attention from both industry and academia in recent years. To locate users, a number of existing methods use various localization algorithms in combination with an indoor map, which requires expensive infrastructure to be deployed in advance. In this study, we propose the use of existing indoor objects with attached RFID tags and a reader to navigate users to their destinations, without the need for any additional hardware. The key insight behind our proposal is that a person's movement affects the frequency shift values collected from indoor objects when the person nears a tag. We leverage this local human-item spatial relation to infer the user's position and then navigate the user to the desired destination step by step. We implement a prototype navigation system, called Roll Caller, and conduct a comprehensive range of experiments to examine its performance.

    2016, Issue 05, v.21, pp. 521-537
  • Analysis of Outage Capacity of NOMA: SIC vs. JD

    Shuang Chen;Kewu Peng;Huangpin Jin;Jian Song;

    In fifth-generation wireless communication networks, Non-Orthogonal Multiple Access (NOMA) has attracted much attention in both academic and industrial fields because of its higher spectral efficiency in comparison with orthogonal multiple access. Recently, numerous uplink NOMA techniques have been proposed, some of which are based on Successive Interference Cancellation (SIC) and others on Joint Decoding (JD, or simultaneous decoding). In this study, we analyze, from the perspective of information theory, the outage capacities of SIC and JD in the case of single-block transmission over a two-user Gaussian multiple-access channel with partial channel state information at the transmitter. Analytical and numerical results show that, compared to SIC, JD can achieve a sum-rate gain of up to 10% or a sum-power gain of 0.8 dB.

    2016, Issue 05, v.21, pp. 538-543
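
The difference between SIC and JD in the abstract above can be seen from the textbook rate conditions for a two-user Gaussian multiple-access channel. The sketch below checks whether a rate pair is decodable for given received SNRs `g1` and `g2` (linear scale). It uses the standard capacity-region inequalities only; the paper's outage analysis with partial channel state information is not reproduced, and the function names are assumptions.

```python
from math import log2


def jd_decodable(r1, r2, g1, g2):
    """Joint decoding succeeds if the rate pair lies inside the MAC capacity region."""
    return (
        r1 <= log2(1 + g1)
        and r2 <= log2(1 + g2)
        and r1 + r2 <= log2(1 + g1 + g2)
    )


def sic_decodable(r1, r2, g1, g2):
    """SIC succeeds if some decoding order works.

    The user decoded first sees the other user as noise; the second user
    is decoded interference-free after cancellation.
    """
    user1_first = r1 <= log2(1 + g1 / (1 + g2)) and r2 <= log2(1 + g2)
    user2_first = r2 <= log2(1 + g2 / (1 + g1)) and r1 <= log2(1 + g1)
    return user1_first or user2_first


# Equal rates near the sum-rate boundary: inside the JD region,
# but away from both SIC corner points.
print(jd_decodable(0.79, 0.79, 1.0, 1.0), sic_decodable(0.79, 0.79, 1.0, 1.0))
```

Without rate splitting or time sharing, SIC reaches only the region dominated by the two corner points, so rate pairs along the middle of the sum-rate boundary cause an outage for SIC but not for JD.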
  • Observer Design Based on Self-Recurrent Consequent-Part Fuzzy Wavelet Neural Network

    Xin Wen;Xin Li;

    In this paper, we propose and construct an observer design based on a Self-Recurrent Consequent-Part Fuzzy Wavelet Neural Network (SRCPFWNN) for a class of nonlinear systems. We use a Self-Recurrent Wavelet Neural Network (SRWNN) to construct a self-recurrent consequent part for each rule of the Takagi-Sugeno-Kang (TSK) model in the SRCPFWNN and analyze the structure of the fuzzy wavelet neural network model. Based on the Direct Adaptive Control Theory (DACT) and a back-propagation-based learning algorithm, all parameters of the consequent parts are updated online in the SRCPFWNN. On this basis, we propose a design method using an adaptive state observer based on an SRCPFWNN for nonlinear systems. Using the Lyapunov function, we then prove the stability of this observer design method. Our simulation results confirm that the observer can accurately and quickly estimate the state values of the system.

    2016, Issue 05, v.21, pp. 544-551
  • Fast Remote-Sensing Image Registration Using Priori Information and Robust Feature Extraction

    Xijia Liu;Xiaoming Tao;Ning Ge;

    In this paper, we propose a fast registration scheme for remote-sensing images for use as a fundamental technique in large-scale online remote-sensing data processing tasks. First, we introduce priori-information images, and use machine learning techniques to identify robust remote-sensing image features from state-of-the-art Scale-Invariant Feature Transform (SIFT) features. Next, we apply a hierarchical coarse-to-fine feature matching and image registration scheme on the basis of additional priori information, including a robust feature location map and platform imaging parameters. Numerical simulation results show that the proposed scheme increases position repetitiveness by 34%, and can speed up the overall image registration procedure by a factor of 7.47 while maintaining the accuracy of the image registration performance.

    2016, Issue 05, v.21, pp. 552-560
  • Probabilistic Modeling and Optimization of Real-Time Protocol for Multifunction Vehicle Bus

    Lifan Su;Min Zhou;Hai Wan;Ming Gu;

    In this paper, we present the modeling and optimization of a Real-Time Protocol (RTP) used in Train Communication Networks (TCN). In the proposed RTP, message arbitration is represented by a probabilistic model and the number of arbitration checks is minimized by using the probability of device activity. Our optimized protocol is fully compatible with the original standard and can thus be implemented easily. The experimental results demonstrate that the proposed algorithm can reduce the number of checks by about 50%, thus significantly enhancing bandwidth.

    2016, Issue 05, v.21, pp. 561-569
  • HW/SW Co-optimization for Stencil Computation: Beginning with a Customizable Core

    Yanhua Li;Youhui Zhang;Weiming Zheng;

    Energy efficiency is one of the most important issues for High Performance Computing (HPC) today. A heterogeneous HPC platform with some energy-efficient customizable cores (as application-specific accelerators) is believed to be one of the promising solutions to meet ever-increasing computing needs and to overcome power density limitations. In this paper, we focus on using customizable processor cores to optimize typical stencil computations, which are the kernel of many high-performance applications. We develop a series of effective software/hardware co-optimization strategies to exploit instruction-level and memory-computation parallelism, as well as to decrease energy consumption. These optimizations include loop tiling, prefetching, cache customization, Single Instruction Multiple Data (SIMD), and Direct Memory Access (DMA), as well as necessary ISA extensions. Detailed tests of power efficiency are given to evaluate the effect of all these optimizations comprehensively. The results are impressive: the combination of these optimizations improved application performance by 341% while decreasing energy consumption by 35%; a preliminary comparison with X86, GPU, and FPGA platforms also showed that the design could achieve an order of magnitude higher performance efficiency. We believe this work can help in understanding sources of inefficiency in general-purpose chips and can serve as a starting point for customizing an energy-efficient CMP for further improvement.

    2016, Issue 05, v.21, pp. 570-580
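
Of the optimizations listed in the abstract above, loop tiling is the easiest to show in a few lines. The sketch below applies a 5-point stencil to a 2D grid twice, once with plain loops and once with the iteration space split into tiles, and both produce identical results. It is written in Python purely to show the loop structure; the paper targets customizable processor cores, where tiling pays off through cache reuse. The function names, the 5-point averaging kernel, and the tile size are assumptions.

```python
def stencil_naive(grid):
    """One sweep of a 5-point averaging stencil over the interior points."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]          # boundary values are copied unchanged
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            out[i][j] = 0.2 * (grid[i][j] + grid[i - 1][j] + grid[i + 1][j]
                               + grid[i][j - 1] + grid[i][j + 1])
    return out


def stencil_tiled(grid, tile=4):
    """Same sweep, with the interior visited tile by tile.

    Each tile touches a small block of rows and columns, so on real hardware
    the working set can stay in cache while the tile is processed.
    """
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for ii in range(1, n - 1, tile):                      # tile origin (rows)
        for jj in range(1, m - 1, tile):                  # tile origin (columns)
            for i in range(ii, min(ii + tile, n - 1)):    # points inside the tile
                for j in range(jj, min(jj + tile, m - 1)):
                    out[i][j] = 0.2 * (grid[i][j] + grid[i - 1][j] + grid[i + 1][j]
                                       + grid[i][j - 1] + grid[i][j + 1])
    return out


grid = [[float((3 * i + 5 * j) % 7) for j in range(10)] for i in range(9)]
print(stencil_tiled(grid) == stencil_naive(grid))
```

Because the output is written to a separate array, the tiles are independent and can be visited in any order, which is also what makes this loop nest a candidate for SIMD and DMA-based double buffering.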

  • Information for Contributors

    Tsinghua Science and Technology (Tsinghua Sci Technol), an academic journal sponsored by Tsinghua University, is published bimonthly. This journal aims to present up-to-date scientific achievements of high creativity and great significance in computer and electronic engineering. Contributions from all over the world are welcome. Tsinghua Sci Technol is indexed by SCI, Engineering Index (Ei, USA), INSPEC, SA, Cambridge Abstract, and other abstracting indexes.

    2016, Issue 05, v.21, p. 581