Tsinghua Science and Technology

SPECIAL SECTION ON DATA MINING

  • Mining Sensor Data in Cyber-Physical Systems

    Lu-An Tang;Jiawei Han;Guofei Jiang;

    A Cyber-Physical System (CPS) integrates physical devices (i.e., sensors) with cyber (i.e., informational) components to form a context-sensitive system that responds intelligently to dynamic changes in real-world situations. Such a system has wide applications in traffic control, battlefield surveillance, environmental monitoring, and so on. A core element of CPS is the collection and assessment of information from noisy, dynamic, and uncertain physical environments integrated with many types of cyber-space resources. The potential of this integration is unbounded. To achieve this potential, the raw data acquired from the physical world must be transformed into usable knowledge in real time. Therefore, CPS brings a new dimension to knowledge discovery because of the emerging synergism of the physical and the cyber. The various properties of the physical world must be addressed in information management and knowledge discovery. This paper discusses the problem of mining sensor data in CPS: given a large number of wireless sensors deployed in a designated area, the task is the real-time detection of intruders that enter the area, based on noisy sensor data. The IntruMine framework is introduced to discover intruders from untrustworthy sensor data. IntruMine first analyzes the trustworthiness of sensor data, then detects the intruders' locations, and verifies the detections based on a graph model of the relationships between sensors and intruders.

    2014, Issue 03, v.19, pp. 225-234
  • Activity Recognition with Smartphone Sensors

    Xing Su;Hanghang Tong;Ping Ji;

    The ubiquity of smartphones, together with their ever-growing computing, networking, and sensing powers, has been changing the landscape of people's daily life. Among others, activity recognition, which takes raw sensor readings as input and predicts a user's motion activity, has become an active research area in recent years. It is the core building block in many high-impact applications, ranging from health and fitness monitoring, personal biometric signature, urban computing, assistive technology, and elder-care, to indoor localization and navigation, etc. This paper presents a comprehensive survey of the recent advances in activity recognition with smartphone sensors. We start with the basic concepts such as sensors, activity types, etc. We review the core data mining techniques behind the mainstream activity recognition algorithms, analyze their major challenges, and introduce a variety of real applications enabled by activity recognition.

    2014, Issue 03, v.19, pp. 235-249
  • Efficient View-Based 3-D Object Retrieval via Hypergraph Learning

    Yue Gao;Qionghai Dai;

    View-based 3-D object retrieval has become an emerging topic in recent years, especially with the fast development of visual content acquisition devices, such as mobile phones with cameras. Extensive research efforts have been dedicated to this task, yet it remains difficult to measure the relevance between two objects with multiple views. In recent years, learning-based methods, such as graph-based learning, have been investigated for view-based 3-D object retrieval. The graph-based methods, however, suffer from the high computational cost of graph construction and the corresponding learning process. In this paper, we introduce a general framework to accelerate learning-based, view-based 3-D object matching in large-scale data. Given a query object Q and one object O from a 3-D dataset D, the first step is to extract a small set of candidate relevant 3-D objects for object O. Then multiple hypergraphs can be constructed based on this small set of 3-D objects, and learning on the fused hypergraph is conducted to generate the relevance between Q and O, which can be further used in the retrieval procedure. Experiments demonstrate the effectiveness of the proposed framework.

    2014, Issue 03, v.19, pp. 250-256
  • Semiparametric Preference Learning

    Yi Zhen;Yangqiu Song;Dit-Yan Yeung;

    Unlike traditional supervised learning problems, preference learning learns from data available in the form of pairwise preference relations between instances. Existing preference learning methods are either parametric or nonparametric in nature. We propose in this paper a semiparametric preference learning model, abbreviated as SPPL, with the aim of combining the strengths of the parametric and nonparametric approaches. SPPL uses multiple Gaussian processes which are linearly coupled to determine the preference relations between instances. SPPL is more powerful than previous models while keeping the computational complexity low (linear in the number of distinct instances). We devise an efficient algorithm for model learning. Empirical studies conducted on two real-world data sets show that SPPL outperforms related preference learning methods.

    2014, Issue 03, v.19, pp. 257-264
  • ACTPred: Activity Prediction in Mobile Social Networks

    Jibing Gong;Jie Tang;A.C.M. Fong;

    A current trend for online social networks is to turn mobile. Mobile social networks directly reflect our real social life, and therefore are an important source for analyzing and understanding the underlying dynamics of human behaviors (activities). In this paper, we study the problem of activity prediction in mobile social networks. We present a series of observations in two real mobile social networks and then propose a method, ACTPred, based on a dynamic factor-graph model for modeling and predicting users' activities. An approximate algorithm based on mean fields is presented to learn the proposed model efficiently. We deploy a real system to collect users' mobility behaviors and validate the proposed method on two collected mobile datasets. Experimental results show that the proposed ACTPred model can achieve better performance than baseline methods.

    2014, Issue 03, v.19, pp. 265-274
  • An Integrated Workflow for Proteome-Wide Off-Target Identification and Polypharmacology Drug Design

    Thomas Evangelidis;Lei Xie;

    Polypharmacology, which focuses on designing drugs to target multiple receptors, has emerged as a new paradigm in drug discovery. To rationally design multi-target drugs, it is fundamental to understand protein-ligand interactions on a proteome scale. We have developed a Proteome-wide Off-target Pipeline (POP) that integrates ligand binding site analysis, protein-ligand docking, the statistical analysis of docking scores, and electrostatic potential calculations. The utility of POP is demonstrated by a case study in which the molecular mechanism of the anti-cancer effect of Nelfinavir is hypothesized. By combining structural proteome-wide off-target identification and systems biology, it is possible to correlate drug perturbations with clinical outcomes.

    2014, Issue 03, v.19, pp. 275-284
  • Multiple-Instance Learning with Instance Selection via Constructive Covering Algorithm

    Yanping Zhang;Heng Zhang;Huazhen Wei;Jie Tang;Shu Zhao;

    Multiple-Instance Learning (MIL) is used to predict the labels of unlabeled bags by learning from labeled positive and negative training bags. Each bag is made up of several unlabeled instances. A bag is labeled positive if at least one of its instances is positive; otherwise it is labeled negative. Existing multiple-instance learning methods with instance selection ignore the representative degree of the selected instances. For example, if an instance has many similar instances with the same label around it, the instance should be more representative than others. Based on this idea, this paper proposes multiple-instance learning with instance selection via a constructive covering algorithm (MilCa). In MilCa, we first use the maximal Hausdorff distance to select some initial positive instances from positive bags, then use a Constructive Covering Algorithm (CCA) to restructure the original instances of negative bags. An inverse testing process is then employed to exclude the false positive instances from positive bags and to select the instances with a high representative degree, ordered by the number of covered instances from training bags. Finally, a similarity measure function is used to convert each training bag into a single sample, and CCA is used again to classify the converted samples. Experimental results on synthetic data and standard benchmark datasets demonstrate that MilCa can decrease the number of selected instances and is competitive with state-of-the-art MIL algorithms.

    2014, Issue 03, v.19, pp. 285-292
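
    The bag-labeling rule and the Hausdorff distance mentioned in the abstract above can be illustrated with a minimal sketch. The function names and the toy bags are assumptions made for illustration; the abstract does not specify how MilCa applies the maximal Hausdorff distance when selecting initial positive instances, so only the standard definitions are shown.

```python
import math


def bag_label(instance_labels):
    """Standard MIL rule: a bag is positive (1) if any instance is positive."""
    return 1 if any(instance_labels) else 0


def directed_hausdorff(bag_a, bag_b):
    """Largest distance from a point in bag_a to its nearest point in bag_b."""
    return max(min(math.dist(a, b) for b in bag_b) for a in bag_a)


def max_hausdorff(bag_a, bag_b):
    """Maximal (symmetric) Hausdorff distance between two bags of points."""
    return max(directed_hausdorff(bag_a, bag_b),
               directed_hausdorff(bag_b, bag_a))


# Hypothetical toy bags of 2-D instances.
bag_1 = [(0.0, 0.0), (1.0, 0.0)]
bag_2 = [(0.0, 0.0), (4.0, 0.0)]
print(bag_label([0, 0, 1]), bag_label([0, 0]))
print(max_hausdorff(bag_1, bag_2))
```

    Here the instance labels are given only to demonstrate the rule; in an actual MIL setting they are unknown and only the bag labels are observed.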
  • Personalized Recommendation Algorithm Based on Preference Features

    Liang Hu;Guohang Song;Zhenzhen Xie;Kuo Zhao;

    A hybrid collaborative filtering algorithm based on user preferences and item features is proposed. A thorough investigation of Collaborative Filtering (CF) techniques preceded the development of this algorithm. The proposed algorithm improves the user-item similarity approach by extracting item features and applying a weight to each item feature to distinguish the different item features. User preferences for different item features are obtained from user evaluations of the items. It is expected that providing recommendations according to preferences and features would improve the accuracy and efficiency of recommendations and also make it easier to deal with data sparsity. In addition, it is expected that the potential semantics of the user evaluation model would be revealed, which would explain the recommendation results and increase accuracy. A portion of the MovieLens database was used in a comparative experiment between the proposed algorithm and two existing approaches, i.e., the collaborative filtering algorithm based on the item and the collaborative filtering algorithm based on the item feature. The Mean Absolute Error (MAE) was used to measure performance. The experimental results show that the proposed personalized recommendation algorithm based on preference features significantly improves the accuracy of evaluation predictions compared to the two previous approaches.

    2014, Issue 03, v.19, pp. 293-299
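
    The Mean Absolute Error used as the performance measure above is the average absolute difference between predicted and actual ratings. A minimal sketch follows; the function name and the sample ratings are hypothetical and are not taken from the paper's MovieLens experiment.

```python
def mean_absolute_error(predicted, actual):
    """MAE = (1/N) * sum(|predicted_i - actual_i|) over N rated items."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical predicted and actual ratings on a 1-5 scale.
predicted_ratings = [4.0, 3.0, 5.0]
actual_ratings = [5.0, 3.0, 3.0]
print(mean_absolute_error(predicted_ratings, actual_ratings))
```

    A lower MAE means the predicted ratings are closer to the users' actual ratings.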

REGULAR ARTICLES

  • A Generalized Comfort Function of Subway Systems Based on a Nested Logit Model

    Yichen Zheng;Wei Guo;Yi Zhang;Jianming Hu;

    With rapid urbanization, subway systems are widely acknowledged as one of the best solutions to urban transportation problems. The operators or managers of subway systems should pay more attention to passengers' perceptions of service quality to maintain their competitive position. Taking the traffic state, efficiency, and environmental impact into consideration, the concept of generalized comfort is proposed in this paper. Based on a nested logit model, the selection probability for each factor in a generalized comfort function can be estimated using a nested structure. A factor is considered more significant in the generalized comfort function than others when its corresponding probability is higher. Using stated preference and revealed preference data about passenger travel behavior obtained from the Beijing subway, the parameters of the generalized comfort function are estimated by maximum likelihood techniques.

    2014, Issue 03, v.19, pp. 300-306
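
    As a rough illustration of how selection probabilities arise from a nested structure, the sketch below evaluates the textbook two-level nested logit formula. The nest names, factor names, utilities, and scale parameters are all hypothetical; the paper's actual nesting and its estimated parameters are not given in the abstract.

```python
import math


def nested_logit_probabilities(nests, scales):
    """Two-level nested logit.

    nests:  {nest_name: {alternative: utility V}}
    scales: {nest_name: lambda}, typically in (0, 1]
    Returns {alternative: P(nest) * P(alternative | nest)}.
    """
    within = {}
    inclusive = {}
    for name, utilities in nests.items():
        lam = scales[name]
        exps = {alt: math.exp(v / lam) for alt, v in utilities.items()}
        total = sum(exps.values())
        within[name] = {alt: e / total for alt, e in exps.items()}
        inclusive[name] = math.log(total)  # inclusive value (log-sum)

    weights = {name: math.exp(scales[name] * inclusive[name]) for name in nests}
    denom = sum(weights.values())

    probs = {}
    for name in nests:
        p_nest = weights[name] / denom
        for alt, p_within in within[name].items():
            probs[alt] = p_nest * p_within
    return probs


# Hypothetical nests and utilities, for illustration only.
example_nests = {
    "traffic": {"crowding": 1.0, "delay": 0.5},
    "environment": {"noise": 0.2},
}
example_scales = {"traffic": 0.6, "environment": 1.0}
example_probs = nested_logit_probabilities(example_nests, example_scales)
print(example_probs)
```

    When every scale parameter equals 1, the formula reduces to the ordinary multinomial logit, which is a convenient sanity check.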
  • The Extended Linear-Drift Model of Memristor and Its Piecewise Linear Approximation

    Xiaomu Mu;Juntang Yu;Shuning Wang;

    The memristor was introduced as the fourth basic circuit element. It exhibits great potential for numerous applications, such as emulating synapses, but its mathematical model is still an open subject. In the linear-drift model, the boundary condition of the device is not considered. This paper proposes an extended linear-drift model of the memristor. The extended model keeps the linear characteristic and simplicity of the linear-drift model while also considering the boundary condition of the device. A piecewise linear approximation of the extended linear-drift model is given. Both models are suitable for describing the memristor.

    2014, Issue 03, v.19, pp. 307-313
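
    For context, the widely used linear-drift memristor model can be simulated in a few lines. The sketch below integrates the state equation with forward Euler and, as a stand-in boundary treatment, clips the normalized state to [0, 1]. The clipping is an assumption made for illustration and is not the extended model proposed in the paper; the parameter values are likewise illustrative.

```python
def simulate_linear_drift(current, dt, steps, x0=0.5,
                          r_on=100.0, r_off=16000.0, d=1e-8, mu_v=1e-14):
    """Forward-Euler simulation of the linear-drift memristor model.

    State x = w / D (normalized doped-region width), with
        dx/dt = mu_v * R_on / D**2 * i(t)
        M(x)  = R_on * x + R_off * (1 - x)
    Hard clipping of x to [0, 1] is an illustrative boundary treatment only.
    Returns a list of (x, memristance) samples, one per time step.
    """
    k = mu_v * r_on / d ** 2
    x = x0
    history = []
    for n in range(steps):
        memristance = r_on * x + r_off * (1.0 - x)
        history.append((x, memristance))
        x = min(1.0, max(0.0, x + k * current(n * dt) * dt))
    return history


# Constant 1 mA drive: the state drifts to the upper boundary and stays there.
trace = simulate_linear_drift(lambda t: 1e-3, dt=1e-3, steps=200)
print(trace[0], trace[-1])
```

    Without the clipping step, the state would leave the physically meaningful range [0, 1], which is the shortcoming of the plain linear-drift model that the abstract refers to.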
  • A New Algorithm for the Establishing Data Association Between a Camera and a 2-D LIDAR

    Lipu Zhou;Zhidong Deng;

    In this paper, we propose a new algorithm to establish the data association between a camera and a 2-D LIght Detection And Ranging (LIDAR) sensor. In contrast to previous works, where data association is established by calibrating the intrinsic parameters of the camera and the extrinsic parameters of the camera and the LIDAR, we formulate the map between laser points and pixels as a 2-D homography. The line-point correspondence is employed to construct a geometric constraint on the homography matrix. This makes a checkerboard unnecessary: any object with a straight boundary can be an effective target. The calculation of the 2-D homography matrix consists of a linear least-squares solution of a homogeneous system followed by a nonlinear minimization of the geometric error in the image plane. Since the measurement quality affects the accuracy of the result, we investigate the equivalent constraint and show that placing the calibration target near the 2-D LIDAR will provide sufficient constraints to calculate the 2-D homography matrix. Simulation and experimental results validate that the proposed algorithm is robust and accurate. Compared with previous works, which require two calibration processes and special calibration targets such as a checkerboard, our method is more flexible and easier to perform.

    2014, Issue 03, v.19, pp. 314-322
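
    The line-point constraint described above can be written as l^T H p = 0, where p is a laser point in homogeneous coordinates, H is the 3x3 homography, and l is the image line on which the mapped point must lie. The sketch below only evaluates this residual for a hypothetical H and hypothetical laser points; it does not implement the paper's least-squares estimation or the nonlinear refinement.

```python
def apply_homography(h, laser_point):
    """Map a 2-D laser point (x, y) to homogeneous pixel coordinates H @ (x, y, 1)."""
    x, y = laser_point
    return tuple(row[0] * x + row[1] * y + row[2] for row in h)


def cross(a, b):
    """Cross product of two 3-vectors; yields the line through two homogeneous points."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_point_residual(line, h, laser_point):
    """Residual l^T H p; zero when the mapped laser point lies on the image line."""
    q = apply_homography(h, laser_point)
    return sum(l * c for l, c in zip(line, q))


# Hypothetical homography and laser points on a straight target boundary.
H = [[2.0, 0.0, 1.0],
     [0.0, 3.0, 2.0],
     [0.0, 0.0, 1.0]]
p1, p2, p3 = (0.0, 0.0), (1.0, 1.0), (2.0, 2.0)

# Image line through the projections of p1 and p2.
image_line = cross(apply_homography(H, p1), apply_homography(H, p2))
print(line_point_residual(image_line, H, p3))          # collinear laser point
print(line_point_residual(image_line, H, (2.0, 0.0)))  # point off the boundary
```

    Each such correspondence contributes one linear equation in the nine entries of H, which is why stacking enough of them yields a homogeneous linear system.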

  • Information for Contributors

    Tsinghua Science and Technology (Tsinghua Sci Technol), an academic journal sponsored by Tsinghua University, is published bimonthly. The journal aims to present up-to-date scientific achievements of high creativity and great significance in computer and electronic engineering. Contributions from all over the world are welcome. Tsinghua Sci Technol is indexed by IEEE Xplore, Engineering Index (Ei, USA), INSPEC, SA, Cambridge Abstract, and other abstracting indexes.

    2014, Issue 03, v.19, p. 323