- Yi Pan;
Bioinformatics and computational biology research is fundamental to our understanding of complex biological systems, impacting the science and technology of fields ranging from agricultural and environmental sciences to pharmaceutical and medical sciences. It has been one of the fastest developing research fields of the last two decades. High-throughput biological data that provide information at the molecular and genetic level are being generated rapidly. Almost all research problems in the biological and medical sciences nowadays are computationally hard. Computational techniques
2013, Issue 05, v.18, pp. 429-430
- Anuj Srivastava; Xiaoyu Zhang; Sal LaMarca; Liming Cai; Russell L. Malmberg;
Dynamic regulation and packaging of genetic information is achieved by the organization of DNA into chromatin. Nucleosomal core histones, which form the basic repeating unit of chromatin, are subject to various post-translational modifications such as acetylation, methylation, phosphorylation, and ubiquitinylation. These modifications affect chromatin structure and, along with DNA methylation, regulate gene transcription. The goal of this study was to determine whether patterns in modifications were related to different categories of genomic features and, if so, whether the patterns had predictive value. We used publicly available ChIP-chip data for different types of histone modifications (methylation and acetylation) and for DNA methylation in Arabidopsis thaliana, and then applied a machine learning approach (a support vector machine) to demonstrate that patterns of these modifications differ markedly among categories of genomic features (protein, RNA, pseudogene, and transposon elements). These patterns can be used to distinguish the types of genomic features. DNA methylation and H3K4me3 methylation emerged as the features with the most discriminative power. From our analysis of Arabidopsis, we were able to predict 33 novel genomic features, whose existence was also supported by analysis of RNA-seq experiments. In summary, we present a novel approach that can be used to discriminate or detect different categories of genomic features based on their patterns of chromatin modification and DNA methylation.
2013, Issue 05, v.18, pp. 431-440
- Chang Liu; Youping Deng; Leilei Wang; Yong Mei; Rui Zhang;
To evaluate the early diagnostic value of circulating miRNA-21 in lung cancer, databases such as Wan Fang, VIP, PubMed, and Elsevier were systematically searched from 2005 to 2013 to collect relevant references in which the diagnostic value had been evaluated. The statistics were consolidated and the qualities of the studies were classified. The data were analyzed using Meta-DiSc 1.4 software. The diagnostic value of circulating miRNA-21 in lung cancer was assessed by pooling sensitivity, specificity, the likelihood ratios, and the Summary Receiver Operating Characteristic (SROC) curve. Publication biases of the studies involved were analyzed using Stata 11.0 software. A total of 143 papers were collected, of which 8 were included, containing 600 cases and 440 controls. A heterogeneity test proved the existence of homogeneity in this study. Upon analysis using random effects models, the weighted sensitivity was 0.68, the specificity 0.77, the positive likelihood ratio 2.84, the negative likelihood ratio 0.40, and the SROC Area Under the Curve (AUC) 0.8133. Further analysis by subgroup showed that the 5 indicators mentioned above were 0.72, 0.84, 4.50, 0.27, and 0.8987, respectively, for the serum group and 0.63, 0.70, 1.95, 0.53, and 0.7318, respectively, for the plasma group. We conclude that circulating miRNA-21 can be regarded as a valuable reference in the diagnosis of lung cancer. This research showed that, in lung cancer, the early diagnostic value of miRNA-21 in serum was better than that in plasma.
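The relationship between the indicators reported above can be illustrated with a minimal sketch. It assumes the textbook single-table definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity; the function name `likelihood_ratios` is hypothetical. Meta-analysis software pools likelihood ratios across studies rather than deriving them from the pooled sensitivity and specificity, so the values computed here only approximate the reported ones and are not a reproduction of the authors' analysis.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from one sensitivity/specificity pair.

    Assumes the standard definitions for a single 2x2 table; this is
    not how pooled likelihood ratios are computed in a meta-analysis.
    """
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must lie in [0, 1]")
    if not 0.0 < specificity < 1.0:
        raise ValueError("specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Pooled sensitivity and specificity as reported in the abstract.
groups = {
    "overall": (0.68, 0.77),
    "serum": (0.72, 0.84),
    "plasma": (0.63, 0.70),
}

for name, (sens, spec) in groups.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{name}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

A larger LR+ and a smaller LR- both indicate a more informative test, which is consistent with the abstract's conclusion that serum performs better than plasma.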
2013, Issue 05, v.18, pp. 441-445
- Hongjie Yu; Deshuang Huang;
Numerical characterizations of DNA sequences can facilitate the analysis of similar sequences. To visualize and compare different DNA sequences in less space, a novel descriptor extraction approach was proposed for numerical characterization and similarity analysis of sequences. Initially, a transformation method was introduced to represent each DNA sequence with a dinucleotide physicochemical property matrix. Then, based on approximate joint diagonalization theory, an eigenvalue vector was extracted from each DNA sequence, which could be considered a descriptor of that sequence. Moreover, similarity analyses were performed by calculating the pairwise distances among the obtained eigenvalue vectors. The results show that the proposed approach can capture more sequence information and can analyze the information contained in all of the sequences involved jointly rather than separately. Its effectiveness was demonstrated intuitively by constructing a dendrogram for 15 beta-globin gene sequences.
2013, Issue 05, v.18, pp. 446-453
- Wei Liu; Ling Chen;
The identification of communities is imperative to understanding network structures and functions. Community detection algorithms can determine the community structure of biological networks, which is helpful in analyzing their topological structures and predicting their behaviors. In this paper, we analyze the diseasome network using a new method, a disease-gene network detecting algorithm based on principal component analysis, which can be used to investigate the connection between nodes within the same group. Experimental results on real-world networks demonstrate that our algorithm is more efficient in detecting community structures when compared with other well-known results.
2013, Issue 05, v.18, pp. 454-461
- Sheng-You Huang; Gordon K. Springer;
It has been well accepted that the folding energy landscape may resemble a funnel according to the theory of protein folding. This "folding funnel" theory has been extensively studied and is thought to play an important role in guiding the sampling process of protein folding and refinement in protein structure prediction. Here, we have investigated the relationship between the "funnel likeness" of protein folding and the size/structure of the proteins, based on a set of non-homologous proteins we have recently evaluated using a statistical mechanics-based scoring function, ITScorePro. It was found that larger proteins that consist of more helix/sheet structures tend to have a higher score-Root Mean Square Deviation (RMSD) correlation (or a more funnel-like energy landscape). Another measurement in protein folding, the Z-score, has also shown some correlation with the size of the proteins. As expected, proteins with a better "folding funnel likeness" (or score-RMSD correlation) tend to have a better-predicted conformation with a lower RMSD from their native structures. These findings can be extremely valuable for the development and improvement of sampling and scoring algorithms for protein structure prediction.
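The two measurements named above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: "funnel likeness" is taken to be the Pearson correlation between decoy scores and decoy RMSDs, the Z-score is taken to be (native score - mean decoy score) / standard deviation of decoy scores, lower scores are assumed to be better, and all function names and numbers are hypothetical toy data.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def native_z_score(native_score, decoy_scores):
    """Number of standard deviations the native score lies from the decoy mean."""
    mean = statistics.fmean(decoy_scores)
    spread = statistics.pstdev(decoy_scores)
    return (native_score - mean) / spread


# Hypothetical decoy set: RMSD from the native structure and the score of each decoy.
decoy_rmsd = [1.0, 2.0, 3.0, 4.0, 5.0]
decoy_score = [-10.0, -8.0, -6.5, -4.0, -2.0]

print("score-RMSD correlation:", round(pearson(decoy_score, decoy_rmsd), 3))
print("native Z-score:", round(native_z_score(-12.0, decoy_score), 3))
```

Under these assumptions, a correlation close to 1 means that scores worsen steadily as decoys move away from the native structure (a funnel-like landscape), and a strongly negative Z-score means the native structure is well separated from the decoys.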
2013, Issue 05, v.18, pp. 462-468
- Wooyoung Kim; Martin Diko; Keith Rawson;
A network motif is defined as a frequent and unique subgraph pattern in a network. Searching for motifs involves counting all possible instances or listing all patterns, testing isomorphism, which is known to be NP-hard, and repeating the process many times for statistical evaluation. Although many efficient algorithms have been introduced, exhaustive search methods are still infeasible, and feasible approximation methods are as yet implausible. Additionally, the fast and continual growth of biological networks makes the problem more challenging. As a consequence, parallel algorithms have been developed, and distributed computing has been tested in the cloud computing environment as well. In this paper, we survey current algorithms for network motif detection and existing software tools. Then, we show that some methods have been utilized for parallel network motif search algorithms with static or dynamic load balancing techniques. With the advent of cloud computing services, network motif search has been implemented with MapReduce on the Hadoop Distributed File System (HDFS), and with Storm, but without statistical testing. In this paper, we survey network motif search algorithms in general, including existing parallel methods as well as cloud computing based search, and show the promising potential of cloud computing based motif search methods.
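The cost structure described above (exhaustive counting followed by repeated counting on randomized networks) can be illustrated with a deliberately small sketch. It is not one of the surveyed algorithms: it only counts triangles, and it assumes a simple null model of random graphs with the same number of nodes and edges, whereas motif tools typically use degree-preserving randomization. All function names and the toy graph are hypothetical.

```python
import itertools
import random
import statistics


def count_triangles(nodes, edges):
    """Exhaustively count 3-node cliques in an undirected graph."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Every 3-node combination is inspected, which is why exhaustive
    # search becomes infeasible as the network or motif size grows.
    return sum(
        1
        for a, b, c in itertools.combinations(sorted(adj), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )


def triangle_z_score(nodes, edges, n_random=200, seed=0):
    """Compare the observed triangle count against random graphs.

    Null model (an assumption): graphs with the same node set and the
    same number of edges, chosen uniformly at random.
    """
    nodes = list(nodes)
    rng = random.Random(seed)
    all_pairs = list(itertools.combinations(sorted(nodes), 2))
    observed = count_triangles(nodes, edges)
    null_counts = [
        count_triangles(nodes, rng.sample(all_pairs, len(edges)))
        for _ in range(n_random)
    ]
    spread = statistics.pstdev(null_counts)
    if spread == 0:
        return observed, None  # z-score undefined without variation
    return observed, (observed - statistics.fmean(null_counts)) / spread


# Toy network: two triangles joined by a single bridging edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
observed, z = triangle_z_score(range(6), toy_edges)
print("triangles:", observed, "z-score:", z)
```

The `n_random` repetitions are independent of one another, which is the part of the workload that parallel and cloud-based approaches can distribute.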
2013, Issue 05, v.18, pp. 469-489
- Feng Shi; Qilong Feng; Jianer Chen; Lusheng Wang; Jianxin Wang;
Phylogenetic trees have been widely used in the study of evolutionary biology for representing the tree-like evolution of a collection of species. However, different data sets and different methods often lead to the construction of different phylogenetic trees for the same set of species. Therefore, comparing these trees to determine similarities or, equivalently, dissimilarities becomes a fundamental issue. Typically, the Tree Bisection and Reconnection (TBR) and Subtree Prune and Regraft (SPR) distances have been proposed to facilitate the comparison between different phylogenetic trees. In this paper, we give a survey of computational complexity, fixed-parameter algorithms, and approximation algorithms for computing the TBR and SPR distances of phylogenetic trees.
2013, Issue 05, v.18, pp. 490-499
- Yiming He; Zhen Zhang; Xiaoqing Peng; Fangxiang Wu; Jianxin Wang;
The recent breakthroughs in next-generation sequencing technologies, such as those of Roche 454, Illumina/Solexa, and ABI SOLiD, have dramatically reduced the cost of producing short reads of the genome of new species. The huge volume of reads, along with short read length, high coverage, and sequencing errors, poses a great challenge to de novo genome assembly. However, the paired-end information provides a new solution to these problems. In this paper, we review and compare some current assembly tools, including Newbler, CAP3, Velvet, SOAPdenovo, AllPaths, Abyss, IDBA, PE-Assembly, and Telescoper. In general, we compare the seed extension methods and the graph-based methods that use the overlap/layout/consensus approach and the de Bruijn graph approach for assembly. At the end of the paper, we summarize these methods and discuss the future directions of genome assembly.
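The de Bruijn graph approach mentioned above can be sketched in a few lines. This is a minimal illustration, not the data structure of any of the listed assemblers: it assumes error-free reads, ignores reverse complements and paired-end information, and the function name `de_bruijn_graph` and the toy reads are hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph from reads.

    Each k-mer contributes one directed edge from its (k-1)-mer prefix
    to its (k-1)-mer suffix. Repeated edges are kept, so the number of
    copies of an edge reflects how often that k-mer was observed.
    """
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# Two overlapping toy reads sampled from the sequence ACGTCA.
toy_reads = ["ACGTC", "CGTCA"]
for prefix, suffixes in sorted(de_bruijn_graph(toy_reads, k=3).items()):
    print(prefix, "->", suffixes)
```

Because overlaps are encoded implicitly by shared (k-1)-mers, no pairwise read comparison is needed, which is the main contrast with the overlap/layout/consensus approach.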
2013, Issue 05, v.18, pp. 500-514
- Jiang Xie; Zhonghua Zhou; Kai Lu; Luonan Chen; Wu Zhang;
Similarities and dissimilarities between biomolecular networks cannot be intuitively recognized, even after the development of several comparison algorithms, because of the lack of visualization tools. In this paper, an integrated tool kit named Biomolecular Network Match (BNMatch) is designed and developed based on Cytoscape, a popular open-source tool for analyzing and visualizing networks. BNMatch integrates the comparison of the outputs of algorithms used for processing biomolecular networks and expresses the matching data between them by defining similar vertices and links with similar attributes. Moreover, to maintain consistency, when the nodes and edges in one of the compared networks are changed, their counterparts in the other networks change as well. With BNMatch, users can easily analyze similar networks by invoking comparison algorithms and visualizing the matching data between the networks.
2013, Issue 05, v.18, pp. 515-521
- Wei Hu;
A novel avian-origin H7N9 influenza virus was discovered in March in China and had caused a total of 131 human infections, including 39 deaths, in China as of June 9, 2013. For avian viruses to adapt to infect humans efficiently, the binding of the viral hemagglutinin (HA) must switch from avian-type to human-type receptors with the help of some mutations in HA. It is therefore critical for pandemic assessment to discover these mutations as hallmarks of adaptation. To continue our previous study of this novel H7N9 virus, we identified two sets of mutations in HA. The first set of mutations is present in the currently circulating strains of 2013 H7N9 in China, and the second set consists of potential mutations that were found by comparison with the HAs of previous human H7 subtypes. These two sets of mutations exhibited unique features. The first group of mutations, on average, enhanced HA binding to human-type receptors while reducing binding to avian types. Further, the reduction of avian binding was almost three times the increase in human binding. The second group increased binding to both human and avian types, but the increase for human types was almost three times that for avian types. Though different in the way they change the binding preference, both sets contained more mutations that decrease avian binding and increase human binding than mutations that do the opposite. Our research highlighted the pandemic potential of this novel virus by showing the important mutations that could potentially help it adapt to human hosts. Our findings offer new insights into the current state of evolution of this virus, which might be helpful for the continued surveillance of the emergence of H7N9 strains capable of human-to-human transmission.
2013, Issue 05, v.18, pp. 522-529
- Yuan Zhang; Nan Du; Kang L; Kebin Jia; Aidong Zhang;
Current methods for the detection of differential gene expression focus on finding individual genes that may be responsible for certain diseases or responses to external irritants. However, for common genetic diseases, multiple genes and their interactions should be understood and treated together during the exploration of disease causes and possible drug design. The present study focuses on analyzing the dynamic patterns of co-regulated modules during biological progression and determining those with remarkably varying activities, using the yeast cell cycle as a case study. We first constructed dynamic active protein-protein interaction networks by modeling the activity of proteins and assembling the dynamic co-regulation protein network at each time point. The dynamic active modules were detected using a method based on the Bayesian graphical model, and then the modules with the most varied dispersion of clustering coefficients, which could be responsible for the dynamic mechanism of the cell cycle, were identified. Comparison of our results with those of state-of-the-art functional module detection methods, and validation of the ranking of functional module activities using GO annotations, demonstrate the efficacy of our method for narrowing the scope of possible essential responding modules, which could provide multiple targets for biologists to validate experimentally.
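The quantity used for ranking above, the variation of clustering coefficients over time, can be sketched as follows. This is an assumption-laden illustration rather than the authors' method: it uses the standard local clustering coefficient, summarizes a module by the mean coefficient of its nodes at each time point, and measures dispersion as the population variance of that mean across time points. The function names and the two toy snapshots are hypothetical.

```python
import statistics


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u in neighbors for v in neighbors if u < v and v in adj[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean local clustering coefficient over all nodes of a module."""
    return statistics.fmean([clustering_coefficient(adj, n) for n in adj])


def clustering_dispersion(snapshots):
    """Variance of a module's average clustering across time points."""
    return statistics.pvariance([average_clustering(adj) for adj in snapshots])


# Hypothetical module observed at two time points: a triangle, then a path.
t0 = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
t1 = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}
print("dispersion:", clustering_dispersion([t0, t1]))
```

Under these assumptions, a module whose internal connectivity changes sharply between time points receives a high dispersion and would rank as more dynamically active.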
2013, Issue 05, v.18, pp. 530-540
-
Tsinghua Science and Technology began publication in 1996. Since then, it has been an international academic journal sponsored by Tsinghua University and published bimonthly. This journal aims at presenting state-of-the-art scientific achievements in computer science and other IT fields. One paper on Cloud Computing published in Vol. 18, Issue 1, 2013, has been ranked No. 1 on the IEEE download list continuously for five months: http://ieeexplore.ieee.org/xpl/browsePopular.jsp?topArticlesDate=August+2013. This special issue on Cloud Computing and Big Data of Tsinghua Science and Technology is devoted to gather
2013, Issue 05, v.18, p. 541
Tsinghua Science and Technology began publication in 1996. Since then, it has been an international academic journal sponsored by Tsinghua University and published bimonthly. This journal aims at presenting state-of-the-art scientific achievements in computer science and other IT fields. One paper on Cloud Computing published in Vol. 18, Issue 1, 2013, has been ranked No. 1 on the IEEE download list continuously for five months: http://ieeexplore.ieee.org/xpl/browsePopular.jsp?topArticlesDate=August+2013.
2013, Issue 05, v.18, p. 542
-
Tsinghua Science and Technology (Tsinghua Sci Technol), an academic journal sponsored by Tsinghua University, is published bimonthly. This journal aims at presenting up-to-date scientific achievements of high creativity and great significance in computer and electronic engineering. Contributions from all over the world are welcome. Tsinghua Sci Technol is indexed by IEEE Xplore, Engineering Index (Ei, USA), INSPEC, SA, Cambridge Abstract
2013, Issue 05, v.18, p. 543