Advances in computers and biotechnology have had a profound impact on biomedical research, and as a result complex data sets can now be generated to address extremely complex biological questions. Correspondingly, advances in the statistical methods necessary to analyze such data are following closely behind the advances in data generation methods. The statistical methods required by bioinformatics present many new and difficult problems for the research community.
This book provides an introduction to some of these new methods. The main biological topics treated include sequence analysis, BLAST, microarray analysis, gene finding, and the analysis of evolutionary processes. The main statistical techniques covered include hypothesis testing and estimation, Poisson processes, Markov models and Hidden Markov models, and multiple testing methods.
The second edition features new chapters on microarray analysis and on statistical inference, including a discussion of ANOVA, and discussions of the statistical theory of motifs and methods based on the hypergeometric distribution. Much material has been clarified and reorganized.
The book is written so as to appeal to biologists and computer scientists who wish to know more about the statistical methods of the field, as well as to trained statisticians who wish to become involved with bioinformatics. The earlier chapters introduce the concepts of probability and statistics at an elementary level, but with an emphasis on material relevant to later chapters and often not covered in standard introductory texts. Later chapters should be immediately accessible to the trained statistician. Sufficient mathematical background consists of introductory courses in calculus and linear algebra. The basic biological concepts that are used are explained, or can be understood from the context, and standard mathematical concepts are summarized in an Appendix. Problems are provided at the end of each chapter allowing the reader to develop aspects of the theory outlined in the main text.
Warren J. Ewens holds the Christopher H. Brown Distinguished Professorship at the University of Pennsylvania. He is the author of two books, Population Genetics and Mathematical Population Genetics. He is a senior editor of Annals of Human Genetics and has served on the editorial boards of Theoretical Population Biology, GENETICS, Proceedings of the Royal Society B and SIAM Journal in Mathematical Biology. He is a fellow of the Royal Society and the Australian Academy of Science.
Gregory R. Grant is a senior bioinformatics researcher in the University of Pennsylvania Computational Biology and Informatics Laboratory. He obtained his Ph.D. in number theory from the University of Maryland in 1995 and his Master's in Computer Science from the University of Pennsylvania in 1999.
Comments on the first edition:
"This book would be an ideal text for a postgraduate course?[and] is equally well suited to individual study?. I would recommend the book highly." (Biometrics)
"Ewens and Grant have given us a very welcome introduction to what is behind those pretty [graphical user] interfaces." (Naturwissenschaften)
"The authors do an excellent job of presenting the essence of the material without getting bogged down in mathematical details." (Journal American Statistical Association)
"The authors have restructured classical material to a great extent and the new organization of the different topics is one of the outstanding services of the book." (Metrika)
From the reviews of the second edition:
"Overall, Ewens and Grant have constructed a needed book in bioinformatics. It should help statisticians understand the emerging field of bioinformatics and serve as an introduction to bioinformatics for a statistician." Journal of the American Statistical Association, March 2006
"This book is the second edition of a book that was based on the content of a two-semester course in bioinformatics and computational biology ? . is one of the most important books in this area from the perspective of teaching final year undergraduates and post-graduates in a range of disciplines. ? this is a very good book, the best currently available for undergraduates and post-graduates at the intersection of computational biology, bioinformatics, statistics and applied mathematics and a worthwhile improvement on the first edition." (Mark Broom, Journal of the Royal Statistical Society, Vol. 169 (1), 2006)
"This is the second edition of Ewens and Grant's very well written book on statistical methods in bioinformatics. ? The authors have presented an excellent text for a graduate course ? . It is clearly and interestingly written and is well organized and has comprehensive references to the literature. The writing style is excellent ? . It is ? truly a reference book for statistical methods in bioinformatics ? . So I strongly recommend the book to both molecular biologists and statisticians ? ." (Hamid Pezeshk, ISCB Newsletter, Issue 42, 2006)
"Ewens and Grant aim to fill a gap in the literature on statistics and probability in bioinformatics. ? provides a review of the use of familiar statistical techniques and approaches to a new area. ? it provides a rigorous treatment of statistical issues associated with bioinformatics tools and a strong statement of the statistical principles and philosophy which needs to underpin these tools. It admirably meets its objectives in thisrespect and is to be recommended." (David Lovell, Pharmaceutical Statistics, Issue 6, 2007)
"The most impressive achievement of this book is its development of blast theory. ? The authors pace the knowledge flow smoothly. ? The examples and exercises are well thought and highly motivated ? . The authors do a fine job of emphasising the false discovery rate ? . This book is structured perfectly for a textbook for everyone, statisticians, biologists and computer scientists. ? I think this book does an excellent job in introducing many exciting statistical theories." (Lang Li, Briefings in Bioinformatics, Vol. 6 (4), 2005)
"In this book, Ewens and Grant seek to provide a link between bioinformatics and applied statistics. ? The book provides detailed discussions of a number of useful distributions and highlights their role in bioinformatics. I found it quite useful and easy to follow. It is a good reference for multidisciplinary research teams in bioinformatics and students on some specialised taught courses." (Kassim S. Mwitondi, Journal of Applied Statistics, Vol. 33 (8), September, 2006)