Title: Pro Hadoop Data Analytics: Designing and Building Big Data Systems using the Hadoop Ecosystem (Apress, 2017)
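Description: Is Hadoop really dead? The author does not think so. The book covers Hadoop-related technologies in detail and introduces Hadoop-based data analysis methods. These are not repeated here; see the English introduction.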
Introduction
The Apache Hadoop software library has come into its own. It is the basis for advanced distributed development for a host of companies, government institutions, and scientific research facilities. The Hadoop ecosystem now contains dozens of components for everything from search, databases, and data warehousing to image processing, deep learning, and natural language processing. With the advent of Hadoop 2, different resource managers may be used to provide an even greater level of sophistication and control than previously possible. Competitors, replacements, successors, and mutations of the Hadoop technologies and architectures abound. These include Apache Flink, Apache Spark, and many others. The “death of Hadoop” has been announced many times by software experts and commentators.
We have to face the question squarely: is Hadoop dead? It depends on the perceived boundaries of Hadoop itself. Do we consider Apache Spark, the in-memory successor to Hadoop’s batch file approach, a part of the Hadoop family simply because it also uses HDFS, the Hadoop file system? Many other examples of “gray areas” exist in which newer technologies replace or enhance the original “Hadoop classic” features. Distributed computing is a moving target, and the boundaries of Hadoop and its ecosystem have changed remarkably over a few short years. In this book, we attempt to show some of the diverse and dynamic aspects of Hadoop and its associated ecosystem, and to convince you that, although changing, Hadoop is still very much alive, relevant to current software development, and particularly interesting to data analytics programmers.