Advanced Information and Knowledge Processing
The amount of data collected and generated has exploded in
recent times with the widespread automation of various day-to-day activities,
advances in high-level scientific and engineering research and the development
of efficient data collection tools. This has given rise to the need for automatically
analyzing the data in order to extract knowledge from it, thereby making
the data potentially more useful.
Knowledge discovery and data mining (KDD) is the process of identifying
valid, novel, potentially useful and ultimately understandable patterns from
massive data repositories. It is a multi-disciplinary topic, drawing from several
fields including expert systems, machine learning, intelligent databases,
knowledge acquisition, case-based reasoning, pattern recognition and statistics.
Data mining systems have typically evolved around well-organized
database systems (e.g., relational databases) containing relevant information.
But, more and more, one finds relevant information hidden in unstructured
text and in other complex forms. Mining in the domains of the World Wide
Web, bioinformatics, geoscientific data, and spatial and temporal applications
provides some illustrative examples in this regard. Discovery of knowledge,
or potentially useful patterns, from such complex data often requires the application
of advanced techniques that are better able to exploit the nature
and representation of the data. Such advanced methods include, among others,
graph-based and tree-based approaches to relational learning, sequence
mining, link-based classification, Bayesian networks, hidden Markov models,
neural networks, kernel-based methods, evolutionary algorithms, rough sets
and fuzzy logic, and hybrid systems. Many of these methods are developed in
the following chapters.