000 10481cam a2200469Ii 4500
999 _c89637
_d89637
005 20250307085601.0
006 m o d
007 cr cnu---unuuu
008 250307b ||||| |||| 00| 0 eng d
015 _aGBB9F3928
_2bnb
020 _a9781119516071
_q(electronic book)
020 _a1119516072
_q(electronic book)
020 _a9781119515982
_q(electronic book)
020 _a111951598X
_q(electronic book)
020 _a9781119516057
_q(electronic book)
020 _a1119516056
_q(electronic book)
020 _a9781119516040
_q(hardback)
020 _a1119516048
_q(hardback)
028 0 2 _aEB00780221
_bRecorded Books
040 _aRECBK
_beng
_epn
_erda
_cRECBK
_dTEFOD
_dUKMGB
_dN$T
_dOCLCF
_dYDXIT
_dYDX
_dOCLCQ
_dRDF
_dUKAHL
_dEBLCP
041 _aeng
050 4 _aQA76.9.D343
_bK35 2020
082 0 4 _a006.3/12
_223
100 1 _aKantardzic, Mehmed.
_0http://id.loc.gov/authorities/names/no2003007090.
245 1 0 _aData mining :
_bconcepts, models, methods, and algorithms /
_cMehmed Kantardzic.
250 _aThird edition.
264 1 _a[Piscataway, NJ] :
_bIEEE Press ;
_aHoboken, New Jersey :
_bWiley,
_c[2020]
264 4 _c2020.
300 _a1 online resource (xix, 639 pages)
336 _atext
_btxt
_2rdacontent.
337 _acomputer
_bc
_2rdamedia.
338 _aonline resource
_bcr
_2rdacarrier.
340 _2rdacc
_0http://rdaregistry.info/termList/RDAColourContent/1003.
504 _aIncludes bibliographical references and index.
505 0 _aPreface -- Preface to the Second Edition -- Preface to the First Edition -- 1 Data-Mining Concepts -- 1.1 Introduction -- 1.2 Data-Mining Roots -- 1.3 Data-Mining Process -- 1.4 From Data Collection to Data Preprocessing -- 1.5 Data Warehouses for Data Mining -- 1.6 From Big Data to Data Science -- 1.7 Business Aspects of Data Mining: Why a Data-Mining Project Fails? -- 1.8 Organization of This Book -- 1.9 Review Questions and Problems -- 1.10 References for Further Study -- 2 Preparing the Data -- 2.1 Representation of Raw Data -- 2.2 Characteristics of Raw Data -- 2.3 Transformation of Raw Data -- 2.4 Missing Data -- 2.5 Time-Dependent Data -- 2.6 Outlier Analysis -- 2.7 Review Questions and Problems -- 2.8 References for Further Study -- 3 Data Reduction -- 3.1 Dimensions of Large Data Sets -- 3.2 Features Reduction -- 3.3 Relief Algorithm -- 3.4 Entropy Measure for Ranking Features -- 3.5 Principal Component Analysis -- 3.6 Value Reduction -- 3.7 Feature Discretization: ChiMerge Technique -- 3.8 Case Reduction -- 3.9 Review Questions and Problems -- 3.10 References for Further Study -- 4 Learning from Data -- 4.1 Learning Machine -- 4.2 Statistical Learning Theory -- 4.3 Types of Learning Methods -- 4.4 Common Learning Tasks -- 4.5 Support Vector Machines -- 4.6 Semi-Supervised Support Vector Machines (S3VM) -- 4.7 kNN: Nearest Neighbor Classifier -- 4.8 Model Selection vs. Generalization -- 4.9 Model Estimation -- 4.10 Imbalanced Data Classification -- 4.11 90% Accuracy … Now What? -- 4.12 Review Questions and Problems -- 4.13 References for Further Study -- 5 Statistical Methods -- 5.1 Statistical Inference -- 5.2 Assessing Differences in Data Sets -- 5.3 Bayesian Inference -- 5.4 Predictive Regression -- 5.5 Analysis of Variance -- 5.6 Logistic Regression -- 5.7 Log-Linear Models -- 5.8 Linear Discriminant Analysis -- 5.9 Review Questions and Problems -- 5.10 References for Further Study -- 6 Decision Trees and Decision Rules -- 6.1 Decision Trees -- 6.2 C4.5 Algorithm: Generating a Decision Tree -- 6.3 Unknown Attribute Values -- 6.4 Pruning Decision Trees -- 6.5 C4.5 Algorithm: Generating Decision Rules -- 6.6 CART Algorithm and Gini Index -- 6.7 Limitations of Decision Trees and Decision Rules -- 6.8 Review Questions and Problems -- 6.9 References for Further Study -- 7 Artificial Neural Networks -- 7.1 Model of an Artificial Neuron -- 7.2 Architectures of Artificial Neural Networks -- 7.3 Learning Process -- 7.4 Learning Tasks Using ANNs -- 7.5 Multilayer Perceptrons -- 7.6 Competitive Networks and Competitive Learning -- 7.7 Self-Organizing Maps -- 7.8 Deep Learning -- 7.9 Convolutional Neural Networks (CNNs) -- 7.10 Review Questions and Problems -- 7.11 References for Further Study -- 8 Ensemble Learning -- 8.1 Ensemble Learning Methodologies -- 8.2 Combination Schemes for Multiple Learners -- 8.3 Bagging and Boosting -- 8.4 AdaBoost -- 8.5 Review Questions and Problems -- 8.6 References for Further Study -- 9 Cluster Analysis -- 9.1 Clustering Concepts -- 9.2 Similarity Measures -- 9.3 Agglomerative Hierarchical Clustering -- 9.4 Partitional Clustering -- 9.5 Incremental Clustering -- 9.6 DBSCAN Algorithm -- 9.7 BIRCH Algorithm -- 9.8 Clustering Validation -- 9.9 Review Questions and Problems -- 9.10 References for Further Study -- 10 Association Rules -- 10.1 Market-Basket Analysis -- 10.2 Algorithm Apriori -- 10.3 From Frequent Itemsets to Association Rules -- 10.4 Improving the Efficiency of the Apriori Algorithm -- 10.5 Frequent Pattern Growth Method -- 10.6 Associative-Classification Method -- 10.7 Multidimensional Association Rule Mining -- 10.8 Review Questions and Problems -- 10.9 References for Further Study -- 11 Web Mining and Text Mining -- 11.1 Web Mining -- 11.2 Web Content, Structure, and Usage Mining -- 11.3 HITS and LOGSOM Algorithms -- 11.4 Mining Path-Traversal Patterns -- 11.5 PageRank Algorithm -- 11.6 Recommender Systems -- 11.7 Text Mining -- 11.8 Latent Semantic Analysis -- 11.9 Review Questions and Problems -- 11.10 References for Further Study -- 12 Advances in Data Mining -- 12.1 Graph Mining -- 12.2 Temporal Data Mining -- 12.3 Spatial Data Mining -- 12.4 Distributed Data Mining -- 12.5 Correlation Does Not Imply Causality! -- 12.6 Privacy, Security, and Legal Aspects of Data Mining -- 12.7 Cloud Computing Based on Hadoop and Map/Reduce -- 12.8 Reinforcement Learning -- 12.9 Review Questions and Problems -- 12.10 References for Further Study -- 13 Genetic Algorithms -- 13.1 Fundamentals of Genetic Algorithms -- 13.2 Optimization Using Genetic Algorithms -- 13.3 A Simple Illustration of a Genetic Algorithm -- 13.4 Schemata -- 13.5 Traveling Salesman Problem -- 13.6 Machine Learning Using Genetic Algorithms -- 13.7 Genetic Algorithms for Clustering -- 13.8 Review Questions and Problems -- 13.9 References for Further Study -- 14 Fuzzy Sets and Fuzzy Logic -- 14.1 Fuzzy Sets -- 14.2 Fuzzy Set Operations -- 14.3 Extension Principle and Fuzzy Relations -- 14.4 Fuzzy Logic and Fuzzy Inference Systems -- 14.5 Multifactorial Evaluation -- 14.6 Extracting Fuzzy Models from Data -- 14.7 Data Mining and Fuzzy Sets -- 14.8 Review Questions and Problems -- 14.9 References for Further Study -- 15 Visualization Methods -- 15.1 Perception and Visualization -- 15.2 Scientific Visualization and Information Visualization -- 15.3 Parallel Coordinates -- 15.4 Radial Visualization -- 15.5 Visualization Using Self-Organizing Maps -- 15.6 Visualization Systems for Data Mining -- 15.7 Review Questions and Problems -- 15.8 References for Further Study -- Appendix A: Information on Data Mining -- A.1 Data-Mining Journals -- A.2 Data-Mining Conferences -- A.3 Data-Mining Forums/Blogs -- A.4 Data Sets -- A.5 Commercially and Publicly Available Tools -- A.6 Web Site Links -- Appendix B: Data-Mining Applications -- B.1 Data Mining for Financial Data Analyses -- B.2 Data Mining for the Telecommunication Industry -- B.3 Data Mining for the Retail Industry -- B.4 Data Mining in Healthcare and Biomedical Research -- B.5 Data Mining in Science and Engineering -- B.6 Pitfalls of Data Mining -- Bibliography -- Index.
520 _aPresents the latest techniques for analyzing and extracting information from large amounts of data in high-dimensional data spaces. The revised and updated third edition of Data Mining contains in one volume an introduction to a systematic approach to the analysis of large data sets that integrates results from disciplines such as statistics, artificial intelligence, databases, pattern recognition, and computer visualization. Advances in deep learning technology have opened an entirely new spectrum of applications. The author, a noted expert on the topic, explains the basic concepts, models, and methodologies that have been developed in recent years. This new edition introduces and expands on many topics, and provides revised sections on software tools and data mining applications. Additional changes include an updated list of references for further study and an extended list of problems and questions that relate to each chapter. This third edition presents new and expanded information that: explores big data and cloud computing; examines deep learning; includes information on convolutional neural networks (CNN); offers reinforcement learning; contains semi-supervised learning and S3VM; and reviews model evaluation for unbalanced data. Written for graduate students in computer science, computer engineers, and computer information systems professionals, the updated third edition of Data Mining continues to provide an essential guide to the basic principles of the technology and the most recent developments in the field.
545 0 _aMehmed Kantardzic, PhD, is a Professor in the Department of Computer Engineering and Computer Science (CECS) at the University of Louisville, and is Director of the Data Mining Lab and CECS Graduate Programs. He is a member of IEEE, ISCA, KAS, WSEAS, IEE, and SPIE.
650 0 _aData mining.
_0http://id.loc.gov/authorities/subjects/sh97002073.
655 4 _aElectronic books.
856 _uhttps://onlinelibrary.wiley.com/doi/book/10.1002/9781119516057
_yFull text is available at Wiley Online Library. Click here to view.
942 _2ddc
_cER