Big data management and analytics / Brij B Gupta, Asia University, Taiwan; Mamta, Thapar Institute of Engineering and Technology, India.

By: Gupta, Brij, 1982- [author.]

Contributor(s): Mamta [author.]

Language: English
Series: World Scientific series on future computing paradigms and applications ; vol. 1
Publisher: New Jersey : World Scientific, [2024]
Description: xxix, 257 pages ; 24 cm
Content type: text
Media type: unmediated
Carrier type: volume
ISBN: 9789811257117
Subject(s): Big data | Database management | Data mining
DDC classification: 005.7
LOC classification: QA76.9.B45 | G86 2024

Summary: "With the proliferation of information, big data management and analysis have become an indispensable part of any system to handle such amounts of data. The amount of data generated by the multitude of interconnected devices increases exponentially, making the storage and processing of these data a real challenge. Big data management and analytics have gained momentum in almost every industry, ranging from finance or healthcare. Big data can reveal key insights if handled and analyzed properly; it has great application potential to improve the working of any industry. This book covers the spectrum aspects of big data; from the preliminary level to specific case studies. It will help readers gain knowledge of the big data landscape. Highlights of the topics covered include description of the Big Data ecosystem; real-world instances of big data issues; how the Vs of Big Data (volume, velocity, variety, veracity, valence, and value) affect data collection, monitoring, storage, analysis, and reporting; structural process to get value out of Big Data and recognize the differences between a standard database management system and a big data management system. Readers will gain insights into choice of data models, data extraction, data integration to solve large data problems, data modelling using machine learning techniques, Spark's scalable machine learning techniques, modeling a big data problem into a graph database and performing scalable analytical operations over the graph and different tools and techniques for processing big data and its applications including in healthcare and finance"-- Provided by publisher.

Holdings ( 1 )
Item type	Current location	Home library	Call number	Status	Date due	Barcode	Item holds
BOOK	COLLEGE LIBRARY	COLLEGE LIBRARY SUBJECT REFERENCE	005.7 G9593 2024	Available		CITU-CL-54271

Total holds: 0

Includes bibliographical references and index.

Contents
List of Figures
List of Tables
Chapter 1 Introduction to Big Data
1.1 Data: The New Oil and the New Soil
1.2 What is Big Data and What are its Sources
1.2.1 Big Data Generated by Machines
1.2.2 Big Data Generated by Humans
1.2.3 Big Data Generated by Organizations
1.3 Characteristics of Big Data
1.3.1 Volume
1.3.2 Velocity
1.3.3 Variety
1.3.4 Veracity
1.3.5 Valence
1.3.6 Value
1.4 Importance of Big Data: Popular Use Cases
1.5 Chapter Summary
References
Chapter 2 Big Data Management and Modeling
2.1 Big Data Management
2.1.1 Data Acquisition/Ingestion
2.1.2 Data Storage
2.1.3 Data Quality
2.1.4 Data Operations
2.1.5 Data Scalability
2.1.6 Data Security
2.2 Challenges in Big Data Management: Case Study
2.3 Big Data Modeling
2.3.1 Data Model Structures
2.3.2 Data Model Operations
2.3.3 Data Model Constraints
2.4 Types of Data Models
2.4.1 Relational Data Model
2.4.2 Semi-Structured Data Model
2.4.3 Unstructured Data Model: Vector Space Data Model
2.4.4 Graph Data Model
2.5 Chapter Summary
References
Chapter 3 Big Data Processing
3.1 Requirements for Big Data Processing
3.2 Big Data Retrieval
3.2.1 Relational Data Query
3.2.2 JSON Data Query Using MongoDB and Aerospike
3.3 Big Data Integration
3.3.1 Big Data Integration Problems
3.4 Big Data Processing Pipeline
3.4.1 Data Transformation Operations in Big Data Processing Pipeline
3.5 Big Data Management and Processing Using Splunk and Datameer
3.5.1 Splunk
3.5.2 Datameer
3.6 Chapter Summary
References
Chapter 4 Big Data Analytics and Machine Learning
4.1 Introduction to Machine Learning
4.1.1 Machine Learning Techniques
4.2 Machine Learning Process
4.2.1 Acquire
4.2.2 Prepare
4.2.3 Analyze
4.2.4 Evaluation of Machine Learning Models
4.3 Scaling Up Machine Learning Algorithms
4.4 Chapter Summary
References
Chapter 5 Big Data Analytics Through Visualization
5.1 Graph Definition
5.1.1 Examples of Graph Analytics for Big Data
5.2 Graph Analytics from the Perspective of Big Data
5.3 Techniques for Graph Analytics
5.3.1 Basic Definitions
5.3.2 Path Analytics
5.3.3 Connectivity Analytics
5.3.4 Community Analytics
5.3.5 Centrality Analytics
5.4 Large-Scale Graph Processing
5.4.1 Parallel Programming Model for Graphs
5.5 Chapter Summary
References
Chapter 6 Taming Big Data with Spark 2.0
6.1 Introduction to Spark 2.0
6.1.1 Why Spark 2.0 Replaced Hadoop
6.2 Resilient Distributed Datasets
6.3 Spark 2.0
6.3.1 Language Processing with Spark 2.0
6.3.2 Analysis of Streaming Data with Spark 2.0
6.3.3 Streaming API
6.3.4 Kafka
6.3.5 Apache Spark Streaming
6.4 Spark Machine Learning Library
6.5 Chapter Summary
References
Chapter 7 Managing Big Data in Cloud Storage
7.1 Large-Scale Data Storage
7.1.1 Challenges of Storing Large Data in Distributed Systems
7.2 Hadoop Distributed File System (HDFS)
7.2.1 HDFS Permission Checks
7.2.2 HDFS Shell Commands
7.2.3 Chaining and Scripting HDFS Commands
7.2.4 Loading Data on HDFS
7.3 Hadoop User Experience (HUE)
7.3.1 Features of HUE
7.3.2 HUE Components
7.4 Chapter Summary
References
Chapter 8 Big Data in Healthcare
8.1 Digitalization in Healthcare Sector
8.1.1 Use of Big Data in Medical Care
8.2 Big Data in Public Health
8.2.1 Big Data Surveillance Using Machine Learning
8.2.2 Big Data in Public Health Training
8.2.3 Limitations and Open Issues for Big Data While Using Machine Learning in Public Health
8.3 The Four V’s of Big Data in Healthcare
8.4 Big Data in Genomics
8.5 Architectural Framework
8.5.1 Methodology of Big Data Analytics in Healthcare
8.5.2 Advantages of Big Data Analytics to Healthcare
8.5.3 Challenges of Big Data in Healthcare
8.6 Chapter Summary
References
Chapter 9 Big Data in Finance
9.1 Digitalization in Financial Industry
9.2 Sources of Financial Data
9.3 Challenges of Using Big Data in Financial Research
9.4 Financial Big Data
9.4.1 FBD Management
9.4.2 FBD Analytics
9.5 Theoretical Framework of Big Data in Financial Services
9.6 Popular Use Cases of FBD Analytics
9.7 Chapter Summary
References
Chapter 10 Enabling Tools and Technologies for Big Data Analytics
10.1 Big Data Management and Modeling Tools
10.1.1 Data Modeling Tools
10.1.2 Vector Data Model with Lucene
10.1.3 Graph Data Model with Gephi
10.1.4 Data Management Tools
10.2 Big Data Integration and Processing Tools
10.2.1 Big Data Processing Using Splunk and Datameer
10.3 Big Data Machine Learning Tools
10.3.1 KNIME
10.3.2 Spark MLlib
10.4 Big Data Graph Analytics Tools
10.4.1 Giraph
10.4.2 GraphX
10.4.3 Neo4j
10.5 Chapter Summary
References
Index
