Book Introduction


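Big Data Analytics with R (大数据分析 R语言实现)
  • Author: Simon Walkowiak (UK)
  • Publisher: Nanjing: Southeast University Press
  • ISBN: 9787564173616
  • Publication year: 2017
  • Stated page count: 490 pages
  • File size: 61 MB
  • File page count: 503 pages
  • Subject terms: Programming languages - Programming - English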


Table of Contents

Preface

Chapter 1: The Era of Big Data

Big Data - The monster re-defined

Big Data toolbox - dealing with the giant

Hadoop - the elephant in the room

Databases

Hadoop Spark-ed up

R - The unsung Big Data hero

Summary

Chapter 2: Introduction to R Programming Language and Statistical Environment

Learning R

Revisiting R basics

Getting R and RStudio ready

Setting the URLs to R repositories

R data structures

Vectors

Scalars

Matrices

Arrays

Data frames

Lists

Exporting R data objects

Applied data science with R

Importing data from different formats

Exploratory Data Analysis

Data aggregations and contingency tables

Hypothesis testing and statistical inference

Tests of differences

Independent t-test example (with power and effect size estimates)

ANOVA example

Tests of relationships

An example of Pearson's r correlations

Multiple regression example

Data visualization packages

Summary

Chapter 3: Unleashing the Power of R from Within

Traditional limitations of R

Out-of-memory data

Processing speed

To the memory limits and beyond

Data transformations and aggregations with the ff and ffbase packages

Generalized linear models with the ff and ffbase packages

Logistic regression example with ffbase and biglm

Expanding memory with the bigmemory package

Parallel R

From bigmemory to faster computations

An apply() example with the big.matrix object

A for() loop example with the ffdf object

Using apply() and for() loop examples on a data.frame

A parallel package example

A foreach package example

The future of parallel processing in R

Utilizing Graphics Processing Units with R

Multi-threading with Microsoft R Open distribution

Parallel machine learning with H2O and R

Boosting R performance with the data.table package and other tools

Fast data import and manipulation with the data.table package

Data import with data.table

Lightning-fast subsets and aggregations on data.table

Chaining, more complex aggregations, and pivot tables with data.table

Writing better R code

Summary

Chapter 4: Hadoop and MapReduce Framework for R

Hadoop architecture

Hadoop Distributed File System

MapReduce framework

A simple MapReduce word count example

Other Hadoop native tools

Learning Hadoop

A single-node Hadoop in Cloud

Deploying Hortonworks Sandbox on Azure

A word count example in Hadoop using Java

A word count example in Hadoop using the R language

RStudio Server on a Linux RedHat/CentOS virtual machine

Installing and configuring RHadoop packages

HDFS management and MapReduce in R - a word count example

HDInsight - a multi-node Hadoop cluster on Azure

Creating your first HDInsight cluster

Creating a new Resource Group

Deploying a Virtual Network

Creating a Network Security Group

Setting up and configuring an HDInsight cluster

Starting the cluster and exploring Ambari

Connecting to the HDInsight cluster and installing RStudio Server

Adding a new inbound security rule for port 8787

Editing the Virtual Network's public IP address for the head node

Smart energy meter readings analysis example - using R on HDInsight cluster

Summary

Chapter 5: R with Relational Database Management Systems (RDBMSs)

Relational Database Management Systems (RDBMSs)

A short overview of used RDBMSs

Structured Query Language (SQL)

SQLite with R

Preparing and importing data into a local SQLite database

Connecting to SQLite from RStudio

MariaDB with R on an Amazon EC2 instance

Preparing the EC2 instance and RStudio Server for use

Preparing MariaDB and data for use

Working with MariaDB from RStudio

PostgreSQL with R on Amazon RDS

Launching an Amazon RDS database instance

Preparing and uploading data to Amazon RDS

Remotely querying PostgreSQL on Amazon RDS from RStudio

Summary

Chapter 6: R with Non-Relational (NoSQL) Databases

Introduction to NoSQL databases

Review of leading non-relational databases

MongoDB with R

Introduction to MongoDB

MongoDB data models

Installing MongoDB with R on Amazon EC2

Processing Big Data using MongoDB with R

Importing data into MongoDB and basic MongoDB commands

MongoDB with R using the rmongodb package

MongoDB with R using the RMongo package

MongoDB with R using the mongolite package

HBase with R

Azure HDInsight with HBase and RStudio Server

Importing the data to HDFS and HBase

Reading and querying HBase using the rhbase package

Summary

Chapter 7: Faster than Hadoop - Spark with R

Spark for Big Data analytics

Spark with R on a multi-node HDInsight cluster

Launching HDInsight with Spark and R/RStudio

Reading the data into HDFS and Hive

Getting the data into HDFS

Importing data from HDFS to Hive

Bay Area Bike Share analysis using SparkR

Summary

Chapter 8: Machine Learning Methods for Big Data in R

What is machine learning?

Supervised and unsupervised machine learning methods

Classification and clustering algorithms

Machine learning methods with R

Big Data machine learning tools

GLM example with Spark and R on the HDInsight cluster

Preparing the Spark cluster and reading the data from HDFS

Logistic regression in Spark with R

Naive Bayes with H2O on Hadoop with R

Running an H2O instance on Hadoop with R

Reading and exploring the data in H2O

Naive Bayes on H2O with R

Neural Networks with H2O on Hadoop with R

How do Neural Networks work?

Running Deep Learning models on H2O

Summary

Chapter 9: The Future of R - Big, Fast, and Smart Data

The current state of Big Data analytics with R

Out-of-memory data on a single machine

Faster data processing with R

Hadoop with R

Spark with R

R with databases

Machine learning with R

The future of R

Big Data

Fast data

Smart data

Where to go next

Summary

Index
