Applied Supervised Learning with R

Introduction

R is one of the early programming languages developed for statistical computing and data analysis, with good support for visualization. With the rise of data science, R emerged as a clear choice of programming language among many data science practitioners. Because R is open source and extremely powerful for building sophisticated statistical models, it quickly found adoption in both industry and academia.

Tools and software such as SAS and SPSS were affordable only for large corporations, and traditional programming languages such as C/C++ and Java were not suitable for performing complex data analysis and building models. Hence, there was a need for a much more straightforward, comprehensive, community-driven, cross-platform, and flexible programming language.

Though the Python programming language has become increasingly popular in recent times because of its industry-wide adoption and robust production-grade implementation, R is still the language of choice for quick prototyping of advanced machine learning models. R has one of the largest collections of packages (a package is a collection of functions/methods for accomplishing a complicated procedure that would otherwise require a lot of time and effort to implement). At the time of writing this book, the Comprehensive R Archive Network (CRAN), a network of FTP and web servers around the world that store identical, up-to-date versions of code and documentation for R, has more than 13,000 packages.

While there are numerous books and online resources on learning the fundamentals of R, this chapter limits its scope to the important topics in R programming that are used extensively in many data science projects. We will use a real-world dataset from the UCI Machine Learning Repository to demonstrate the concepts. The material in this chapter will be useful for learners who are new to R programming. The upcoming chapters on supervised learning concepts will borrow many of the implementations from this chapter.