Practical Data Analysis

Chapter 2. Working with Data

Building real-world data analytics requires accurate data. In this chapter, we discuss how to obtain, clean, normalize, and transform raw data into a standard format such as Comma-Separated Values (CSV) or JavaScript Object Notation (JSON) using OpenRefine.

In this chapter we will cover:

  • Data sources
    • Open data
    • Text files
    • Excel files
    • SQL databases
    • NoSQL databases
    • Multimedia
    • Web scraping
  • Data scrubbing
    • Statistical methods
    • Text parsing
    • Data transformation
  • Data formats
    • CSV
    • JSON
    • XML
    • YAML
  • Getting started with OpenRefine
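
To make the idea of transforming raw data into a standard format concrete, here is a minimal sketch that uses only the Python standard library rather than OpenRefine. The records, the field names, and the helper functions `clean`, `to_csv`, and `to_json` are hypothetical and chosen purely for illustration. Real data will usually need more careful cleaning rules.

```python
import csv
import io
import json

# Hypothetical raw records; names and values are illustrative only.
raw_records = [
    {"name": "  Alice ", "city": "london"},
    {"name": "Bob", "city": " Paris"},
]


def clean(record):
    """Strip surrounding whitespace and title-case every text field."""
    return {key: value.strip().title() for key, value in record.items()}


def to_csv(records):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(records[0].keys()),
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json(records):
    """Serialize a list of dicts to indented JSON text."""
    return json.dumps(records, indent=2)


cleaned = [clean(record) for record in raw_records]
csv_text = to_csv(cleaned)
json_text = to_json(cleaned)

print(csv_text)
print(json_text)
```

The same cleaned records feed both serializers, so the cleaning step stays independent of the output format.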