
Data pre-processing
In this step, we apply some conversions to our data to make it consistent and ready for learning. There are many different conversions that you can consider while pre-processing your data:
- Renaming (relabeling): This means converting categorical values to numbers, as categorical values can cause problems with some learning methods. Note that the numbers will also impose an order between the values, which the original categories may not have
- Rescaling (normalization): Transforming/bounding continuous values to some range, typically [-1, 1] or [0, 1]
- New features: Making up new features from the existing ones. For example, obesity-factor = weight/height
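
The three conversions above can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the toy records, the column names (`gender`, `weight`, `height`), and the helper functions `relabel` and `rescale` are hypothetical and not taken from any particular library. Rescaling here uses simple min-max scaling, which is one common way to bound values to a range.

```python
# Toy records; the column names and values are made up for illustration.
records = [
    {"gender": "female", "weight": 60.0, "height": 1.60},
    {"gender": "male", "weight": 80.0, "height": 1.80},
    {"gender": "female", "weight": 100.0, "height": 2.00},
]


def relabel(values):
    """Map each distinct category to an integer code.

    Categories are sorted first so the mapping is deterministic.
    Returns the encoded values and the category-to-code mapping.
    """
    codes = {category: code for code, category in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes


def rescale(values, low=0.0, high=1.0):
    """Min-max rescale values linearly into the range [low, high]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant column carries no spread; map everything to `low`.
        return [low for _ in values]
    span = v_max - v_min
    return [low + (v - v_min) * (high - low) / span for v in values]


# Renaming (relabeling): categorical values become integer codes.
gender_codes, gender_mapping = relabel([r["gender"] for r in records])

# Rescaling (normalization): weights are bounded to [0, 1].
weight_scaled = rescale([r["weight"] for r in records])

# New features: derive obesity-factor = weight / height for each record.
obesity_factor = [r["weight"] / r["height"] for r in records]
```

Because the integer codes produced by `relabel` are ordered, a learning method may treat one category as "greater" than another, so this encoding is best suited to methods that are not sensitive to that artificial order.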