Dataset description and preprocessing