by Pablo Duboue, PhD
This book is structured into two parts. The first part presents feature engineering ideas and approaches that are as domain independent as feature engineering can be. The second part exemplifies different techniques in key domains through case studies.
Preorder on Amazon; expected availability: September 2020.
Summarizes dozens of blogs, videos, and forum posts in one place, under a unified view and nomenclature. The book references more than 300 sources.
Helps the practitioner obtain better end-to-end performance than just tuning model parameters.
Helps the practitioner work with sets, lists, trees and graphs, which are traditionally problematic for statistical machine learning.
Practitioners working on new domains can study solutions in other domains to help build new ones on their own. Note that each domain uses a different language, and the book bridges these interdisciplinary barriers.
It helps readers compare techniques across domains as different as text and images. Instructors can reuse the dataset for their class examples.
Readers can look at the code for lower-level details; instructors can extend and adapt it for their own classroom use.
This part focuses on domain-independent techniques and the overall process, where careful data analysis can steer practitioners away from bad assumptions and yield high-performing models.
Topics: machine learning cycle, f-measure, precision, recall, error analysis, feature ideation, feature creation, feature extraction, feature engineering, domain modelling, data preparation
Topics: normalization, binning, outliers, outlier detection, histogram, descriptive statistics, whitening, ZCA whitening, scaling, standardization
Topics: computable features, feature imputation, kernels, target rate encoding, one hot encoding, training expansion, tidy data
Topics: feature selection, feature utility, recursive feature elimination, ablation study, dimensionality reduction, lasso, elasticnet, embeddings, word2vec, non-negative matrix factorization
Topics: variable length feature vector, encoding lists, encoding sets, automated feature engineering, featuretools, deep learning, autoencoders
Tapping into domain expertise helps practitioners avoid known problems in a target domain. This part seeks to learn from well-understood domains to help practitioners tackle new, less understood domains.