The Art of Feature Engineering

Essentials for Machine Learning

by Pablo Duboue, PhD


Pablo Duboue is passionate about improving society through technology. He has a Ph.D. in Computer Science from Columbia University and splits his time between teaching machine learning, doing open research, contributing to free software, and consulting for start-ups.

Order on Amazon Download on Cambridge Core

The Author

Pablo Duboue, PhD
Director of Textualization Software Ltd.


  • Member of the IBM Watson team that beat the Jeopardy! champions
  • Taught in three different countries
  • Joint research with more than fifty co-authors
  • Best paper award at the Canadian AI conference, industrial track
  • Consulted for a start-up acquired by Intel Corp.

Reasons for writing

The reasons for writing this book were threefold:

First, as part of his work on the IBM Watson Jeopardy! team, he created a custom programming language for feature engineering and saw feature engineering deliver more impact than any of the changes to the underlying models tried by his machine learning colleagues.

Second, as a visiting professor in 2014 teaching machine learning for large datasets, he struggled to find textbooks on the topic from which to prepare his lectures.

And third, as a consultant in the field, he has seen many practitioners leave substantial improvements in model performance on the table by focusing on finding better models rather than improving their features.