Session Title: Data Science – Feature Engineering
Speaker: Neil Hambly

Abstract: Data is the lifeblood of analytics; without it we have nothing. In this session we focus on a core aspect of the Team Data Science Process (TDSP): feature engineering. Feature engineering involves aggregating and transforming data variables to create the features used in the analysis. We need to understand how features relate to each other and then select the most suitable algorithm for them. Feature selection is where we define a subset of features to reduce the dimensionality of the model's training data and improve the performance and cost of the ML algorithms. One way to understand these features better is to perform a Principal Component Analysis (PCA). For this important step, we walk through an example in a Jupyter (Python) notebook to identify the features of a dataset that convey important information about its structure.
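
To give a flavour of the PCA step described in the abstract, here is a minimal sketch in Python. It is not the notebook from the session: the synthetic dataset, the helper name `pca`, and the choice to standardize features and use NumPy's SVD are all assumptions made for illustration.

```python
import numpy as np


def pca(X, n_components):
    """Hypothetical helper: PCA via SVD on standardized features.

    Returns (scores, components, explained_variance_ratio).
    """
    X = np.asarray(X, dtype=float)

    # Standardize each feature so that scale differences do not dominate.
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant columns unscaled to avoid dividing by zero
    Z = (X - mean) / std

    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)

    # Variance captured along each direction, as a share of the total.
    explained_variance = S**2 / (Z.shape[0] - 1)
    ratio = explained_variance / explained_variance.sum()

    components = Vt[:n_components]
    scores = Z @ components.T  # data projected onto the kept components
    return scores, components, ratio[:n_components]


# Synthetic data (an assumption for illustration): two strongly correlated
# features plus one independent feature.
rng = np.random.default_rng(0)
base = rng.normal(size=200)
X = np.column_stack([
    base + 0.05 * rng.normal(size=200),
    2 * base + 0.05 * rng.normal(size=200),
    rng.normal(size=200),
])

scores, components, ratio = pca(X, n_components=2)
print("explained variance ratio:", ratio)
print("loadings (rows = components, columns = features):")
print(components)
```

In this sketch the loadings in `components` indicate how much each original feature contributes to each principal component, and `ratio` shows how much of the total variance each component carries. Because the first two features are nearly redundant, two components should capture almost all of the variance, which is the kind of structure that can inform feature selection.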
