This brick provides a tool to handle missing values in your data.
It is common for a data set to have missing values. These cases should be handled so that they do not cause issues or errors during training (e.g., some models do not support empty elements, and in others missing values degrade performance).
There are several ways to handle missing values: fill the "cells" with a value computed by a function, fill them with a constant that you specify, delete the rows that contain missing values, or delete an entire column. It is highly advisable to analyze your subject area (or to contact an expert if possible) and to assess each column's information value before choosing.
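The four approaches above can be sketched in plain Python. This is only an illustration of the ideas, not the brick's implementation; the sample rows and the column names (age, city, notes) are hypothetical, and None stands for a missing value.

```python
from statistics import mean

# Hypothetical sample rows; None marks a missing value.
rows = [
    {"age": 25.0, "city": "Kyiv", "notes": None},
    {"age": None, "city": None, "notes": None},
    {"age": 40.0, "city": "Lviv", "notes": None},
    {"age": 31.0, "city": "Kyiv", "notes": "vip"},
]

# 1. Fill with a function value: here, the mean of the known ages.
known_ages = [r["age"] for r in rows if r["age"] is not None]
age_fill = mean(known_ages)
filled_by_function = [
    {**r, "age": age_fill if r["age"] is None else r["age"]} for r in rows
]

# 2. Fill with a constant chosen by the user.
filled_by_constant = [
    {**r, "city": "unknown" if r["city"] is None else r["city"]} for r in rows
]

# 3. Delete rows that have a missing value in the chosen column.
rows_dropped = [r for r in rows if r["age"] is not None]

# 4. Delete an entire column (useful when it is mostly empty).
column_dropped = [{k: v for k, v in r.items() if k != "notes"} for r in rows]
```

Filling keeps every row but introduces estimated values, while deleting rows or columns keeps only observed values at the cost of losing data; which trade-off is acceptable depends on the subject area.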
There are some recommendations to consider while choosing the method to handle missing values:
If you are not sure which method to choose, you can start from the brick's auto-suggestions (they are also shown when you first open the brick's settings) and then adjust them.
Bricks → Analytics → Features Engineering → Missing Values Treatment
Bricks → Use Cases → Credit Scoring → Features Engineering → Missing Values Treatment
Bricks → Use Cases → Demand Forecasting → Data Processing → Missing Values Treatment
Treatment
In this parameter you can specify one of the treatment types:
Value
Enabled only if the treatment type is either 'Fill with custom value' or 'Fill with function'.
If the treatment is 'Fill with function', you need to choose one of these functions:
With 'Fill with custom value', you need to specify the value yourself.
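As a rough sketch of how the Treatment and Value parameters could combine into a single fill value for one column, consider the helper below. The name resolve_fill_value and the FILL_FUNCTIONS mapping are hypothetical; the function names are taken from the list given in the frozen-run description (min, max, mean, median, mode), and the actual brick may behave differently.

```python
from statistics import mean, median, mode

# Assumed mapping from the function name chosen in "Value" to a Python callable.
FILL_FUNCTIONS = {
    "min": min,
    "max": max,
    "mean": mean,
    "median": median,
    "mode": mode,
}


def resolve_fill_value(treatment, value, column_values):
    """Return the value used to fill missing cells (None) in one column.

    `treatment` and `value` mirror the two brick parameters; this helper
    itself is illustrative and not part of the product.
    """
    if treatment == "Fill with custom value":
        # The user-supplied constant is used as is.
        return value
    if treatment == "Fill with function":
        # The function is computed over the non-missing values only.
        present = [v for v in column_values if v is not None]
        return FILL_FUNCTIONS[value](present)
    raise ValueError(f"Treatment {treatment!r} does not use the Value parameter")


median_fill = resolve_fill_value("Fill with function", "median", [3, None, 1, 10])
custom_fill = resolve_fill_value("Fill with custom value", 0, [1, None])
```

Here median_fill is computed from the observed values 3, 1 and 10, while custom_fill is simply the constant that was passed in.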
Brick frozen
This parameter enables a frozen run for this brick. In a frozen run, every column with the 'Fill with function' treatment type (min, max, mean, median, mode) keeps its currently calculated value and reuses it in subsequent runs instead of recalculating it, which may be useful after pipeline deployment.
This option appears only after a successful regular run.
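The idea behind a frozen run can be sketched as follows: the fill value is calculated during a regular run, stored, and then reused unchanged on new data. The FrozenFiller class and its method names are hypothetical and only illustrate the concept; they are not the brick's API.

```python
from statistics import mean


class FrozenFiller:
    """Fills missing values (None) in one column; can be frozen after a run."""

    def __init__(self, func):
        self.func = func          # e.g. min, max, mean, median or mode
        self.saved_value = None   # fill value from the latest regular run
        self.frozen = False

    def freeze(self):
        # Mirrors the rule that freezing is available only after a regular run.
        if self.saved_value is None:
            raise RuntimeError("Run the filler at least once before freezing")
        self.frozen = True

    def run(self, values):
        if not self.frozen:
            # Regular run: recalculate the fill value from the current data.
            present = [v for v in values if v is not None]
            self.saved_value = self.func(present)
        # Frozen run: self.saved_value is reused without recalculation.
        return [self.saved_value if v is None else v for v in values]


filler = FrozenFiller(mean)
training = filler.run([10.0, None, 20.0])   # fill value calculated here
filler.freeze()
production = filler.run([100.0, None])      # saved fill value is reused
```

After freezing, the missing value in the second data set is filled with the mean saved from the first run rather than with a value derived from the new data, which keeps a deployed pipeline consistent with the data it was configured on.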
Inputs
The brick accepts any data set, without restrictions.