With supervised classification, you work with only one table, which includes the outcome feature. In particular, this means that only one line per entity can be analysed or predicted.
Many features can be described in one line: the name of a customer, their age, whether they have already bought a given type of product... That is why, when you press the GET INSIGHTS button, the first model fits only the central table data and provides a first set of insights.
Here is a preview of the central table used in the Step by step articles.
However, a lot of raw data that could easily be linked to the entity to analyse or predict (here, the customer) is discarded because it does not fit the "one line" constraint: for example, all the customer reactions to emails sent by the company.
We see that the first client of the central table, named Lawrence Ross and identified by the KEY 0000168fc71e2e... , has been sent several emails. Each event is listed on its own line, with a short description.
The usual way to integrate information from peripheral tables, such as this one, is to calculate aggregates from their features. An aggregate summarises a feature for each entity to be analysed or predicted: for example, the most frequent value of campaign_label for each client, or the sum of nb_of_days_since_event for each client. Each aggregate adds a feature to the central table, i.e. one value per line, and so one value per customer.
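As an illustration only, the two aggregates mentioned above can be computed by hand with pandas. This is a minimal sketch, not what PredicSis.ai does internally: the column names KEY, campaign_label and nb_of_days_since_event come from the tables described here, while the sample values, the short keys and the names of the new features are made up for the example.

```python
import pandas as pd

# Hypothetical central table: one line per customer.
central = pd.DataFrame({
    "KEY": ["c1", "c2", "c3"],
    "name": ["Customer A", "Customer B", "Customer C"],
})

# Hypothetical peripheral table: one line per email event.
emails = pd.DataFrame({
    "KEY": ["c1", "c1", "c1", "c2", "c2"],
    "campaign_label": ["promo", "newsletter", "promo",
                       "newsletter", "newsletter"],
    "nb_of_days_since_event": [10, 25, 40, 5, 60],
})


def most_frequent(values: pd.Series):
    """Return the most frequent value (first in sorted order on ties)."""
    return values.mode().iloc[0]


# Summarise the peripheral table: one line per customer KEY.
aggregates = (
    emails.groupby("KEY")
    .agg(
        top_campaign_label=("campaign_label", most_frequent),
        sum_days_since_event=("nb_of_days_since_event", "sum"),
    )
    .reset_index()
)

# Add the aggregates to the central table as new features.
# Customers with no email event get missing values.
enriched = central.merge(aggregates, on="KEY", how="left")
print(enriched)
```

The left join keeps every customer of the central table, so the result still respects the "one line per entity" constraint; a customer without any event in the peripheral table simply has missing values for the new features.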
PredicSis.ai automates the discovery of relevant aggregates. The user only needs to request a NUMBER of smart aggregates, then watch the performance of the successive models increase.
Please download this data set here (≈95MB).
Then start the step by step workflow again, this time with several tables.
Suggested articles to read:
2) Get insights