How to Use AutoKeras for Classification and Regression

Author: Jason Brownlee

AutoML refers to techniques for automatically discovering the best-performing model for a given dataset.

When applied to neural networks, this involves both discovering the model architecture and the hyperparameters used to train the model, generally referred to as neural architecture search.

AutoKeras is an open-source library for performing AutoML for deep learning models. The search is performed using so-called Keras models via the TensorFlow tf.keras API.

It provides a simple and effective approach for automatically finding top-performing models for a wide range of predictive modeling tasks, including tabular or so-called structured classification and regression datasets.

In this tutorial, you will discover how to use AutoKeras to find good neural network models for classification and regression tasks.

After completing this tutorial, you will know:

  • AutoKeras is an implementation of AutoML for deep learning that uses neural architecture search.
  • How to use AutoKeras to find a top-performing model for a binary classification dataset.
  • How to use AutoKeras to find a top-performing model for a regression dataset.

Let’s get started.

How to Use AutoKeras for Classification and Regression

How to Use AutoKeras for Classification and Regression
Photo by kanu101, some rights reserved.

Tutorial Overview

This tutorial is divided into three parts; they are:

  1. AutoKeras for Deep Learning
  2. AutoKeras for Classification
  3. AutoKeras for Regression

AutoKeras for Deep Learning

Automated Machine Learning, or AutoML for short, refers to automatically finding the best combination of data preparation, model, and model hyperparameters for a predictive modeling problem.

The benefit of AutoML is allowing machine learning practitioners to quickly and effectively address predictive modeling tasks with very little input, e.g. fire and forget.

Automated Machine Learning (AutoML) has become a very important research topic with wide applications of machine learning techniques. The goal of AutoML is to enable people with limited machine learning background knowledge to use machine learning models easily.

Auto-keras: An efficient neural architecture search system, 2019.

AutoKeras is an implementation of AutoML for deep learning models using the Keras API, specifically the tf.keras API provided by TensorFlow 2.

It uses a process of searching through neural network architectures to best address a modeling task, referred to more generally as Neural Architecture Search, or NAS for short.

… we have developed a widely adopted open-source AutoML system based on our proposed method, namely Auto-Keras. It is an open-source AutoML system, which can be downloaded and installed locally.

Auto-keras: An efficient neural architecture search system, 2019.

In the spirit of Keras, AutoKeras provides an easy-to-use interface for different tasks, such as image classification, structured data classification or regression, and more. The user is only required to specify the location of the data and the number of models to try and is returned a model that achieves the best performance (under the configured constraints) on that dataset.

Note: AutoKeras provides a TensorFlow 2 Keras model (e.g. tf.keras) and not a Standalone Keras model. As such, the library assumes that you have Python 3 and TensorFlow 2.1 or higher installed.

To install AutoKeras, you can use Pip, as follows:

sudo pip install autokeras

You can confirm the installation was successful and check the version number as follows:

sudo pip show autokeras

You should see output like the following:

Name: autokeras
Version: 1.0.1
Summary: AutoML for deep learning
Home-page: http://autokeras.com
Author: Data Analytics at Texas A&M (DATA) Lab, Keras Team
Author-email: jhfjhfj1@gmail.com
License: MIT
Location: ...
Requires: scikit-learn, packaging, pandas, keras-tuner, numpy
Required-by:

Once installed, you can then apply AutoKeras to find a good or great neural network model for your predictive modeling task.

We will take a look at two common examples where you may want to use AutoKeras, classification and regression on tabular data, so-called structured data.

AutoKeras for Classification

AutoKeras can be used to discover a good or great model for classification tasks on tabular data.

Recall tabular data are those datasets composed of rows and columns, such as a table or data as you would see in a spreadsheet.

In this section, we will develop a model for the Sonar classification dataset for classifying sonar returns as rocks or mines. This dataset consists of 208 rows of data with 60 input features and a target class label of 0 (rock) or 1 (mine).

A naive model can achieve a classification accuracy of about 53.4 percent via repeated 10-fold cross-validation, which provides a lower-bound. A good model can achieve an accuracy of about 88.2 percent, providing an upper-bound.

You can learn more about the dataset here:

No need to download the dataset; we will download it automatically as part of the example.

First, we can download the dataset and split it into a randomly selected train and test set, holding 33 percent for test and using 67 percent for training.

The complete example is listed below.

# load the sonar dataset
from pandas import read_csv
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/sonar.csv'
dataframe = read_csv(url, header=None)
print(dataframe.shape)
# split into input and output elements
data = dataframe.values
X, y = data[:, :-1], data[:, -1]
print(X.shape, y.shape)
# basic data preparation
X = X.astype('float32')
y = LabelEncoder().fit_transform(y)
# separate into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

Running the example first downloads the dataset and summarizes the shape, showing the expected number of rows and columns.

The dataset is then split into input and output elements, then these elements are further split into train and test datasets.

(208, 61)
(208, 60) (208,)
(139, 60) (69, 60) (139,) (69,)

We can use AutoKeras to automatically discover an effective neural network model for this dataset.

This can be achieved by using the StructuredDataClassifier class and specifying the number of models to search. This defines the search to perform.

...
# define the search
search = StructuredDataClassifier(max_trials=15)

We can then execute the search using our loaded dataset.

...
# perform the search
search.fit(x=X_train, y=y_train, verbose=0)

This may take a few minutes and will report the progress of the search.

Next, we can evaluate the model on the test dataset to see how it performs on new data.

...
# evaluate the model
loss, acc = search.evaluate(X_test, y_test, verbose=0)
print('Accuracy: %.3f' % acc)

We then use the model to make a prediction for a new row of data.

...
# use the model to make a prediction
row = [0.0200,0.0371,0.0428,0.0207,0.0954,0.0986,0.1539,0.1601,0.3109,0.2111,0.1609,0.1582,0.2238,0.0645,0.0660,0.2273,0.3100,0.2999,0.5078,0.4797,0.5783,0.5071,0.4328,0.5550,0.6711,0.6415,0.7104,0.8080,0.6791,0.3857,0.1307,0.2604,0.5121,0.7547,0.8537,0.8507,0.6692,0.6097,0.4943,0.2744,0.0510,0.2834,0.2825,0.4256,0.2641,0.1386,0.1051,0.1343,0.0383,0.0324,0.0232,0.0027,0.0065,0.0159,0.0072,0.0167,0.0180,0.0084,0.0090,0.0032]
X_new = asarray([row]).astype('float32')
yhat = search.predict(X_new)
print('Predicted: %.3f' % yhat[0])

We can retrieve the final model, which is an instance of a TensorFlow Keras model.

...
# get the best performing model
model = search.export_model()

We can then summarize the structure of the model to see what was selected.

...
# summarize the loaded model
model.summary()

Finally, we can save the model to file for later use, which can be loaded using the TensorFlow load_model() function.

...
# save the best performing model to file
model.save('model_sonar.h5')

Tying this together, the complete example of applying AutoKeras to find an effective neural network model for the Sonar dataset is listed below.

# use autokeras to find a model for the sonar dataset
from numpy import asarray
from pandas import read_csv
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from autokeras import StructuredDataClassifier
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/sonar.csv'
dataframe = read_csv(url, header=None)
print(dataframe.shape)
# split into input and output elements
data = dataframe.values
X, y = data[:, :-1], data[:, -1]
print(X.shape, y.shape)
# basic data preparation
X = X.astype('float32')
y = LabelEncoder().fit_transform(y)
# separate into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
# define the search
search = StructuredDataClassifier(max_trials=15)
# perform the search
search.fit(x=X_train, y=y_train, verbose=0)
# evaluate the model
loss, acc = search.evaluate(X_test, y_test, verbose=0)
print('Accuracy: %.3f' % acc)
# use the model to make a prediction
row = [0.0200,0.0371,0.0428,0.0207,0.0954,0.0986,0.1539,0.1601,0.3109,0.2111,0.1609,0.1582,0.2238,0.0645,0.0660,0.2273,0.3100,0.2999,0.5078,0.4797,0.5783,0.5071,0.4328,0.5550,0.6711,0.6415,0.7104,0.8080,0.6791,0.3857,0.1307,0.2604,0.5121,0.7547,0.8537,0.8507,0.6692,0.6097,0.4943,0.2744,0.0510,0.2834,0.2825,0.4256,0.2641,0.1386,0.1051,0.1343,0.0383,0.0324,0.0232,0.0027,0.0065,0.0159,0.0072,0.0167,0.0180,0.0084,0.0090,0.0032]
X_new = asarray([row]).astype('float32')
yhat = search.predict(X_new)
print('Predicted: %.3f' % yhat[0])
# get the best performing model
model = search.export_model()
# summarize the loaded model
model.summary()
# save the best performing model to file
model.save('model_sonar.h5')

Running the example will report a lot of debug information about the progress of the search.

The models and results are all saved in a folder called “structured_data_classifier” in your current working directory.

...
[Trial complete]
[Trial summary]
 |-Trial ID: e8265ad768619fc3b69a85b026f70db6
 |-Score: 0.9259259104728699
 |-Best step: 0
 > Hyperparameters:
 |-classification_head_1/dropout_rate: 0
 |-optimizer: adam
 |-structured_data_block_1/dense_block_1/dropout_rate: 0.0
 |-structured_data_block_1/dense_block_1/num_layers: 2
 |-structured_data_block_1/dense_block_1/units_0: 32
 |-structured_data_block_1/dense_block_1/units_1: 16
 |-structured_data_block_1/dense_block_1/units_2: 512
 |-structured_data_block_1/dense_block_1/use_batchnorm: False
 |-structured_data_block_1/dense_block_2/dropout_rate: 0.25
 |-structured_data_block_1/dense_block_2/num_layers: 3
 |-structured_data_block_1/dense_block_2/units_0: 32
 |-structured_data_block_1/dense_block_2/units_1: 16
 |-structured_data_block_1/dense_block_2/units_2: 16
 |-structured_data_block_1/dense_block_2/use_batchnorm: False

The best-performing model is then evaluated on the hold-out test dataset.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

In this case, we can see that the model achieved a classification accuracy of about 82.6 percent.

Accuracy: 0.826

Next, the architecture of the best-performing model is reported.

We can see a model with two hidden layers with dropout and ReLU activation.

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 60)]              0
_________________________________________________________________
categorical_encoding (Catego (None, 60)                0
_________________________________________________________________
dense (Dense)                (None, 256)               15616
_________________________________________________________________
re_lu (ReLU)                 (None, 256)               0
_________________________________________________________________
dropout (Dropout)            (None, 256)               0
_________________________________________________________________
dense_1 (Dense)              (None, 512)               131584
_________________________________________________________________
re_lu_1 (ReLU)               (None, 512)               0
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 513
_________________________________________________________________
classification_head_1 (Sigmo (None, 1)                 0
=================================================================
Total params: 147,713
Trainable params: 147,713
Non-trainable params: 0
_________________________________________________________________

AutoKeras for Regression

AutoKeras can also be used for regression tasks, that is, predictive modeling problems where a numeric value is predicted.

We will use the auto insurance dataset that involves predicting the total payment from claims given the total number of claims. The dataset has 63 rows and one input and one output variable.

A naive model can achieve a mean absolute error (MAE) of about 66 using repeated 10-fold cross-validation, providing a lower-bound on expected performance. A good model can achieve a MAE of about 28, providing a performance upper-bound.

You can learn more about this dataset here:

We can load the dataset and split it into input and output elements and then train and test datasets.

The complete example is listed below.

# load the sonar dataset
from pandas import read_csv
from sklearn.model_selection import train_test_split
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/auto-insurance.csv'
dataframe = read_csv(url, header=None)
print(dataframe.shape)
# split into input and output elements
data = dataframe.values
data = data.astype('float32')
X, y = data[:, :-1], data[:, -1]
print(X.shape, y.shape)
# separate into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

Running the example loads the dataset, confirming the number of rows and columns, then splits the dataset into train and test sets.

(63, 2)
(63, 1) (63,)
(42, 1) (21, 1) (42,) (21,)

AutoKeras can be applied to a regression task using the StructuredDataRegressor class and configured for the number of models to trial.

...
# define the search
search = StructuredDataRegressor(max_trials=15, loss='mean_absolute_error')

The search can then be run and the best model saved, much like in the classification case.

...
# define the search
search = StructuredDataRegressor(max_trials=15, loss='mean_absolute_error')
# perform the search
search.fit(x=X_train, y=y_train, verbose=0)

We can then use the best-performing model and evaluate it on the hold out dataset, make a prediction on new data, and summarize its structure.

...
# evaluate the model
mae, _ = search.evaluate(X_test, y_test, verbose=0)
print('MAE: %.3f' % mae)
# use the model to make a prediction
X_new = asarray([[108]]).astype('float32')
yhat = search.predict(X_new)
print('Predicted: %.3f' % yhat[0])
# get the best performing model
model = search.export_model()
# summarize the loaded model
model.summary()
# save the best performing model to file
model.save('model_insurance.h5')

Tying this together, the complete example of using AutoKeras to discover an effective neural network model for the auto insurance dataset is listed below.

# use autokeras to find a model for the insurance dataset
from numpy import asarray
from pandas import read_csv
from sklearn.model_selection import train_test_split
from autokeras import StructuredDataRegressor
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/auto-insurance.csv'
dataframe = read_csv(url, header=None)
print(dataframe.shape)
# split into input and output elements
data = dataframe.values
data = data.astype('float32')
X, y = data[:, :-1], data[:, -1]
print(X.shape, y.shape)
# separate into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
# define the search
search = StructuredDataRegressor(max_trials=15, loss='mean_absolute_error')
# perform the search
search.fit(x=X_train, y=y_train, verbose=0)
# evaluate the model
mae, _ = search.evaluate(X_test, y_test, verbose=0)
print('MAE: %.3f' % mae)
# use the model to make a prediction
X_new = asarray([[108]]).astype('float32')
yhat = search.predict(X_new)
print('Predicted: %.3f' % yhat[0])
# get the best performing model
model = search.export_model()
# summarize the loaded model
model.summary()
# save the best performing model to file
model.save('model_insurance.h5')

Running the example will report a lot of debug information about the progress of the search.

The models and results are all saved in a folder called “structured_data_regressor” in your current working directory.

...
[Trial summary]
|-Trial ID: ea28b767d13e958c3ace7e54e7cb5a14
|-Score: 108.62509155273438
|-Best step: 0
> Hyperparameters:
|-optimizer: adam
|-regression_head_1/dropout_rate: 0
|-structured_data_block_1/dense_block_1/dropout_rate: 0.0
|-structured_data_block_1/dense_block_1/num_layers: 2
|-structured_data_block_1/dense_block_1/units_0: 16
|-structured_data_block_1/dense_block_1/units_1: 1024
|-structured_data_block_1/dense_block_1/units_2: 128
|-structured_data_block_1/dense_block_1/use_batchnorm: True
|-structured_data_block_1/dense_block_2/dropout_rate: 0.5
|-structured_data_block_1/dense_block_2/num_layers: 2
|-structured_data_block_1/dense_block_2/units_0: 256
|-structured_data_block_1/dense_block_2/units_1: 64
|-structured_data_block_1/dense_block_2/units_2: 1024
|-structured_data_block_1/dense_block_2/use_batchnorm: True

The best-performing model is then evaluated on the hold-out test dataset.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

In this case, we can see that the model achieved a MAE of about 24.

MAE: 24.916

Next, the architecture of the best-performing model is reported.

We can see a model with two hidden layers with ReLU activation.

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 1)]               0
_________________________________________________________________
categorical_encoding (Catego (None, 1)                 0
_________________________________________________________________
dense (Dense)                (None, 64)                128
_________________________________________________________________
re_lu (ReLU)                 (None, 64)                0
_________________________________________________________________
dense_1 (Dense)              (None, 512)               33280
_________________________________________________________________
re_lu_1 (ReLU)               (None, 512)               0
_________________________________________________________________
dense_2 (Dense)              (None, 128)               65664
_________________________________________________________________
re_lu_2 (ReLU)               (None, 128)               0
_________________________________________________________________
regression_head_1 (Dense)    (None, 1)                 129
=================================================================
Total params: 99,201
Trainable params: 99,201
Non-trainable params: 0
_________________________________________________________________

Further Reading

This section provides more resources on the topic if you are looking to go deeper.

Summary

In this tutorial, you discovered how to use AutoKeras to find good neural network models for classification and regression tasks.

Specifically, you learned:

  • AutoKeras is an implementation of AutoML for deep learning that uses neural architecture search.
  • How to use AutoKeras to find a top-performing model for a binary classification dataset.
  • How to use AutoKeras to find a top-performing model for a regression dataset.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

The post How to Use AutoKeras for Classification and Regression appeared first on Machine Learning Mastery.

Go to Source