Author: Jason Brownlee
Systematic experimentation is a key part of applied machine learning.
Given their complexity, machine learning methods resist formal analysis. Therefore, we must learn about the behavior of algorithms on our specific problems empirically. We do this using controlled experiments.
In this tutorial, you will discover the important role that controlled experiments play in applied machine learning.
After completing this tutorial, you will know:
- The need for systematic discovery via controlled experiments.
- The need to repeat experiments in order to control for the sources of variance.
- Examples of experiments performed in machine learning and the challenge and opportunity they represent.
Let’s get started.
Tutorial Overview
This tutorial is divided into 3 parts; they are:
- Systematic Experimentation
- Controlling For Variance
- Experiments in Machine Learning
Systematic Experimentation
In applied machine learning, you must become the scientist and perform systematic experiments.
The answers to questions that you care about, such as what algorithm works best on your data or which input features to use, can only be found through the results of experimental trials.
This is due mainly to the fact that machine learning methods are complex and resist formal methods of analysis.
[…] many learning algorithms are too complex for formal analysis, at least at the level of generality assumed by most theoretical treatments. As a result, empirical studies of the behavior of machine learning algorithms must retain a central role.
— The Experimental Study of Machine Learning, 1991.
In statistics, the choice of a type of experiment is called experimental design, and there are many types of experiments to choose from. For example, you may have heard of the randomized double-blind placebo-controlled experiment, regarded as the gold standard for evaluating the effectiveness of medical treatments.
Applied machine learning is special in that we have complete control over the experiment and we can run as few or as many trials as we wish on our computer. Because of the ease of running experiments, it is important that we are running the right types of experiments.
In the natural sciences, one can never control all possible variables. […] As a science of the artificial, machine learning can usually avoid such complications.
— Machine Learning as an Experimental Science, Editorial, 1998.
The type of experiments we wish to perform are called controlled experiments.
These are experiments where all known independent variables are held constant except one, which is modified at a time in order to determine its impact on the dependent variable. The results are compared to a baseline, or no-treatment condition, called a “control.” This could be the result of a baseline method such as persistence or the Zero Rule algorithm, or the default configuration for the method.
As normally defined, an experiment involves systematically varying one or more independent variables and examining their effect on some dependent variables. Thus, a machine learning experiment requires more than a single learning run; it requires a number of runs carried out under different conditions. In each case, one must measure some aspect of the system’s behavior for comparison across the different conditions.
— Machine Learning as an Experimental Science, Editorial, 1998.
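To make the idea of a “control” concrete, here is a minimal sketch of the Zero Rule baseline on a hypothetical toy classification problem (the labels and their 70/30 split are invented for illustration). Any candidate model would then be compared against this baseline accuracy.

```python
from collections import Counter
import random

def zero_rule_predict(train_labels, n):
    # Zero Rule baseline: always predict the majority class seen in training
    majority = Counter(train_labels).most_common(1)[0][0]
    return [majority] * n

def accuracy(actual, predicted):
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# hypothetical toy labels: roughly 70% class 0, 30% class 1
random.seed(1)
train = [0 if random.random() < 0.7 else 1 for _ in range(100)]
test = [0 if random.random() < 0.7 else 1 for _ in range(50)]

baseline_preds = zero_rule_predict(train, len(test))
print("baseline accuracy:", accuracy(test, baseline_preds))
```

A model that cannot beat this control on held-out data adds no measurable skill, which is exactly the comparison a controlled experiment is designed to make.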
Controlling For Variance
In many ways, experiments with machine learning methods have more in common with simulation studies, such as those in physics, than with evaluating medical treatments.
As such, the results of a single experiment are probabilistic, subject to variance.
There are two main types of variance that we seek to understand in our controlled experiments; they are:
- Variance in the data, such as the data used to train the learning algorithm and the data used to evaluate its skill.
- Variance in the model, such as the use of randomness in the learning algorithm: random initial weights in neural nets, the selection of cut points in bagging, the shuffled order of data in stochastic gradient descent, and so on.
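The two sources can be isolated by holding one fixed while varying the other. The sketch below uses an invented stand-in for a learning run (sampling a mean, plus injected noise standing in for a model's internal randomness) purely to show how each source contributes spread in the observed score:

```python
import random
import statistics

random.seed(7)
population = [random.gauss(10.0, 2.0) for _ in range(1000)]

def skill_estimate(data_seed, model_seed):
    # Toy stand-in for one learning run: data variance comes from which
    # rows are sampled; model variance comes from the model's own
    # randomness (here, injected noise standing in for e.g. random
    # weight initialization).
    rng_data = random.Random(data_seed)
    sample = rng_data.sample(population, 100)   # variance in the data
    rng_model = random.Random(model_seed)
    noise = rng_model.gauss(0.0, 0.1)           # variance in the model
    return statistics.mean(sample) + noise

# hold the model seed fixed, vary the data: data variance only
data_scores = [skill_estimate(s, model_seed=0) for s in range(30)]
# hold the data fixed, vary the model: model variance only
model_scores = [skill_estimate(0, model_seed=s) for s in range(30)]

print("spread from data:", statistics.stdev(data_scores))
print("spread from model:", statistics.stdev(model_scores))
```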
A result from a single run or trial of a controlled experiment would be misleading given these sources of variance.
The experiment must control for these sources of variance. This is done by repeating the experimental trial multiple times in order to elicit the range of variance so that we can both report the expected result and the variance in the expected result, e.g. mean and confidence interval.
In simulation studies, such as Monte Carlo methods, the repetition of an experiment is called variance reduction.
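Reporting the expected result and its variance can be sketched as follows. The trial function here is a hypothetical stand-in (a simulated skill score); in practice it would train and evaluate a model under one random seed, and the 95% confidence interval uses the normal approximation:

```python
import random
import statistics

def run_trial(seed):
    # hypothetical stand-in for one experimental trial; in practice this
    # would train and evaluate a model under this random seed
    rng = random.Random(seed)
    return 0.85 + rng.gauss(0.0, 0.02)  # simulated skill score

scores = [run_trial(seed) for seed in range(30)]
mean = statistics.mean(scores)
stderr = statistics.stdev(scores) / len(scores) ** 0.5
# 95% confidence interval via the normal approximation
lower, upper = mean - 1.96 * stderr, mean + 1.96 * stderr
print(f"skill: {mean:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

Reporting the interval rather than a single number is what makes comparisons between methods honest: two methods whose intervals overlap heavily may not differ in any meaningful way.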
Experiments in Machine Learning
Experimentation is a key part of applied machine learning.
This is both a challenge to beginners who must learn some rigor and an exciting opportunity for discovery and contribution.
Let’s make this concrete with some examples of the types of controlled experiments you may need to perform:
- Choose-Features Experiments. When determining what data features (input variables) are most relevant to a model, the independent variables may be the input features and the dependent variable might be the estimated skill of the model on unseen data.
- Tune-Model Experiments. When tuning a machine learning model, the independent variables may be the hyperparameters of the learning algorithm and the dependent variable might be the estimated skill of the model on unseen data.
- Compare-Models Experiments. When comparing the performance of machine learning models, the independent variables may be the learning algorithms themselves with a specific configuration and the dependent variable is the estimated skill of the model on unseen data.
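As one concrete example, a tune-model experiment can be sketched in a few lines. The dataset and the simple 1-D nearest-neighbor classifier below are invented for illustration; the point is the experimental structure: the hyperparameter k is the independent variable, test accuracy is the dependent variable, and everything else is held constant.

```python
import random
from collections import Counter

random.seed(3)

def make_point():
    # hypothetical 1-D dataset: class 0 centered at 0.0, class 1 at 1.0
    label = random.randint(0, 1)
    return (random.gauss(float(label), 0.5), label)

train = [make_point() for _ in range(200)]
test = [make_point() for _ in range(100)]

def knn_predict(train, x, k):
    # predict the majority label among the k nearest training points
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

def skill(k):
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)

# independent variable: the hyperparameter k
# dependent variable: estimated skill on unseen data
for k in (1, 5, 15):
    print(f"k={k}: accuracy={skill(k):.2f}")
```

A choose-features or compare-models experiment has the same shape; only the independent variable changes (the feature subset, or the algorithm itself).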
What makes the experimental focus of applied machine learning so exciting is twofold:
- Discovery. You can discover what works best for your specific problem and data. A challenge and an opportunity.
- Contribution. You can make broader discoveries in the field, without any specialized knowledge other than rigorous and systematic experimentation.
Using off-the-shelf tools and careful experimental methods, you can make discoveries and contributions.
In summary, machine learning occupies a fortunate position that makes systematic experimentation easy and profitable. […] Although experimental studies are not the only path to understanding, we feel they constitute one of machine learning’s brightest hopes for rapid scientific progress, and we encourage other researchers to join in our field’s evolution toward an experimental science.
— The Experimental Study of Machine Learning, 1991.
Further Reading
This section provides more resources on the topic if you are looking to go deeper.
Books
- The Design and Analysis of Computer Experiments, 2003.
- Empirical Methods for Artificial Intelligence, 1995.
Papers
- Machine Learning as an Experimental Science, Editorial, 1998.
- The Experimental Study of Machine Learning, 1991.
- Machine Learning as an Experimental Science (Revisited), 2006.
Articles
- Scientific control on Wikipedia
- Design of experiments on Wikipedia
- Blinded experiment on Wikipedia
- Controlling for a variable on Wikipedia
- Computer experiment on Wikipedia
- Variance Reduction on Wikipedia
Summary
In this tutorial, you discovered the important role that controlled experiments play in applied machine learning.
Specifically, you learned:
- The need for systematic discovery via controlled experiments.
- The need to repeat experiments in order to control for the sources of variance.
- Examples of experiments performed in machine learning and the challenge and opportunity they represent.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
The post Controlled Experiments in Machine Learning appeared first on Machine Learning Mastery.