Author: Stephanie Glen
Grab a copy of The Elements of Statistical Learning (“the machine learning bible”) and you might be a little overwhelmed by the mathematics. For example, this equation (p.34), for a cubic smoothing spline, might send shivers down your spine if math isn’t your forte:
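For reference, a cubic smoothing spline is usually defined as the function that minimizes a penalized residual sum of squares. The version below is written in standard notation and is assumed to match the book's; the symbol names (N, f, and the smoothing parameter lambda) are conventional choices, not quoted from this article:

```latex
\mathrm{PRSS}(f;\lambda) \;=\; \sum_{i=1}^{N} \bigl(y_i - f(x_i)\bigr)^2 \;+\; \lambda \int \bigl[f''(x)\bigr]^2 \, dx
```

The first term adds up the squared gaps between each observed value y_i and the fitted curve f(x_i). The second term penalizes curviness through the second derivative f'', and lambda controls how much that penalty counts.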
In order to grasp that equation, nested firmly in the “Introductory” section of the book, you need to know function notation, sigma (summation) notation, derivatives, and Greek letters. Basically, if you haven’t taken a calculus class, you’re not going to be able to follow along. But, do you really need to know all of that math to grasp the fundamentals of ML? The answer is no.
In fact, ESL isn’t the only machine learning “bible” out there. An Introduction to Statistical Learning covers much of the same material, but in a less mathematical way. And the only prerequisite to understanding the material is a basic statistics class. So while it’s safe to say a statistics class with a smattering of basic math is going to be very useful, ninety percent of the math you learn in school you probably won’t use in ML.
High School Math That You [Probably] Don’t Need
I say probably don’t need because I’m a statistician, so it’s ingrained in me to never say “100%”. If I say you’ll never, ever come across a quadratic equation or a system of inequalities in ML, then I would be incorrect. You might come across it once, or twice, in many years of wading through a sea of scatter plots, matrices, and code.
So what follows is a rather incomplete list of the top items in high school that you [probably] hated and [probably] don’t need:
- Solving systems of inequalities: You should be able to recognize an inequality at fifty feet, but you can skip the section on “how to solve a system of inequalities”. It’s very useful to know what a system of inequalities represents (e.g. it can show you the maximum or minimum values in a system with multiple constraints), but computers can do the solving for you.
- Solving systems of linear equations: Outside of teaching it in the classroom, I’ve never actually solved one by hand in real life. You’ll probably want to have an awareness of what it can do for you–it helps you to compare different rates of things–but don’t worry about actually doing the solving: there’s an algorithm for that.
- Solving quadratic equations: After basic math classes, you’re probably never going to need to solve those quadratic equations again. If you do come across them in something relatively obscure like multivariate regression-based learning for eigenvalue problem solving, then knowing the steps might be useful. But by the time you get even close to multivariate regression, taking a step back to learn the basics about quadratics will be a cake walk.
- Trigonometry: That old trigonometry mantra SOHCAHTOA? I might have used it once, but don’t ask me to remember what for. All the trig you’ll ever use in ML will likely be covered in a good calculus class, which should include analytic geometry as part of the course. And, even then, you don’t need calculus either.
- Calculus or Linear algebra: You don’t need them to start out with ML, but they can help. Jason Brownlee, PhD, of Machine Learning Mastery is on point when he states that “Having an appreciation for the abstract operations that underly some machine learning algorithms is not required in order to use machine learning as a tool to solve problems.”
- All those properties (Distributive, Additive…zzz) and acronyms (LCD, LCM,…, zzz). For example, the distributive property and Least Common Multiple needed to solve this equation:
While you should be able to understand equations like the one above (e.g. what does “y” represent?), you don’t need to actually be able to solve it (you can use a computer), nor do you need to know the names of all those properties and acronyms. And if your algebra is a little rusty? Well, there’s a calculator on the internet for that. Distributive property? Try this one. Least common multiple? There’s one here.
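To make “you can use a computer” concrete, here is a minimal Python sketch that does three of the chores from the list above: solving a pair of linear equations, solving a quadratic, and finding a least common multiple. The function names (solve_2x2, solve_quadratic, lcm) and the sample numbers are made up for illustration, and the 2x2 solver uses Cramer’s rule simply because it is short. It is one possible approach, not the method any particular calculator or library uses.

```python
import cmath
import math
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 using Cramer's rule."""
    det = Fraction(a1) * b2 - Fraction(a2) * b1
    if det == 0:
        raise ValueError("no unique solution (lines are parallel or identical)")
    x = (Fraction(c1) * b2 - Fraction(c2) * b1) / det
    y = (Fraction(a1) * c2 - Fraction(a2) * c1) / det
    return x, y


def solve_quadratic(a, b, c):
    """Return both roots of a*x^2 + b*x + c = 0 (complex numbers if needed)."""
    if a == 0:
        raise ValueError("not a quadratic: a must be nonzero")
    root = cmath.sqrt(b * b - 4 * a * c)  # square root of the discriminant
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


def lcm(m, n):
    """Least common multiple of two positive integers."""
    return m * n // math.gcd(m, n)


# Hypothetical system: x + y = 10 and 2x - y = 2
x, y = solve_2x2(1, 1, 10, 2, -1, 2)
print(x, y)  # 4 6

# Hypothetical quadratic: x^2 - 3x + 2 = 0, whose roots are 2 and 1
print(solve_quadratic(1, -3, 2))

print(lcm(4, 6))  # 12
```

Notice that none of this requires remembering the steps you were drilled on in school. What it does require is knowing what the answer means once you have it.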
College Math That You [Probably] Don’t Need
Let’s say you want to get a firm mathematics-for-ML grounding in your undergraduate degree, but you only want to take classes that are very ML specific. Imperial College London runs a course titled Mathematics for Machine Learning Specialization which lists the following as skills you’ll gain during the class:
- Eigenvalues And Eigenvectors,
- Principal Component Analysis (PCA),
- Multivariable Calculus,
- Linear Algebra.
That’s a pretty short list of fundamentals, but don’t forget that those classes require prerequisites. For example, in order to understand multivariable calculus, you should take calculus I and II first. PCA requires statistics. All the other college math classes? You can skip them. For example, you don’t need any of these classes for ML:
- Abstract Algebra
- Advanced Multivariable Calculus
- Advanced Probability Theory
- Biostatistics
- Complex Analysis
- Differential Equations
- Discrete Mathematics
- Engineering Mathematics
- Functions and Modeling
- Geometry
- Graph Theory
- Intermediate Analysis
- Logic
- Number Theory
- Numerical Analysis
- Real Analysis
- Stochastic Processes
- Vector Analysis
Math you do need for ML (Your walk in the door checklist)
What solid math-related skills do you need to walk in the door? That depends on the company you’re working for and the kind of position you’re applying for. If you want to get into machine learning theory, you’re going to need some fairly advanced mathematics (like PCA and calculus). However, if you want to snag a junior analyst job and dip your toes into becoming a machine learning practitioner, you can get by with a few basics. At a minimum, you should know:
- Data manipulation skills: basic algebra, like the meaning of variables (x, y, z), basic function types (e.g. exponential, logarithmic, or linear), and function notation (e.g. f(x) = 8 + 783^x 8x).
- Data visualization skills: Ability to create scatter plots, histograms and x-y charts (all of which are covered in basic high school math). You’ll definitely need to know how to interpret a scatter chart, but you’ll never need to sketch a histogram by hand.
- Data analysis skills: basic descriptive statistics terms like mean, mode, median, standard deviation and variance. Summation notation is extremely important, as it appears frequently in machine learning. Sharp Sight calls data analysis the “real prerequisite for machine learning.”
This is an absolute minimum. The more you know, the more easily you’ll be able to jump into a project. While you’ll probably have no trouble cleaning data and performing basic analysis, you might run into difficulties with improving models and making predictions unless you’ve got higher-level math skills.
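If summation notation looks intimidating, it may help to see that it is just a loop. The sketch below uses a made-up list of scores and computes the mean, the (population) variance, and the standard deviation straight from their summation definitions, then uses Python’s built-in statistics module for the median and mode. The data and variable names are assumptions chosen so the arithmetic comes out clean.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data for illustration

n = len(scores)

# Mean: add up every value (the "sigma" part), then divide by how many there are.
mean = sum(scores) / n

# Population variance: the average squared distance from the mean.
variance = sum((x - mean) ** 2 for x in scores) / n

# Standard deviation: the square root of the variance.
std_dev = math.sqrt(variance)

print(mean, variance, std_dev)    # 5.0 4.0 2.0
print(statistics.median(scores))  # 4.5
print(statistics.mode(scores))    # 4
```

The hand-rolled numbers agree with statistics.pvariance and statistics.pstdev, which divide by n. The sample versions (statistics.variance and statistics.stdev) divide by n - 1 instead, so they would give slightly larger values on the same data.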
Math that will probably be VERY useful down the road
As a junior data analyst, you probably won’t be involved in many machine learning projects. But the further down your career path you go, the more math you’ll need. According to the University of Warwick, a strong mathematics background gives you more than just math knowledge: it enhances your ability to think clearly, follow complex reasoning, and pay attention to detail. Some areas that are recommended as helpful include:
- Calculus: functions, derivatives, multi-variable calculus, differential equations.
- Linear algebra: vectors, matrices, eigenvalues/eigenvectors, principal component analysis, singular value decomposition. That said, you don’t really need to know all the bells and whistles; you just need a basic idea of what matrix algebra is all about. Common machine learning libraries like R’s caret package and Python’s scikit-learn take care of the complex calculations behind the scenes. With these tools, the most common ML algorithms are ready to use with a single line of code.
- Statistics: statistics is really at the heart of data science and machine learning. A course in statistics gives you much more than just “statistics.” You’ll learn probability and data analysis too. A one semester class should give you an excellent groundwork.
- Logic. Rather than memorizing all those properties and acronyms from high school algebra, a much better use of your valuable learning time is to stretch your brain and learn logic. If you can think logically, you can learn anything at a later date. They don’t actually teach “logic” in high school, and the college “logic” class is probably overkill. But you can teach yourself. How? Pick up a few books on logic problems, or try them online (like here).
As a final note, if you’re looking to learn the bare minimum of math for ML, perhaps the best way to learn is by actually coding. Write a line of code (any code!), execute it, and see what happens (or doesn’t). Then go behind the scenes and see what the code actually did, mathematically speaking. Tinkering with code that actually does something useful (as opposed to learning a whole lot of theoretical bits you might never use) is going to pay you dividends, and it will make your math-learning burden much lighter.
References
University of Warwick: Transferable Skills
The College Board: The Heart of Algebra.
Book Review: Statistical Learning with Sparsity – The Lasso and Generalizations
Sharp Sight Labs: The Real Prerequisite for Machine Learning
Logic puzzle image: Salix alba [CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0)]
The Elements of Statistical Learning (ESL)
An Introduction to Statistical Learning (ISL)
Trevor Hastie, Robert Tibshirani, Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second Edition. Springer.
Imperial College London: Mathematics for Machine Learning Specialization