Author: Tarun Saini
Machine learning is an important skill to have in today’s age. But acquiring the skill set can take time, especially when the path to it is scattered. The points below cover a wide range of topics and should give anyone starting from scratch a good foundation. Learners should not limit themselves to this set of skills: machine learning is an ever-expanding field, and keeping abreast of the latest developments always helps in scaling new heights in it.
- Programming knowledge
- Applied Mathematics
- Data Modeling and Evaluation
- Machine Learning Algorithms
- Neural Network Architecture
1. PROGRAMMING KNOWLEDGE
The very essence of machine learning is coding (unless you are building something with drag-and-drop tools that does not require much customization): cleaning the data, building the models, and validating them. A good knowledge of programming along with its best practices always helps. You might be using Java or some other object-oriented language. But irrespective of the language, debugging, writing efficient user-defined functions and loops, and using the inherent properties of data structures pay off in the long run. A good understanding of the following will help:
- Computer Science Fundamentals and Programming
- Software Engineering and System Design
- Machine Learning Algorithms and Libraries
- Distributed computing
- Unix
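As a small illustration of using the inherent properties of data structures, here is a minimal Python sketch. The function names and the sample data are made up for this example; the only point is that a set gives fast membership tests and a counter tallies items in a single pass.

```python
from collections import Counter


def count_labels(labels):
    """Count how often each class label occurs, in a single pass over the data."""
    return Counter(labels)


def drop_stopwords(tokens, stopwords):
    """Filter tokens with a set, whose membership test is O(1) on average.

    Testing against a list instead would scan the whole list for every token.
    """
    stopword_set = set(stopwords)
    return [token for token in tokens if token not in stopword_set]


# Hypothetical sample data, used only for illustration.
labels = ["spam", "ham", "spam", "spam", "ham"]
tokens = ["the", "model", "is", "the", "best"]

label_counts = count_labels(labels)
clean_tokens = drop_stopwords(tokens, ["the", "is"])
```

On a handful of items the difference is invisible, but during data cleaning over millions of rows this kind of choice decides whether a step takes seconds or hours.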
2. APPLIED MATHEMATICS
Knowledge of mathematics always helps in understanding the theory behind machine learning algorithms. Statistics, calculus, coordinate geometry, probability, and permutations and combinations come in very handy, although learners rarely have to work through the mathematics by hand. Libraries and programming languages take care of much of the computation, but these topics are very useful for understanding the underlying principles. Some of the useful mathematical concepts are listed below.
2.1. Linear Algebra
The following linear algebra topics are very useful:
- Principal Component Analysis (PCA)
- Singular Value Decomposition (SVD)
- Symmetric Matrices
- Matrix Operations
- Eigenvalues & Eigenvectors
- Vector Spaces and Norms
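To make eigenvalues and eigenvectors a little more concrete, here is a minimal sketch of power iteration in plain Python. It finds the dominant eigenpair of a small symmetric matrix, which is the same quantity PCA looks for in a covariance matrix (the direction of largest variance). The matrix and the function names are assumptions chosen for illustration; in practice a linear algebra library would do this work.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def normalize(vector):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def power_iteration(matrix, steps=200):
    """Approximate the dominant eigenvalue and eigenvector of a symmetric matrix."""
    vector = normalize([1.0] * len(matrix))
    for _ in range(steps):
        # Repeated multiplication amplifies the component along the top eigenvector.
        vector = normalize(mat_vec(matrix, vector))
    # Rayleigh quotient: v^T A v for a unit vector v.
    eigenvalue = sum(v * w for v, w in zip(vector, mat_vec(matrix, vector)))
    return eigenvalue, vector


# Hypothetical symmetric matrix with eigenvalues 6 and 1.
cov = [[5.0, 2.0], [2.0, 2.0]]
eigenvalue, eigenvector = power_iteration(cov)
```

For this matrix the dominant eigenvalue is 6 and the eigenvector points along (2, 1), so projecting data onto that direction would keep the most variance.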
2.2. Probability Theory and Statistics
Many machine learning algorithms are probabilistic, and knowledge of probability becomes useful in such cases. The following topics in probability are good to know:
- Probability Rules
- Bayes’ Theorem
- Variance and Expectation
- Conditional and Joint Distributions
- Standard Distributions (Bernoulli, Binomial, Multinomial, Uniform and Gaussian)
- Moment Generating Functions, Maximum Likelihood Estimation (MLE)
- Prior and Posterior
- Maximum a Posteriori Estimation (MAP)
- Sampling Methods
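A few of these topics fit in a short worked example. The sketch below, in plain Python, applies Bayes’ theorem to a made-up diagnostic test and then compares the MLE and MAP estimates of a coin’s bias. All numbers, the function names, and the choice of a Beta prior are assumptions made for illustration.

```python
def bayes_posterior(prior, true_positive_rate, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    # Total probability of a positive test (the evidence).
    evidence = true_positive_rate * prior + false_positive_rate * (1.0 - prior)
    return true_positive_rate * prior / evidence


def bernoulli_mle(successes, trials):
    """Maximum likelihood estimate of a Bernoulli parameter."""
    return successes / trials


def bernoulli_map(successes, trials, alpha, beta):
    """MAP estimate under an assumed Beta(alpha, beta) prior.

    This is the mode of the Beta posterior, valid when alpha > 1 and beta > 1.
    """
    return (successes + alpha - 1) / (trials + alpha + beta - 2)


# Hypothetical test: 1% prevalence, 99% sensitivity, 5% false positive rate.
posterior = bayes_posterior(0.01, 0.99, 0.05)

# Hypothetical coin: 7 heads in 10 tosses, with a Beta(2, 2) prior for MAP.
mle_estimate = bernoulli_mle(7, 10)
map_estimate = bernoulli_map(7, 10, alpha=2, beta=2)
```

Even with an accurate test, the posterior here is only about 0.17 because the condition is rare. The MAP estimate (about 0.67) is pulled from the MLE (0.7) toward the prior’s centre of 0.5, which is exactly the role the prior plays.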
3. BUILDING MODELS AND EVALUATION
In the world of machine learning, no single algorithm can be identified well in advance and used to build the model. Irrespective of whether it’s classification, regression, or unsupervised learning, a host of techniques needs to be tried before deciding on the best one for a given set of data points. Of course, with time and experience modelers do develop a sense of which techniques are likely to work better than the rest, but that depends on the situation.
Finalizing the model always leads to interpreting the model output, and many technical terms involved in this part can decide the direction of interpretation. As such, developers need to put as much stress on model interpretation as on model selection; that puts them in a better position to evaluate models and suggest changes. Model validation is comparatively easier and better defined for supervised learning; in the unsupervised case, learners need to tread carefully when choosing how and when to evaluate a model.
The following concepts related to model validation are very useful to know in order to be a better judge of models:
- Components of the confusion matrix
- Logarithmic Loss
- Confusion Matrix
- Area under Curve
- F1 Score
- Mean Absolute Error
- Mean Squared Error
- Rand index
- Silhouette score
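Several of the supervised metrics above can be written out in a few lines, which helps in seeing what they actually measure. The following is a minimal plain-Python sketch for binary labels and regression targets; the function names and sample values are made up for illustration, and area under curve, Rand index, and silhouette score are left out to keep it short.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall (assumes at least one true positive)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def log_loss(y_true, probabilities, eps=1e-15):
    """Average negative log-likelihood of the true binary labels."""
    total = 0.0
    for t, p in zip(y_true, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return -total / len(y_true)


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical classification results.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_counts(labels, predictions)
f1 = f1_score(labels, predictions)
loss = log_loss([1, 0], [0.5, 0.5])

# Hypothetical regression results.
mae = mean_absolute_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
mse = mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
```

Note how the single large regression error (2.0) dominates the mean squared error far more than the mean absolute error, which is why the two metrics can rank models differently.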
4. MACHINE LEARNING ALGORITHMS
A machine learning engineer may not have to apply complex concepts of calculus and probability explicitly, because built-in libraries (irrespective of the platform or programming language being used) help simplify things. Libraries are aplenty, be it for data cleansing and wrangling, building models, or model evaluation. Knowing each and every one of them on any platform is almost impossible and, more often than not, not beneficial.
However, a set of libraries will be used day in and day out for tasks related to machine learning, natural language processing, or deep learning. Getting familiar with these is an advantage and also speeds up development. The following libraries, skills, and techniques are useful:
- Exposure to packages and APIs such as scikit-learn, Theano, Spark MLlib, H2O, TensorFlow
- Expertise in models such as decision trees, nearest neighbor, neural net, support vector machine, and a knack for deciding which one fits the best
- Deciding and choosing hyperparameters that affect the learning model and the outcome
- Familiarity and understanding of concepts such as gradient descent, convex optimization, quadratic programming, partial differential equations
- Understanding the underlying working principles of techniques such as random forests, support vector machines (SVMs), and Naive Bayes classifiers, which helps speed up the model-building process
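Gradient descent and the effect of hyperparameters are easiest to see on a tiny problem. Below is a minimal plain-Python sketch that fits a straight line by gradient descent on the mean squared error. The function name, the data, and the values of `learning_rate` and `steps` are assumptions chosen for illustration; libraries such as scikit-learn handle this internally.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every data point with the current parameters.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient; learning_rate is a hyperparameter.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data generated from y = 2x + 1 without noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
```

With these settings the parameters converge to roughly w = 2 and b = 1. A much larger learning rate on the same data makes the updates diverge, and a much smaller one needs far more steps, which is why choosing hyperparameters matters.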
5. NEURAL NETWORK ARCHITECTURE
Understanding the working principles of neural networks takes time, as it is a different terrain in the field of AI, especially if one considers neural nets to be an extension of machine learning techniques. That said, it is entirely possible to gain a very good understanding by spending time with them, learning the underlying principles, and working with them. The architecture of neural nets takes a lot of inspiration from the human brain, and hence much of the terminology is borrowed from biology. Neural nets form the very essence of deep learning. Depending on the number of layers, a model is shallow or deep, and the computational cost grows as the architecture gets deeper. But deep models evidently have an edge when it comes to solving complex problems or problems with high-dimensional inputs, where they can improve model performance markedly over traditional machine learning algorithms. Hence it is better to build some initial understanding of the following topics so that, over time, the learner can transition smoothly:
- The architecture of neural nets
- Single-layer perceptron
- Backpropagation and Forward propagation
- Activation functions
- Supervised Deep Learning techniques
- Unsupervised Deep Learning techniques
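The single-layer perceptron is small enough to write out in full. The sketch below, in plain Python, trains one on the logical AND function using a step activation and the classic perceptron update rule. The function names, the learning rate, and the number of epochs are assumptions for illustration; backpropagation is not needed here because there is only one layer.

```python
def step(z):
    """Step activation: fire (1) when the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0


def predict(weights, bias, inputs):
    """Forward pass of a single-layer perceptron."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(z)


def train_perceptron(samples, targets, learning_rate=1.0, epochs=20):
    """Learn weights and bias with the perceptron update rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in zip(samples, targets):
            error = target - predict(weights, bias, inputs)
            # Nudge the weights toward the target only when the prediction is wrong.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Truth table of logical AND, which is linearly separable.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, targets)
outputs = [predict(weights, bias, inputs) for inputs in samples]
```

After training, the perceptron reproduces the AND truth table. A function that is not linearly separable, such as XOR, cannot be learned this way, which is one motivation for multi-layer networks trained with backpropagation.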
Neural networks are an ever-growing field of study. Like machine learning techniques in general, they are primarily divided into supervised and unsupervised techniques. In deep learning (the basis of which is neural networks), supervised techniques are the most studied.