Author: Vincent O Ajayi
When I first heard about machine learning (ML), I thought only big companies applied it to explore big data. Searching the internet for the meaning of ML, I found that Wikipedia defines it as a subset of artificial intelligence (AI): the scientific study of algorithms and statistical models that computer systems use to perform a specific task effectively without explicit instructions, relying on patterns and inference instead. Common examples of ML applications include skin cancer detection, facial recognition, churn prediction and the diagnosis of diabetic eye disease, as well as natural language processing tasks such as language translation. ML also plays a role in the recommendation engines that companies such as Amazon and Netflix use to predict and suggest what a given user might want to buy or watch.
I never thought that ML could be useful for small-scale businesses, and almost everyone I discussed ML with held the same view. I kept this opinion until I registered for a short training course on ML. There I was assigned a project to develop an ML model that could reduce the cost of marketing campaigns for a charitable organisation, which changed my perspective completely.
This article discusses the importance of ML for small-scale businesses and gives an example of how an ML algorithm can be used to estimate costs.
Six Important Functions of Machine Learning (ML) for Small-Scale Businesses
- Trend and Pattern Recognition – Many owners of small-scale businesses keep a sales book and an account book in which they record customers’ names, sales volumes, cash transactions and so on for each store. These records generate data that can be analysed to identify customers’ buying patterns, along with other factors that drive sales or influence customers’ preferences.
- Modelling and Forecasting – The information in a sales book can be used to estimate costs, predict sales volume and gauge revenues, profits and expected market share. This matters because the sustainability and success of a business depend on the accuracy of such forecasts.
- Security – ML can analyse data and identify patterns in it, which helps to flag suspicious transaction behaviour, track errors and detect fraud. Business owners can then act immediately to close a loophole and prevent the problem from recurring.
- Information Extraction – ML can be used to extract valuable information from other databases and to support operational coordination. No business owner can be an island of knowledge: to make sound business decisions, owners need external information (for instance, on weather, inflation and interest rates) and appropriate analysis of it.
- Good Business Environment – ML creates a suitable environment for small businesses to grow and become efficient; it also gives staff new technology with which to work better. For example, ML recommendation engines can learn from customer behaviour, recommend additional products and promote upselling. Media companies use ML to identify patterns of lip movements, which they convert into text.
- Advertisement and Marketing – 75% of enterprises that use ML enhance customer satisfaction by more than 10%, and three in four organisations that employ AI and ML increase sales of new products and services by more than 10%. Within seconds, ML apps can reach millions of customers to tell them about new products and why an existing product is better than competitors’ products (Columbus, 2018).
An Example of How Machine Learning Can be Used to Estimate Costs
A charitable organisation relies on the generosity of its well-wishers to cover its operational costs and provide the capital needed to pursue charitable endeavours. Owing to the growing number of donors in different parts of various countries, the cost of soliciting funds by postcard has increased over the years, for 10% record that the average donation received through postcard from an individual is £15, while the cost to produce and send the card is £2. This cost would rise further if the organisation chose to send a postcard to everybody who identifies with it. As a result, the organisation would need to hire a data scientist to develop a cost-effective model that identifies the donors most likely to donate.
Here is the link to the code of this project: Machine Learning: Donor Prediction. I have skipped a lot of code for brevity, so please follow the GitHub code alongside this article.
Machine Learning Algorithms
We compared the forecasting performance of six supervised machine learning techniques in Python, with the aim of choosing the most appropriate algorithm to estimate cost. In particular, we considered the following classification techniques: Logistic Regression (LR), KNeighborsClassifier (KNN), GaussianNB, Random Forest Classifier (RF), Linear Discriminant Analysis and Neural Network Classifier (NN).
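A comparison of this kind can be sketched with scikit-learn's cross-validation helpers. This is a minimal illustration and not the project's actual code: the data is a synthetic stand-in generated with `make_classification`, and the fold count, model settings and short model names are assumptions. CART is included alongside the six listed techniques because the comparison reported in this article refers to it.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the donor data (assumption: roughly 25% donors).
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.75, 0.25], random_state=0
)

# Candidate classifiers; hyperparameters are illustrative defaults.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "NB": GaussianNB(),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
    "LDA": LinearDiscriminantAnalysis(),
    "NN": MLPClassifier(max_iter=500, random_state=0),
    "CART": DecisionTreeClassifier(random_state=0),
}

# The same stratified folds are reused for every model so scores are comparable.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    for name, model in models.items()
}

for name, fold_scores in scores.items():
    print(f"{name}: mean={fold_scores.mean():.3f} std={fold_scores.std():.3f}")
```

The per-fold arrays in `scores` are what a box-and-whisker plot would summarise, one box per model. Because the data here is synthetic, the ranking it prints says nothing about which model performs best on the charity's data.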
The results are presented below as box-and-whisker plots.
From the above graph, we observed that the decision tree classifier (CART) outperformed the other prediction models, because it has the highest mean score among the selected models. We therefore applied the CART model for prediction. The results are shown below:
True positives: 280
False positives: 796
True negatives: 2106
False negatives: 693
Classification Report
| Class | Precision | Recall | F1-score | Support |
| --- | --- | --- | --- | --- |
| 0 | 0.75 | 0.73 | 0.74 | 2902 |
| 1 | 0.26 | 0.29 | 0.27 | 973 |
| avg / total | 0.63 | 0.62 | 0.62 | 3875 |
The classification report covers 3875 households and shows how well the model identifies the households that are likely to donate to the charitable organisation. There are two possible predicted classes, “0” and “1”: “1” denotes households that are likely to donate, whereas “0” denotes households that are unlikely to do so.
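The per-class figures in the classification report can be recomputed from the four counts reported earlier. The sketch below uses the standard definitions of precision, recall and F1; the helper name `class_metrics` is ours and is not taken from the project code.

```python
def class_metrics(tp, fp, fn):
    """Return (precision, recall, f1) for one class from its counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts reported for the decision tree classifier.
TP, FP, TN, FN = 280, 796, 2106, 693

# Class "1" (likely donors): the counts are used as reported.
metrics_1 = class_metrics(tp=TP, fp=FP, fn=FN)

# Class "0" (unlikely donors): the roles swap, so TN acts as the
# true positives, FN as the false positives and FP as the false negatives.
metrics_0 = class_metrics(tp=TN, fp=FN, fn=FP)

for label, metrics in (("0", metrics_0), ("1", metrics_1)):
    print(label, " ".join(f"{value:.2f}" for value in metrics))
# prints:
# 0 0.75 0.73 0.74
# 1 0.26 0.29 0.27
```

The low precision and recall for class “1” show that the model finds the minority class (donors) much harder to identify than the majority class.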
Confusion Matrix
| N = 3875 | Predicted “0” | Predicted “1” | Total |
| --- | --- | --- | --- |
| Actual “0” | TN = 2106 | FP = 796 | 2902 |
| Actual “1” | FN = 693 | TP = 280 | 973 |
| Total | 2799 | 1076 | 3875 |
Of the 3875 households in total, the decision tree classifier predicted that 1076 households were likely to donate and 2799 were unlikely to do so. In reality, 973 households in the sample donated, while 2902 did not.
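These totals follow directly from the four cells of the confusion matrix. As a quick check, the short sketch below recomputes them; the variable names are ours, and the accuracy line is an extra derived figure rather than one quoted in the text.

```python
# Cells of the confusion matrix reported above.
TP, FP, TN, FN = 280, 796, 2106, 693

# Column totals: what the model predicted.
predicted_donors = TP + FP        # predicted "1"
predicted_non_donors = TN + FN    # predicted "0"

# Row totals: what actually happened.
actual_donors = TP + FN           # actual "1"
actual_non_donors = TN + FP       # actual "0"

total = TP + FP + TN + FN
accuracy = (TP + TN) / total      # share of households classified correctly

print(predicted_donors, predicted_non_donors)  # prints: 1076 2799
print(actual_donors, actual_non_donors)        # prints: 973 2902
print(total, round(accuracy, 2))               # prints: 3875 0.62
```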
To calculate the cost, recall the following:
Unit cost = £2
Unit average revenue = £15
To minimise its cost, the organisation could avoid sending postcards to everyone who expressed an interest and send them only to those households that are most likely to donate.
In this case, the total cost will be (TP + FN) × unit cost = 973 × £2 = £1946
Revenue = TP × unit revenue = 280 × £15 = £4200
Profit = (TP × unit revenue) – ([TP + FN] × unit cost) = £4200 – £1946 = £2254
This result implies that if the charity sends postcards only to those households that are likely to donate, it will spend £1946 and earn £4200, generating a profit of £2254.
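The arithmetic above can be reproduced in a few lines of Python. This is a minimal sketch that follows the formulas exactly as stated in this article; the function name `campaign_economics` and its parameters are hypothetical and are not part of the project code.

```python
def campaign_economics(responders, postcards_sent, unit_cost=2, unit_revenue=15):
    """Return (cost, revenue, profit) in pounds for a postcard campaign."""
    cost = postcards_sent * unit_cost
    revenue = responders * unit_revenue
    return cost, revenue, revenue - cost


TP, FN = 280, 693

# As in the article: cost is based on TP + FN postcards, revenue on TP donations.
cost, revenue, profit = campaign_economics(responders=TP, postcards_sent=TP + FN)
print(f"cost=£{cost} revenue=£{revenue} profit=£{profit}")
# prints: cost=£1946 revenue=£4200 profit=£2254
```

Passing a different `postcards_sent`, for example the number of households the model flags as likely donors, shows how the figures change under another mailing rule.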
Conclusion
In this article, we discussed how small-scale businesses can apply ML to improve their performance and generate greater revenue. We also provided an example of how ML can be used to estimate costs and identify likely donors for a charity. We believe that it is crucial for business owners to learn about the importance of data collection and to use ML algorithms to improve their businesses’ performance.