The technology responsible for AI’s recent breakthrough is data-driven machine learning. In this post we lift the veil on what ‘data-driven’ means, and why it’s important to understand even if you yourself are just a user of AI. The terminology used in the AI world is borrowed from everyday language, but be aware that the terms often carry subtly different meanings.
Humans can learn by experience, for example through on-the-job training. This idea of learning without an explicit knowledge transfer from man to machine has inspired the data-driven learning approach. Imagine the simple act of catching a thrown ball. It takes a mix of physics formulas to predict where the ball is going to go. We give medals to those few high school students who can work out those equations during their 3-hour final exams. Yet we expect kids in kindergarten to catch the ball in a split second. These kids probably don’t even understand the concept of gravity, other than “falling is ouch”.
Youngsters don’t hit the books; they learn by example. So does supervised machine learning. A health AI may receive an example of a patient aged 30, with a weight of 60kg, who recovered from surgery in 3 days. Another patient aged 40, weighing 80kg, recovered in 4 days, and so on. With enough examples, a machine learning algorithm trains a model that takes the characteristics of a new patient (age 32, weight 72kg) and predicts a recovery time of, say, 4.32 days. A prediction (inference) for a particular patient could be wrong, just as a kid won’t catch all balls. But a successful algorithm will be approximately right most of the time.
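To make learning by example concrete, here is a minimal sketch in Python. It uses a simple nearest-neighbour approach, which is just one of many supervised learning algorithms and is chosen here only because it is short. The first two training examples come from the text above; the other three, and the function name predict_recovery_days, are made up for illustration.

```python
import math

# Training examples: (age in years, weight in kg) -> recovery time in days.
# The first two rows come from the text; the remaining rows are invented
# purely for illustration.
TRAINING_DATA = [
    ((30, 60), 3.0),
    ((40, 80), 4.0),
    ((25, 55), 2.5),
    ((50, 90), 5.0),
    ((35, 70), 3.5),
]


def predict_recovery_days(age, weight, k=2):
    """Average the recovery times of the k most similar training patients."""
    by_distance = sorted(
        TRAINING_DATA,
        key=lambda example: math.dist(example[0], (age, weight)),
    )
    nearest = by_distance[:k]
    return sum(days for _, days in nearest) / k


# Predict for a patient the model has never seen before.
prediction = predict_recovery_days(32, 72)
print(f"Predicted recovery time: {prediction:.2f} days")
```

Nobody told the program a formula for recovery time. The prediction comes entirely from the examples, so different examples would lead to a different answer.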
Data-driven machine learning is powerful, but there is a catch: it’s only as good as the data it was fed. A model trained on children’s hospital data may not work well in the geriatric ward. If blood pressure is a key factor but is not recorded in the data set, the model has no choice but to ignore it. If the data set contains many characteristics that are barely related or unrelated to recovery time, the model may pick up spurious correlations that exist only by coincidence.
Machine learning methods are designed to cater for the possibility that a new case is not exactly the same as one of the cases in the training data set. However, the better the distribution of characteristics used for training reflects the real world, the more accurately the AI will perform. That’s why “big data” has been one of the foundations for the rise of AI. Just remember that “big” is not all about size, but also about coverage and variety.
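One rough way to think about coverage is to ask whether a new case falls inside the range of values the model saw during training. The sketch below does this for the children’s hospital scenario. The patient values and the helper names feature_ranges and is_covered are hypothetical, and a simple range check is only a crude stand-in for comparing full distributions.

```python
# Hypothetical (age in years, weight in kg) pairs from a children's hospital.
training_patients = [(4, 17), (7, 24), (10, 33), (13, 46), (16, 58)]


def feature_ranges(patients):
    """Return the (min, max) seen in training for each characteristic."""
    columns = zip(*patients)
    return [(min(column), max(column)) for column in columns]


def is_covered(patient, ranges):
    """True if every characteristic lies within the range seen in training."""
    return all(low <= value <= high for value, (low, high) in zip(patient, ranges))


ranges = feature_ranges(training_patients)

child = (9, 30)     # similar to the training data
senior = (82, 70)   # a geriatric patient, far outside the training data

print(is_covered(child, ranges))
print(is_covered(senior, ranges))
```

A model can still produce a number for the second patient, but nothing in its training data supports that prediction.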
In the above example, we showed input values (age, weight) which are available both at training time and at inference time, when a new patient is encountered. The set of input values describing one patient is called a sample or instance. The output variable (recovery time) is available at training time only, and is known as the target for the sample. At inference time, the challenge is to predict that value.
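The difference between training time and inference time can be sketched as a small data structure. The class name PatientSample and its field names are invented for this illustration; the values come from the example earlier in the post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatientSample:
    # Input values: known at both training time and inference time.
    age: int
    weight_kg: float
    # Target: known at training time only, so it defaults to "unknown".
    recovery_days: Optional[float] = None


# At training time the target is part of the example.
training_sample = PatientSample(age=30, weight_kg=60, recovery_days=3)

# At inference time the target is missing; predicting it is the model's job.
new_patient = PatientSample(age=32, weight_kg=72)

print(training_sample)
print(new_patient)
```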
Data is the key ingredient for machine learning, but how does an AI know what outcomes we expect it to extract from the data? In the next post, we’ll discuss how problem definitions can go horribly wrong.
In this post’s example a surgery recovery time is predicted. What type of AI task is that? (See answer in blog post 1.)