How do you like them apples? The data analytics powerhouse behind Tesco groceries
The £6.4bn loss that supermarket giant Tesco posted last year has clearly not gone unheeded at the company, and it transpires that the firm is rapidly learning how to claw back as much as possible on the smallest of transactions - down to the last cucumber or orange.
"Every little helps," as the company refrain goes.
And despite the firm's proposed sale of data analytics firm Dunnhumby in 2015, Tesco is still clearly bent on making big data a focal point of its ongoing activities.
Speaking at Teradata's Universe 2016 conference in Hamburg, Tesco technical manager Adam Yeoman lets us into the detailed, almost arcane world of big data analytics that surrounds your weekly food run.
Data analytics projects at Tesco run from an initial phase of "insight", explains Yeoman.
"They usually come from a spark of an idea," he explains. "One of these might be, for example, that a weekly shopping pattern repeats itself".
With 3,500 stores across 11 countries, taking in 80 million separate instances of shopping every week, choosing from 40,000 different products and necessitating 36 million containers of products to be shifted around the world every week to satisfy demand, pulling behaviour patterns out of human shopping habits is probably easier for Tesco than for most.
And with tuna arriving daily from Alaska, wine from Australia, and general food produce shipping in from all over Europe, the need to keep different items circulating in as timely a fashion as possible is also paramount for Tesco.
"We discovered that on Monday, Tuesday and Wednesday, customers don't really want to go to shops, but as we get closer to the weekend, they get keener and they'll go out and prepare for the weekend. On Sunday, we have shorter trading hours, which obviously affects behaviour," Yeoman explains.
But it gets more complicated than that. Yeoman says that while Tesco "knows from experience" that different products have different patterns of trade, in the past the firm would use a basic "product hierarchy" idea - lemons and limes being in the same place in the store, and thus managed by the same staff member, for example.
"But lemons are bought more often and used for everyday cooking, whereas limes tend to be bought closer to the weekend, for cocktails," says Yeoman, who works within a team of 70 inside a 100-strong supply chain department to "run projects and improve" the customer buying experience.
Yeoman's insight was around finding a better way to manage stock, and what resulted is a spectacular visual cloud of food produce. Broccoli sits at the top, along with potatoes and carrots, all of which are often bought for Sundays, whereas apples, bananas and cucumbers tend to be bought frequently whenever shoppers visit.
"Convenience items" such as fruit snacking pots are also grouped together, as they're bought and eaten instantly with a "flatter" trade.
It's the sort of visualisation that could never be figured out by the human brain alone, and has a sort of erratic elegance, as you can see:
The visualisation ultimately allowed Yeoman and his colleagues to forecast "weekly shapes of trade" with products that took on similar shapes on the curve, and work out, for example, how many bananas to stock in stores on a daily basis.
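To make the idea of a "weekly shape of trade" concrete, here is a minimal sketch in Python. It is not Tesco's method: the sales figures are invented, and the choice to express each day as a share of the weekly total and compare products by Euclidean distance is an assumption made purely for illustration.

```python
from math import sqrt

# Hypothetical daily unit sales, Monday to Sunday (invented for illustration).
weekly_sales = {
    "lemons":   [50, 52, 49, 51, 53, 55, 40],
    "apples":   [80, 78, 82, 81, 85, 88, 60],
    "limes":    [10, 11, 12, 20, 35, 40, 12],
    "broccoli": [20, 18, 19, 22, 30, 45, 50],
}

def shape(daily_units):
    """Convert raw daily units into each day's share of the weekly total."""
    total = sum(daily_units)
    return [units / total for units in daily_units]

def distance(shape_a, shape_b):
    """Euclidean distance between two weekly shapes; smaller means more alike."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(shape_a, shape_b)))

def closest_match(product, sales):
    """Return the other product whose weekly shape is nearest to `product`."""
    target = shape(sales[product])
    others = (name for name in sales if name != product)
    return min(others, key=lambda name: distance(target, shape(sales[name])))

def daily_forecast(shape_of_week, expected_weekly_units):
    """Spread an expected weekly total across the days using a shape."""
    return [round(share * expected_weekly_units) for share in shape_of_week]
```

With these made-up numbers, `closest_match("lemons", weekly_sales)` pairs lemons with apples, the two "everyday" lines, and `daily_forecast(shape(weekly_sales["lemons"]), 700)` turns an assumed weekly total of 700 into a stocking figure for each day.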
Rolling out solutions "to every store, in order to raise the bar higher" is a particularly key concept for Yeoman's team, and many of their improvement projects produce a 300 to 500 per cent ROI. Otherwise, they're unlikely to get through IT in the first place when attempting to procure resources to carry them out.
"A perfect forecast could save considerable money and resources, as long as it's the right problem to work on," observes Yeoman.
Experimenting in these ways is "data heavy", explains Yeoman, so Tesco employs Teradata partner Fuzzy Logix to run its in-database analytics functions, and uses MathWorks' MATLAB environment for some serious data manipulation.
"The data in here is basically all our sales and forecasting - the end to end supply chain," explains Yeoman.
Tesco also employs a team of analysts.
"Of the 70 people in supply chain transformation, 50 are analysts we've recruited from top universities with backgrounds in sciences. They perform functions like data discovery - they're a keen bunch," says Yeoman.
"We give them access to everything here, and training in it, and they're curious about finding things to make Tesco work a bit better. It also helps with insight generation as well, as they'll be working on more than one project at once."
The analysts also work on modelling and simulation, using day-to-day data to "tweak" processes and improve weekly trade. Simulations are of particular importance to Tesco, especially when modelled on real data, of which there is of course an abundance.
Easter and Christmas trading models are a particular focus when modelling new simulations, in order to make sure even the smallest changes don't impact these key shopping periods.
"In a company the size of Tesco, if we design some new logic and there's a problem that happens less than half a percent of a time, but we're delivering 36 million cases (of product) a week, [an error] is going to happen several times a day," explains Yeoman.
"So if that [glitch is], say, a hundred times more of one product than usual, you give a supplier a headache trying to fulfil an order that doesn't even make sense, and if it happens to one store, that store will be in for one of the worst weeks in their career.
"So simulation lets us find all those weird edge cases and fix those before they make a real impact," he says.
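The scale argument is easy to check with some back-of-the-envelope arithmetic. The sketch below takes the 36 million cases a week from Yeoman's quote; the assumptions that every case is equally likely to be affected and that deliveries are spread evenly over seven days are ours, as is the "one in a million" rate used for comparison.

```python
CASES_PER_WEEK = 36_000_000  # figure quoted by Yeoman

def expected_faults(fault_rate, cases_per_week=CASES_PER_WEEK):
    """Return (per week, per day) expected faulty cases for a given fault rate."""
    per_week = fault_rate * cases_per_week
    return per_week, per_week / 7  # assumes an even seven-day spread

# 0.5% is the upper bound in the quote; one in a million is an assumed, far rarer fault.
for label, rate in [("0.5%", 0.005), ("one in a million", 1e-6)]:
    per_week, per_day = expected_faults(rate)
    print(f"{label}: {per_week:,.0f} per week, about {per_day:,.0f} per day")
```

Even the one-in-a-million case works out at 36 faulty cases a week, or roughly five a day, which is why rare edge cases matter at this volume.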
Transaction data, meanwhile, is another way for Tesco to find out which items are selling, and whether to stock more or less of them.
"The insight is basically: if something is moving through the till, then the customer must have been able to find it recently and if it's not going through the till, then maybe there's something wrong," says Yeoman, who adds that CCTV is also utilised as another avenue of data to track the movement of items on the shop floor.
One example involves the movement of Braeburn Apples.
"If there's regular sales of the apples through the day until 3.30pm, and then no more, you ask why," says Yeoman.
"So if we look through transaction data for the rest of the store, and we see the rest of the store is selling fine you can also look at that across the other stores, and see that people DO buy apples after 3.30 in the afternoon.
"After that - with some statistical analysis, we can assign a likelihood that the product is off sale [rather than unpopular] and give ourselves an availability number from that transaction data."
In other words, the team learns that the store in question probably needs more apples, while confirming that stocking apples through mid-afternoon is generally a good idea, as demand exists.
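Yeoman does not say which statistical analysis is used, but one simple way to turn a gap in sales into a likelihood is sketched below. It assumes sales arrive as a Poisson process at a rate estimated from comparable stores, that an off-sale product records no sales at all, and that a prior chance of being off sale (5 per cent here) is known. All three are illustrative assumptions, and the function names are hypothetical.

```python
from math import exp

def prob_zero_sales(hourly_rate, hours):
    """Poisson probability of no sales in `hours` if the product is on the shelf."""
    return exp(-hourly_rate * hours)

def prob_off_sale(hourly_rate, hours_without_sale, prior_off_sale=0.05):
    """Bayes' rule: chance the product is off sale, given no sales were seen.

    Assumes an off-sale product always records zero sales.
    """
    p_gap_if_on_sale = prob_zero_sales(hourly_rate, hours_without_sale)
    evidence = prior_off_sale + (1 - prior_off_sale) * p_gap_if_on_sale
    return prior_off_sale / evidence
```

If Braeburns normally sell at an assumed six an hour, `prob_off_sale(6, 2)` shows that two silent hours make an empty shelf far more plausible than a lull in demand, whereas a few quiet minutes barely move the figure from the prior.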
"At this point, we haven't really impacted the customer at all," says Yeoman.
"So the data is rolled out through the stores - to all 3000 who have Braeburn Apples, which shows where off-sale products are going, so they can be replenished [at the original store]. Other teams in the head office can also use this transaction to spot other trends."
As proof of concept, Yeoman points out that Christmas 2015 "broke all records" off the back of this idea for availability measurement.
"There are also now 40 per cent less customers saying they're disappointed since the new measure came in," adds Yeoman.
Another analytics-led process change is what Yeoman calls "targeted counts", which has replaced the previous method of staff simply walking round the store counting stock and checking that it matches what's on record.
"They'd go up and see what the system said about how many cucumbers there were. If it said 60, they'd actually have to physically count 60 cucumbers, and confirm that. But a lot of the time they were mostly just confirming what was in the system," says Yeoman.
"We thought we could maybe find products that only had stock errors, so we had a think about what events might trigger stock record to be [inaccurate] - it's down to availability, or waste basically."
Comparing the "size" of the stock error event to when it happened, Yeoman's team were able to produce a percentage chance of when a product may subsequently produce a stocking error, combine it with products that were in most demand, and in doing so lessen the work the employee needed to do in terms of counting individual vegetables.
"So now, if the number given by the analytics is 40, only 40 need to be counted, which makes the staff member's availability better for the customer, as well as being more accurate. It's a win-win," says Yeoman.
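As a rough illustration of how a targeted count list might be drawn up, the sketch below ranks products by the chance that their stock record is wrong multiplied by how much demand they see. The multiplication, the field names and the figures are all assumptions for the sake of the example, not a description of Tesco's scoring.

```python
def count_priority(error_probability, weekly_demand):
    """Score a product: likely record errors on high-demand lines come first."""
    return error_probability * weekly_demand

def targeted_count_list(products, max_items):
    """Return the names of the `max_items` products most worth counting."""
    ranked = sorted(
        products,
        key=lambda p: count_priority(p["error_probability"], p["weekly_demand"]),
        reverse=True,
    )
    return [p["name"] for p in ranked[:max_items]]

# Hypothetical inputs: error probabilities and demand are invented.
products = [
    {"name": "cucumbers", "error_probability": 0.30, "weekly_demand": 400},
    {"name": "limes", "error_probability": 0.50, "weekly_demand": 60},
    {"name": "bananas", "error_probability": 0.05, "weekly_demand": 900},
    {"name": "snacking pots", "error_probability": 0.02, "weekly_demand": 150},
]
```

Calling `targeted_count_list(products, 2)` on this invented data sends the staff member to the cucumbers and bananas, and leaves the rest of the aisle uncounted.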
It seems almost hard to believe that such minutiae surround the simple process of selling fruit and vegetables, but Yeoman's team is determined to leave no stone unturned in using big data analytics to produce a grocery aisle for the ages.