A matter of scale: How this World Heritage site is getting a handle on big data
'In two years it will be 45 million rows, easily'
The Blenheim Estate had to find a solution to store and analyse the data from its sensor network, which stretches across West Oxfordshire.
Big data needs big infrastructure, especially when the physical estate you're responsible for is, not to put too fine a point on it...big.
The Blenheim Estate in Woodstock, Oxfordshire covers 12,000 acres – seven at the Palace site alone – and its holdings stretch even further.
The estate's growing sensor network, established and run by the Innovation team, generates massive amounts of data on everything from air quality to lichen growth. Head of innovation David Green is using the network to build a digital twin of the site, which will have a huge range of benefits – if the team can get the naming right.
"The digital twin database doesn't have to be a full 3D spinny, whizzy model – though we've got one of those. It's more about the assets of the building: the naming conventions, the structure.
"We've nearly finalised the sort of data schema we think is going to work for us, in terms of holding the naming convention: you know, the doorknob that relates to the door that relates to the wall that relates to the elevation that relates to the structure, etc, etc.
"That sort of hierarchy of information is something we've been working on. There's an ISO standard for things like naming, but no one seems to have really nailed it."
The naming is important because it not only tells you exactly what an object is, but where it's supposed to be.
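As a rough illustration of that idea, the doorknob-to-structure chain David describes can be sketched as a tree in which each asset points at its parent, so the full name also records the location. This is a minimal sketch, not Blenheim's actual schema: the `Asset` class, the `/` separator and every asset name below are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Asset:
    """One node in a hypothetical naming hierarchy."""
    name: str
    parent: Optional["Asset"] = None

    def path(self) -> str:
        """Build the full name by walking from this asset up to the root."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        # Reverse so the root (the structure) comes first.
        return "/".join(reversed(parts))


# Illustrative names only: structure > elevation > wall > door > doorknob.
structure = Asset("palace")
elevation = Asset("north-elevation", structure)
wall = Asset("wall-03", elevation)
door = Asset("door-01", wall)
doorknob = Asset("doorknob", door)

print(doorknob.path())  # palace/north-elevation/wall-03/door-01/doorknob
```

Because the name is derived from the chain of parents, an asset filed under a different wall gets a different full name without any separate bookkeeping.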
"I want to be able to pinpoint that monitoring location in the Palace, let's say. That monitoring location has attributes... By pinpointing that location, I can tell you the history of that stone. What work was carried out on that stone, what type of algae or lichen do we have on that stone? Which way does it face? How much rainfall does it have? What's the surface temperature of that stone?
"Then you bring that big data piece over the top of it as well, and suddenly you're building up that sort of normalised and structured data set that says, 'I can tell you everything that's happened here.'"
Storing all that data requires a bit more than a simple Excel spreadsheet. David says, "I need to see how it will join up in a structure in a normalised, scalable, hierarchical, relational database. That's not always easy, but I think we've got there."
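One way to picture a "normalised, scalable, hierarchical, relational" structure is a table of assets that refers to itself for the hierarchy, plus a separate table of readings keyed to an asset. The sketch below uses Python's built-in `sqlite3` module purely for illustration (the estate's data lives in Azure). The table names, column names, metric names and sample values are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE asset (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES asset(id)  -- NULL for the root asset
);
CREATE TABLE reading (
    id          INTEGER PRIMARY KEY,
    asset_id    INTEGER NOT NULL REFERENCES asset(id),
    metric      TEXT NOT NULL,
    value       REAL NOT NULL,
    recorded_at TEXT NOT NULL
);
""")

# Made-up sample rows: a stone on an elevation of the Palace.
conn.executemany(
    "INSERT INTO asset (id, name, parent_id) VALUES (?, ?, ?)",
    [(1, "palace", None), (2, "north-elevation", 1), (3, "stone-0042", 2)],
)
conn.executemany(
    "INSERT INTO reading (asset_id, metric, value, recorded_at) VALUES (?, ?, ?, ?)",
    [
        (3, "surface_temp_c", 11.5, "2024-01-01T09:00:00Z"),
        (3, "rainfall_mm", 2.0, "2024-01-01T09:00:00Z"),
    ],
)


def asset_path(conn, asset_id):
    """Walk up the parent links with a recursive query to get the full name."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id, name, parent_id, depth) AS (
            SELECT id, name, parent_id, 0 FROM asset WHERE id = ?
            UNION ALL
            SELECT a.id, a.name, a.parent_id, chain.depth + 1
            FROM asset a JOIN chain ON a.id = chain.parent_id
        )
        SELECT name FROM chain ORDER BY depth DESC
        """,
        (asset_id,),
    ).fetchall()
    return "/".join(name for (name,) in rows)


def history(conn, asset_id):
    """Return every (metric, value) pair recorded against one asset."""
    return conn.execute(
        "SELECT metric, value FROM reading WHERE asset_id = ? ORDER BY metric",
        (asset_id,),
    ).fetchall()


print(asset_path(conn, 3))  # palace/north-elevation/stone-0042
print(history(conn, 3))     # [('rainfall_mm', 2.0), ('surface_temp_c', 11.5)]
```

Keeping readings in their own table means a new metric or a new sensor adds rows, not columns, which is what lets this kind of structure keep growing.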
To get information to the main data hub, the sensors around the Blenheim estate transmit back to a receiver on the Palace roof. The data is then encoded and sent to the database, where it's decoded and sorted into data sets, mapped against locations, "and suddenly I've got this incredibly robust data set, which in two years will be 45 million rows, easily."
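The encode, decode, sort and map steps can be sketched roughly as follows. The article does not describe the real payload format, so the five-byte layout, the device IDs and the location names here are hypothetical, chosen only to show the shape of the flow.

```python
import struct
from collections import defaultdict

# Hypothetical payload layout (big-endian, 5 bytes in total):
#   device id (unsigned 16-bit), temperature in tenths of a degree C
#   (signed 16-bit), relative humidity in percent (unsigned 8-bit).
PAYLOAD = struct.Struct(">HhB")

# Hypothetical mapping from device id to a named monitoring location.
DEVICE_LOCATIONS = {
    101: "palace/north-elevation",
    202: "estate/lake",
}


def encode(device_id, temp_c, humidity_pct):
    """Pack one reading into the compact binary form a sensor might send."""
    return PAYLOAD.pack(device_id, round(temp_c * 10), humidity_pct)


def decode(payload):
    """Unpack a payload and attach the location the device is mapped to."""
    device_id, temp_tenths, humidity_pct = PAYLOAD.unpack(payload)
    return {
        "device_id": device_id,
        "temp_c": temp_tenths / 10,
        "humidity_pct": humidity_pct,
        "location": DEVICE_LOCATIONS.get(device_id, "unknown"),
    }


def sort_into_datasets(payloads):
    """Decode each payload and group the rows by location."""
    datasets = defaultdict(list)
    for payload in payloads:
        row = decode(payload)
        datasets[row["location"]].append(row)
    return dict(datasets)


incoming = [encode(101, 11.5, 80), encode(202, 9.0, 95), encode(101, 12.0, 78)]
datasets = sort_into_datasets(incoming)
print(sorted(datasets))                        # ['estate/lake', 'palace/north-elevation']
print(len(datasets["palace/north-elevation"]))  # 2
```

Small fixed-size payloads like this suit low-power sensor radios, and every decoded reading becomes one more row in the growing data set.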
Where is that data hub?
Cloud, of course
Trying to run this sort of data set on-prem would be a mammoth undertaking for a World Heritage site. Luckily, the cloud is here to help.
"It's all in Azure at the moment... I use the Things Network. A combination of the Things Network for the IoT stuff, which is basically the local network server, and it also takes the information from the sensor, but then I bring that into Azure."
Despite making heavy use of Azure, the team has made sure it keeps track of data ownership – something David says is "really important."
"One thing I've been caught with before is you use other platforms, and you get tied into these platforms, and you can't always get them to work as you want them to work."
Solutionising is a common issue in IT, especially among teams that lack experience or a leader who understands technology. It means organisations try to buy their way out of problems and end up with a confusing, jumbled mesh of tools.
David credits his experience (he has worked at Blenheim for more than a decade, and understands both technology and World Heritage sites) with helping him avoid solutionising.
"[My experience] means it's easier when I'm working on technical projects with external contractors who don't understand the Gift Aid process, or don't understand the complexities around ticketing and ticketing upgrades and things. They try and fit you into their solution.
"What we can do is actually mould a solution for a World Heritage site that will fit every other World Heritage site... The technology is kind of separate to the problem, in some respects."
So, how does Blenheim use the big data (from a big sensor network, stored on big infrastructure...) it collects?
The estate works closely with academia, sharing its data sets for analysis and interpretation. Working together, using AI, they look for patterns across buildings, lakes and woodlands to predict failure and protect this world-famous site.
"I'm building, I hope, robust data sets to allow people like Tawhid [Shahrior] and other data scientists to learn more about this natural environment."