DevHub North - May 2026
Seriös Group - Data Engineering for the AI Era - Why metadata-driven DataOps is suddenly the most exciting job in tech - Tom Knight & Jake Wedge
Introduction
So, why are we all here? AI will transform your business, but AI is only as smart as the data that sits beneath it. Right now, most data estates are completely illegible to it. You need good data for an AI initiative to succeed, and that means good metadata. Metadata is the information that describes data: you don't need eight-character field names or dozens of stored procedures, but you do need strong metadata around your datasets.
Metadata is the language AI speaks
Metadata is the label on the jar, not what's in the jar - it is what something means and where it came from. LLMs love metadata: you can point an LLM at a database, but it won't really know what the data actually means. Even a column with a good name can convey more than the raw data could.
What do we mean by metadata config? This is JSON config where each entity has its own config and each follows the same shape. It can declare modelling patterns such as Data Vault, the pipelines that track changes, and other rules such as data quality rules, column mappings and business rules. Documentation isn't a favourite task of engineers, but with these configs you can self-document your entire data platform.
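To make the idea of a same-shaped entity config concrete, here is a minimal sketch. The talk did not show the actual schema, so every key name, the `customer` entity and the `load_entity` and `describe` helpers are assumptions for illustration only.

```python
import json

# Hypothetical entity config: all keys and values are illustrative only.
CUSTOMER_CONFIG = """
{
  "entity": "customer",
  "modelling_pattern": "data_vault",
  "primary_key": ["customer_id"],
  "column_mappings": {"CUST_NM": "customer_name", "CUST_EML": "email"},
  "data_quality_rules": [{"column": "email", "rule": "not_null"}],
  "business_rules": ["An active customer has ordered in the last 12 months"]
}
"""

# Every entity is expected to carry the same top-level keys (assumed shape).
REQUIRED_KEYS = {
    "entity", "modelling_pattern", "primary_key",
    "column_mappings", "data_quality_rules", "business_rules",
}


def load_entity(raw: str) -> dict:
    """Parse one entity config and check it follows the shared shape."""
    config = json.loads(raw)
    missing = REQUIRED_KEYS - set(config)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return config


def describe(config: dict) -> str:
    """Self-documentation: render a plain-text summary from the config."""
    lines = [f"Entity: {config['entity']} ({config['modelling_pattern']})"]
    lines.append("Primary key: " + ", ".join(config["primary_key"]))
    for source, target in config["column_mappings"].items():
        lines.append(f"  {source} -> {target}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(describe(load_entity(CUSTOMER_CONFIG)))
```

Because the documentation is rendered from the same config the platform reads, the two are less likely to drift apart.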
The approach is fully DRY and combines the best of deterministic and non-deterministic flows: config is created and then executed by fully hardened, human-reviewed code. For example, they have an SCD Type 2 framework that only needs to know the primary key and the values to track for changes, without needing code written for each transformation.
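As an illustration of how a single generic SCD Type 2 routine can be driven purely by a primary key and a list of tracked columns, here is a small in-memory sketch. It is not the Seriös framework: the function name `apply_scd2` and the `valid_from`, `valid_to` and `is_current` column names are assumptions, and deletes are not handled.

```python
from datetime import date


def apply_scd2(history, incoming, primary_key, tracked_columns, load_date):
    """Return a new history list with SCD Type 2 changes applied.

    history: list of dict rows carrying valid_from, valid_to, is_current.
    incoming: list of dict rows from the latest extract.
    primary_key, tracked_columns: column names, as read from metadata.
    """
    def key_of(row):
        return tuple(row[column] for column in primary_key)

    result = [dict(row) for row in history]  # copy, so the input is not mutated
    current = {key_of(row): row for row in result if row["is_current"]}

    for row in incoming:
        existing = current.get(key_of(row))
        if existing is not None and all(
            existing[column] == row[column] for column in tracked_columns
        ):
            continue  # no tracked value changed, keep the current version
        if existing is not None:
            # Close the old version before opening the new one.
            existing["valid_to"] = load_date
            existing["is_current"] = False
        new_version = {**row, "valid_from": load_date,
                       "valid_to": None, "is_current": True}
        result.append(new_version)
        current[key_of(row)] = new_version
    return result


if __name__ == "__main__":
    history = [{"customer_id": 1, "city": "Leeds", "valid_from": date(2025, 1, 1),
                "valid_to": None, "is_current": True}]
    incoming = [{"customer_id": 1, "city": "York"},
                {"customer_id": 2, "city": "Hull"}]
    for row in apply_scd2(history, incoming, ["customer_id"], ["city"],
                          date(2026, 5, 1)):
        print(row)
```

The same function serves any entity, since only the two metadata lists change between entities.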
What do metadata + AI unlock? A semantic layer is not a new thing in the AI world: it gives you one source of meaning. Marketing's active customer and finance's active customer are not the same number, and they never have been. The fix is to declare every metric, dimension and business rule once in metadata and let every tool, dashboard and AI agent inherit it.
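A toy sketch of the "declare once, inherit everywhere" idea follows. The 90-day definition of an active customer, the `METRICS` registry and both consumer functions are invented for illustration and were not part of the talk.

```python
from datetime import date, timedelta

# Hypothetical single declaration of the metric; the 90-day window is an assumption.
METRICS = {
    "active_customers": {
        "description": "Customers with at least one order in the last 90 days",
        "window_days": 90,
    }
}


def active_customers(orders, as_of):
    """Count distinct customers using the one shared definition."""
    window = timedelta(days=METRICS["active_customers"]["window_days"])
    return len({
        order["customer_id"]
        for order in orders
        if as_of - window <= order["order_date"] <= as_of
    })


def marketing_dashboard(orders, as_of):
    # Marketing inherits the metric rather than redefining it.
    return {"active_customers": active_customers(orders, as_of)}


def finance_report(orders, as_of):
    # Finance inherits the same metric, so the numbers cannot diverge.
    return {"active_customers": active_customers(orders, as_of)}


if __name__ == "__main__":
    orders = [
        {"customer_id": 1, "order_date": date(2026, 4, 20)},
        {"customer_id": 2, "order_date": date(2025, 11, 1)},
        {"customer_id": 1, "order_date": date(2026, 3, 2)},
    ]
    today = date(2026, 5, 1)
    print(marketing_dashboard(orders, today), finance_report(orders, today))
```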
One by Seriös Group
The engine that drives the metadata has been built over the last two years as Seriös ONE Launchpad, which can deliver a data platform in eight weeks: data is taken from source and processed in the framework for use cases such as reporting, AI / LLM and more. Data lands in a data lake, and metadata files are created that take the raw data and bring the business relationships together, using functions that do all the work without needing to write multiple pipelines. The engine takes raw data into Data Vault format, which can then be used in a Kimball model at the end of processing.
Data quality rules can go at the end of any metadata file, and documentation can be automated, with lineage diagrams across Bronze, Silver and Gold and even generated entity overviews or entity relationship diagrams. When designing a system like this you would start with a smaller flow: design a metadata approach that is read by an external, tested library, with AI helping to create the metadata. AI can also help to understand and write documentation, because it can draw on the business context stored in the metadata.
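Here is a minimal sketch of how data quality rules declared in a metadata file might be evaluated by generic, tested code. The rule names `not_null` and `unique`, the rule dictionary layout and the `run_quality_rules` function are assumptions, not the product's actual rule set.

```python
def not_null(rows, column):
    """Return the indexes of rows where the column is missing or None."""
    return [i for i, row in enumerate(rows) if row.get(column) is None]


def unique(rows, column):
    """Return the indexes of rows that repeat an earlier value."""
    seen, duplicates = set(), []
    for i, row in enumerate(rows):
        value = row.get(column)
        if value in seen:
            duplicates.append(i)
        seen.add(value)
    return duplicates


# Registry mapping rule names used in metadata to deterministic checks.
CHECKS = {"not_null": not_null, "unique": unique}


def run_quality_rules(rows, rules):
    """Apply each declared rule and collect the failing row indexes."""
    failures = {}
    for rule in rules:
        bad_rows = CHECKS[rule["rule"]](rows, rule["column"])
        if bad_rows:
            failures[f"{rule['rule']}({rule['column']})"] = bad_rows
    return failures


if __name__ == "__main__":
    # Hypothetical rules, as they might appear at the end of a metadata file.
    rules = [{"column": "email", "rule": "not_null"},
             {"column": "id", "rule": "unique"}]
    rows = [{"id": 1, "email": "a@example.com"}, {"id": 1, "email": None}]
    print(run_quality_rules(rows, rules))
```

Adding a rule to an entity then means adding one line of metadata rather than writing another pipeline step.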
Metadata is the real work of the AI era. AI doesn't fail because models aren't clever enough but because data estates don't explain themselves, so use metadata to enhance your datasets and let AI understand your estate and anything in it.
Questions
I asked about having an open standard for metadata. This would not be beyond the realms of possibility, and they could open-source something in this area.
Someone else in the audience asked where you can get started. There are some blog posts, and the best advice is to start small.
They talked about the approach they are taking: all their internal documentation for schemas and processes goes into a knowledge base, so the metadata schemas and limitations are understood and the LLM can reason with itself. This works better with a system prompt, but the LLM could still ignore something, so they augment it with separate tools, including a validator that is not AI. The AI passes its output to the validator, and the validator reports where it has gone wrong.
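The generate-then-validate loop described here can be sketched as follows. This is an assumption-laden illustration: the `validate_config` rules, the `refine` loop and the config keys are hypothetical, and the language model is replaced by a plain function so the example runs without any external service.

```python
def validate_config(config):
    """Deterministic, non-AI validator: return a list of error messages."""
    errors = []
    entity = config.get("entity")
    if not isinstance(entity, str) or not entity:
        errors.append("'entity' must be a non-empty string")
    columns = config.get("columns", [])
    if not columns:
        errors.append("'columns' must list at least one column")
    for key in config.get("primary_key", []):
        if key not in columns:
            errors.append(f"primary key '{key}' is not in 'columns'")
    if not config.get("primary_key"):
        errors.append("'primary_key' must name at least one column")
    return errors


def refine(generate, max_attempts=3):
    """Ask the generator for a config, feeding validator errors back each time."""
    feedback = []
    for attempt in range(1, max_attempts + 1):
        candidate = generate(feedback)
        feedback = validate_config(candidate)
        if not feedback:
            return candidate, attempt
    raise ValueError(f"still invalid after {max_attempts} attempts: {feedback}")


def fake_llm(feedback):
    """Stand-in for a model call: fixes its draft once it receives feedback."""
    draft = {"entity": "customer",
             "columns": ["customer_id", "email"],
             "primary_key": ["id"]}
    if feedback:
        draft["primary_key"] = ["customer_id"]
    return draft


if __name__ == "__main__":
    config, attempts = refine(fake_llm)
    print(attempts, config)
```

The point of the design is that the validator's verdict is reproducible, so the model cannot talk its way past a rule it chose to ignore.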
Another question was: if you were going to a client with just a database full of stuff, what is the approach when they don't have the metadata? They would do data profiling with an LLM and look at the structure of the database, alongside conversations with the people who hold the business context that isn't in the database. So it is a manual approach to designing the system, with the aid of AI, to build the groundwork for the metadata.
Someone asked how much metadata is needed to create a domain-specific LLM. At the moment they are experimenting with this using a balanced profiling approach. These databases have billions of rows, so they don't feed in row data but look at column averages and trends within the data. The AI doesn't need to read the whole dataset, only information about the data.
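The idea of feeding an LLM information about the data rather than the rows themselves can be sketched with a small column profiler. The function names and the choice of statistics are assumptions. At the scale mentioned in the answer these aggregates would more likely be computed inside the database, and this in-memory version only illustrates the shape of the summary.

```python
from statistics import mean


def profile_column(name, values):
    """Summarise one column without exposing any individual row."""
    present = [value for value in values if value is not None]
    summary = {
        "column": name,
        "rows": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(set(present)),
    }
    # Numeric columns also get range and average (bool is excluded on purpose).
    if present and all(
        isinstance(value, (int, float)) and not isinstance(value, bool)
        for value in present
    ):
        summary.update(min=min(present), max=max(present), mean=mean(present))
    return summary


def profile_table(rows):
    """Profile every column found in a list of dict rows."""
    columns = sorted({column for row in rows for column in row})
    return [profile_column(column, [row.get(column) for row in rows])
            for column in columns]


if __name__ == "__main__":
    rows = [
        {"amount": 10, "country": "UK"},
        {"amount": 30, "country": "UK"},
        {"amount": None, "country": "FR"},
    ]
    for summary in profile_table(rows):
        print(summary)
```

The resulting summaries are a few lines per column regardless of row count, which is what keeps the prompt small.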
Tommy asked how to work around confidential data and GDPR. An organisation will have an AI policy. They don't use their own LLM or insist on a particular LLM, so the organisation will use AI Foundry or similar. Beyond that it is taking an old-fashioned approach, such as asking whether the data contains Personally Identifiable Information or not.
Another question was how software developers can take advantage of being able to use huge data across multiple platforms. This is not just firing prompts but using agentic AI, with multiple agents that have a specific set of rules of engagement; they have a CLI and a custom VS Code plugin that they can use.
Another statement from the audience was that customers are obsessed with Copilot, which is very prevalent, so you need the agentic rules of engagement and capability in the background to work with these.
Another asked about observability of token consumption. This is monitored. Some organisations want you to spend more tokens, but they believe in outcomes and what you get out of it. Once you have the metadata you aren't consuming any more tokens, and you aren't sending a billion rows to an LLM.
The final question was whether they have worked with government organisations. They don't have any use cases for this sector.
Seriös Group - Panel Discussion - Clair Hillier (Head of Professional Service), Steve Shaw (Principal Data Solutions Engineer), Jake Wedge (Senior Data Engineer), Mohammad Reza Sharifi / Mo (Data Solutions Engineer) & Chaired by Megan Newby (People Experience Advisor)
Main Questions
Megan asked what one assumption about agentic AI the panel would want to challenge. Steve said the assumption is that it is needed: sometimes something more primitive is more suitable, and even an LLM can be far beyond what the situation calls for, so they would like to see agentic AI downplayed and treated with more pragmatism. For Jake, people look at agentic AI and say it stops people thinking, but you can let it teach and plan, which can accelerate your learning, as long as you use it properly. Mo said their personal challenge is over-trust: agentic AI can be helpful, but you need to have a second thought about whether the outcome is correct and not just accept the output. Jake added that you always have to validate the output, as it can seem correct when it hasn't been validated and could have done something wrong.
Question for Mo: what advice would you give to people at this stage in their career to stay relevant? They could talk about this for hours. In this era of AI, with the big players pushing hard, they would recommend being agile and adaptable: learn things quickly and champion learning cloud. They did two master's degrees and felt ready for a job, but once in a job they needed to upskill in cloud literacy, as you build things actively in the cloud and need to be able to deliver an enterprise solution. Networking events are important too: be out there, and build things on open platforms such as GitHub to showcase your talent.
Question for Jake, who has played a key part in the team: Seriös started six years ago, but did they expect their career to go in this direction? They started five years ago and had been hand-rolling data platforms; they had some SQL experience and learned about the traditional data platform. A few years in, AI started stirring, they learned AI in their own time and their day-to-day development changed course. With AI you become a sort of manager for the AI and direct it in the same way as junior consultants: they talk to these AIs to build things in a certain way and build up knowledge of how to use them best. They review what the AI did and see what it did wrong, and it feels like managing a group of AI agents rather than doing the development.
Question for Jake: how have they seen the rate of change and the landscape shift at a more micro level in the engineering space? In the first couple of years, before AI, they were investigating IoT, but that was nothing like the rate of change in AI, where a new model comes out every month that leapfrogs the previous one, and the ways custom agents can be integrated are not something they had seen before.
Question for Clair: where do you see the impact of AI on delivery management? Delivery managers don't want to concentrate on keeping documents updated. Those things are important for avoiding mistakes, but they would rather spend their time using their skills to solve problems. Meetings still need to be absorbed and digested, and being able to take away things such as how to mitigate risks and possible alternative routes has been really useful. One priority is making sure teams use AI responsibly and within the remit of how they need to work. It is taking away the mundane so people can focus on the skills they want to use.
Back to data engineering: how do you see your roles going forward? Data engineers might turn into managers looking after little bots, so how do you see the work changing? Clair would like to come in and find the AI bots have done the work. Steve mentioned that their line manager sees the role of principal engineer as overseeing and monitoring the output of agents. Jake said the same: it depends on how AI develops at the current rate, but there will always be a need for human input. AI goes into reasoning loops, but humans will always be needed to provide business context. Mo said AI has definitely disrupted the market: it helps you become more productive, pushes you to learn things quickly and motivates you to upskill, and they are optimistic about how productive they can be and how easily they can do things. Clair said it is easier to get budgets signed off with AI: data needs to be in a decent state, and you wouldn't have got budget for that in the past, but now you need to get data into the right state.
What advice would you give to yourself when starting your career? Steve said they think about this often, as they have two young boys starting out and looking at future careers. The first thing is to focus on the fundamentals: they have seen people in both junior and senior roles who lack core fundamentals, so the base foundation is important. Look at the things that have been around for decades and are still around today. Also look at the periphery of your estate: understand where you are and how you can work with the rest of your business. The difference between a good engineer and a great engineer is being able to say something isn't a good idea - they would waste months on things that weren't good - so learn to ask the right questions, challenge a brief and take on a bit more.
Audience Questions
An audience member asked Steve if they still like their job. They are still solving interesting problems, though the mechanisms are different. They want to deeply understand why something works the way it does, and they still like solving problems; the problems are just in a different location. Jake said that solving a problem in code used to feel good, but that is no longer the job. It is now more architectural: directing agents in a certain way to a certain spec and knowing how to connect one thing to another. The agents don't have the larger context window of a whole brain, so there are still problems to solve; they are just problems in a different form.
An audience member asked Steve to say more about fundamentals. They are in the data engineering space, and the fundamentals are those core concepts that have been around for forty, fifty and sixty years, such as Kimball models and the atomicity of things. When they joined they had heard little about Data Vault, but it has transformed how they look at data structures and data warehousing. Mo mentioned thinking about things from scratch, even coming from a data science master's at university: data literacy, GDPR and the security layers in a data system are fundamentals, and for AI there are a lot of core concepts beyond just ChatGPT and Claude.
An audience member asked Clair: if you can do more in less time, what does that mean? It means they can deliver more for customers, look at the amount of time it takes to ingest data, make sure everything is tested and help take risk out of a project.
Alex, their CTO, mentioned they are not like a traditional agency: they can add more value for customers in their engagements, either doing the same amount of work in less time or more work in the same time. It is a big opportunity to help customers get value from their data in their AI solutions. Clair added that knowledge is not being lost, that it is important for the data to be good, and that they spend less time on the drudgery and more time on solving problems.
An audience member asked what qualities they look for. Critical thinking and the ability to communicate ideas to non-technical stakeholders are key. AI is a great leveller, so the knowledge in your head matters less, but the ability to think critically is important. Clair mentioned that they don't really have those backroom nerds who don't mix with customers: all of their people get involved with clients, can work in the outside world and work together, so recruitment is important for getting the right team mix. If someone has the right attitude they can be taught the skills, but if they have the skills and not the right attitude, that is something you can't fix.
An audience member asked about challenges with AI. Steve said everyone knows they need to do something with data, and many have spent money on this or have built or used platforms that are difficult to maintain. They want to deliver value that is closely tied to the business, so the challenge is how they sell themselves to the business and embed this within the organisation. Clair mentioned that once data is in a data lake or data warehouse, customers see what they can do with it and ask for more and more, so scope can be a challenge, but they can get customers on that journey really fast and projects can just keep on going.
An audience member asked about cost: they had something that ran for an hour but then ran out of tokens, and noted that providers are running at a loss, so how do they manage costs and where do they see that cost going? Steve said this is not a huge problem for them. In their experience, burning through credits can't affect other usage. They use AI for the mundane and for metadata, to understand things about the metadata. Most tools have enterprise features around burndown rate and give a sense of what something is going to do before it executes. Many clients understand what they are doing but not the cost impact of it. Jake said their business has two types of spend: developer spend on developing the product, where they get value, and the AI used to interface with data, where they use smaller metadata and have significant guardrails and prompts to restrict things where needed.
The final question was how they manage the environmental impact. Alex, their CTO, said there is not a lot of transparency about the impact, but they have committed in their AI policy to consider it as this becomes more transparent, and they have a carbon impact assessment. Their usage is not massive from a cloud consultancy point of view, and with cloud, if you reduce your spend you reduce your carbon output, so they stick to this as a principle, which is in their AI policy and AI impact assessments.