We undertook a targeted Discovery phase to understand the full complexity of the technical, IG and governance challenges we would have to overcome to deliver a population health platform for the ICS.
What was in scope?
Integrations
We want to bring in clean, structured, reliable data from each of the source systems across the ICS. This needs to be automated via APIs, initially on an overnight frequency, with separate pulls for ‘new’ and ‘updated’ records.
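A minimal sketch of how an overnight run might separate ‘new’ from ‘updated’ records. The field names `created_at` and `updated_at`, the `split_changes` helper and the sample records are assumptions for illustration only; real source systems will expose change information in their own way.

```python
from datetime import datetime, timezone


def split_changes(records, last_run):
    """Split source records into 'new' and 'updated' since the last overnight run."""
    new, updated = [], []
    for rec in records:
        if rec["created_at"] > last_run:
            new.append(rec)        # first seen after the previous run
        elif rec["updated_at"] > last_run:
            updated.append(rec)    # existed before, changed since
    return new, updated


def utc(*args):
    """Build a timezone-aware UTC timestamp (helper for the sample data)."""
    return datetime(*args, tzinfo=timezone.utc)


# Hypothetical sample data: one new, one updated, one unchanged record.
last_run = utc(2025, 1, 1, 2, 0)
records = [
    {"id": "a", "created_at": utc(2025, 1, 1, 9, 0), "updated_at": utc(2025, 1, 1, 9, 0)},
    {"id": "b", "created_at": utc(2024, 12, 1, 9, 0), "updated_at": utc(2025, 1, 1, 15, 0)},
    {"id": "c", "created_at": utc(2024, 11, 1, 9, 0), "updated_at": utc(2024, 11, 2, 9, 0)},
]

new, updated = split_changes(records, last_run)
print([r["id"] for r in new], [r["id"] for r in updated])
```

Keeping the two lists separate lets the platform insert new records and apply changes to existing ones as distinct steps.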
Standards
Different systems record attributes like height, weight, ethnicity and address details in different ways, so over the 10 weeks we have agreed clear standards to make sure that when data comes together, it tells one coherent story. By bringing this consistency, we make it possible to compare like-for-like across Somerset.
The following two principles will guide the LDP’s approach to terminology (and hierarchy):
- Flexibility - support configurable data structures to meet diverse user needs.
- Alignment with Standards - align with national standards and data dictionaries where applicable, while allowing for custom configurations.
The tests below set out how we approached terminology and hierarchy.
Understanding location
This test provided a consistent method for mapping resident postcodes and locations to NHS geographies.
Our aim is to consistently link postcodes to appropriate geography layers—GP Practice, PCN, ICB, Region, and local structures such as Neighbourhood, Locality, and Place. This enables us to analyse inequalities, deprivation, and service accessibility while supporting commissioning and delivery functions.
NHS geography operates from two distinct perspectives: where people live (residence path) and where they are registered (registration path). We model these as two parallel hierarchies, connected only by the NHS Number at patient level. This separation prevents confusion between population health analysis and service delivery analysis.
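The two-path idea can be sketched as two separate lookups that share only the NHS Number as a key. Which geography layers sit on which path, the class names and all values below are assumptions for illustration, not the agreed model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResidencePath:
    """Where a person lives (assumed layers, for illustration)."""
    postcode: str
    neighbourhood: str
    locality: str
    place: str


@dataclass(frozen=True)
class RegistrationPath:
    """Where a person is registered (assumed layers, for illustration)."""
    gp_practice: str
    pcn: str
    icb: str
    region: str


# The two hierarchies are stored separately and keyed by NHS Number only.
# All identifiers and names here are placeholders.
residence = {
    "0000000001": ResidencePath("POSTCODE-1", "Neighbourhood A", "Locality A", "Place A"),
}
registration = {
    "0000000001": RegistrationPath("Practice X", "PCN X", "ICB X", "Region X"),
}


def patient_view(nhs_number):
    """Return (residence, registration) for one patient; None where unknown."""
    return residence.get(nhs_number), registration.get(nhs_number)


lived, registered = patient_view("0000000001")
print(lived.place, registered.pcn)
```

Because neither class refers to the other, a population health query can use the residence path alone and a service delivery query the registration path alone.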
Location also required us to solve the problem of adding context before storing the pseudonymised record. During initial processing we look up or calculate the UPRN (Unique Property Reference Number). The UPRN exists in a temporary readable state while we apply context to an individual's location. This could be a calculated distance metric, such as the distance to the nearest A&E or bus stop, or a provision metric, such as whether there is broadband coverage. The UPRN is then pseudonymised like all other Person Identifiable Information (PII).
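As an illustration of the enrich-then-pseudonymise order, the sketch below derives a distance metric while the UPRN is still readable, then replaces the UPRN with a token. The keyed hash (HMAC-SHA256), the straight-line haversine distance, the field names and the sample coordinates are all assumptions; the source does not specify the pseudonymisation method or how distance is measured.

```python
import hashlib
import hmac
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def enrich_then_pseudonymise(record, sites, secret):
    """Add location context first, then drop the readable UPRN and coordinates."""
    nearest_km = min(
        haversine_km(record["lat"], record["lon"], s["lat"], s["lon"]) for s in sites
    )
    # Assumed method: a keyed hash, so the same UPRN always maps to the same token.
    token = hmac.new(secret, record["uprn"].encode(), hashlib.sha256).hexdigest()
    return {"uprn_token": token, "km_to_nearest_ae": round(nearest_km, 1)}


# Hypothetical inputs: a placeholder UPRN and made-up coordinates.
record = {"uprn": "UPRN-PLACEHOLDER", "lat": 51.0, "lon": -3.0}
sites = [{"lat": 51.0, "lon": -3.0}, {"lat": 52.0, "lon": -1.0}]
stored = enrich_then_pseudonymise(record, sites, secret=b"example-key")
print(sorted(stored))
```

Only the derived metric and the token are returned, so the readable UPRN never reaches storage.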
Creating a model for ethnicity
We aim to represent ethnicity as a hierarchical structure to enable aggregate-level population health analysis with ethnicity-linked data.
Instead of creating our own model, we are adopting an established one. We evaluated two existing models: ONS 2021 England & Wales Census Ethnic group classifications and NHS England Data Dictionary Ethnic Category Code 2001a. We compared them using these criteria:
- Coverage & granularity – detail level captured
- Interoperability and Integration – alignment with NHS standards and external compatibility
- Usability for Data Consumers – analyst-friendly design and clarity for public reporting
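To show what a hierarchical ethnicity model enables, the sketch below rolls individual codes up to a parent group for aggregate counts. It uses a small subset of the NHS Data Dictionary 2001 ethnic category codes purely as an example; this does not imply which model was selected, and the `count_by_group` helper and the "Unmapped" bucket are hypothetical.

```python
from collections import Counter

# Illustrative subset only: code -> parent group.
ETHNIC_GROUP = {
    "A": "White",
    "H": "Asian or Asian British",
    "J": "Asian or Asian British",
    "R": "Other Ethnic Groups",
}


def count_by_group(codes):
    """Aggregate individual ethnic category codes to their parent group."""
    return Counter(ETHNIC_GROUP.get(code, "Unmapped") for code in codes)


counts = count_by_group(["A", "H", "J", "X"])
print(counts)
```

Counting unrecognised codes as "Unmapped" rather than dropping them keeps gaps in the mapping visible to analysts.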
Height, weight & BMI
In this test we aim to set a single, enforceable standard for height, weight, and BMI in the Somerset LDP so that data from GP/PCN, Trusts, Councils, Hospices, Ambulance, and ICBs is comparable, reproducible, and safe for PHM and inequalities analysis.
- Standardise inputs, codes, and storage using UK Core FHIR and SNOMED.
- Define selection rules for multiple measures and handling of missing or non-reproducible BMI.
- Enable analysis through common validation, provenance, and recency rules.
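A minimal sketch of one possible selection rule, assuming the most recent height and weight are used and BMI is left empty when either is missing. BMI itself is the standard weight in kilograms divided by height in metres squared; the "latest wins" rule, the field names and the one-decimal rounding are assumptions, not the agreed standard.

```python
from datetime import date


def latest(measures):
    """Return the most recent measurement, or None when there are none."""
    return max(measures, key=lambda m: m["taken_on"]) if measures else None


def derive_bmi(heights_cm, weights_kg):
    """Derive BMI from the latest height (cm) and weight (kg); None if either is missing."""
    height, weight = latest(heights_cm), latest(weights_kg)
    if height is None or weight is None:
        return None  # not reproducible without both inputs
    metres = height["value"] / 100
    return round(weight["value"] / metres ** 2, 1)


# Hypothetical measurements for one person.
heights = [{"value": 180, "taken_on": date(2023, 5, 1)}]
weights = [
    {"value": 90, "taken_on": date(2022, 1, 10)},
    {"value": 81, "taken_on": date(2024, 3, 2)},
]
bmi = derive_bmi(heights, weights)
print(bmi)
```

Deriving BMI from stored height and weight, rather than trusting a recorded BMI value, is what makes the figure reproducible.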
Conditions
We need to offer a consistent, flexible framework for representing conditions across providers (GPs, hospitals, community, council, hospice).
Approach:
- Inputs = SNOMED, ICD-10, and others.
- ICD-10 = backbone for consistency.
- NICE = reporting categories.
- QOF = practical priorities.
- HRCS = deprioritised.
Our model supports three levels:
- Raw codes (SNOMED/ICD-10/others).
- Grouped categories (NICE/QOF).
- Plain-language labels (e.g. “Memory problems”).
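The three levels can be sketched as two lookups layered over the raw code. The single mapping shown (ICD-10 F03, unspecified dementia, grouped as "Dementia" and labelled "Memory problems") is an illustrative assumption, as are the table and function names; it is not the agreed mapping.

```python
# Level 1 -> level 2: raw code (system, code) to a grouped category.
CODE_TO_GROUP = {
    ("ICD-10", "F03"): "Dementia",
}

# Level 2 -> level 3: grouped category to a plain-language label.
GROUP_TO_LABEL = {
    "Dementia": "Memory problems",
}


def describe(system, code):
    """Return all three levels for a raw code; unmapped levels are None."""
    group = CODE_TO_GROUP.get((system, code))
    return {
        "raw": (system, code),
        "group": group,
        "label": GROUP_TO_LABEL.get(group),
    }


print(describe("ICD-10", "F03"))
```

Keeping the raw code alongside the derived levels means regrouping later does not require going back to the source systems.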
User needs
API-first
This approach promotes better collaboration, allows for parallel development processes, and enhances the overall scalability and flexibility of the system, making it easier to integrate with third-party services and adapt to changing business needs.
DIIS & FDP Dashboards
Alongside the API-first strategy we will work with colleagues across the cluster to maximise the value of prebuilt DIIS (Dorset Intelligence & Insight Service) and FDP Dashboards.
Risk Stratification
The LDP will also allow us to plug into and roll out risk stratification tools at scale, be they Brave AI or Dorset’s in-house tool.
Data as a Product
A ‘data as a product’ strategy involves treating data as a core asset that can be leveraged to create value for the ICB and partners. This approach focuses on collecting, analysing, and packaging data in a way that it can be easily consumed and utilised by its users (via our API platform). By developing data-driven products, we can provide insights, enhance decision-making, and offer greater research opportunities. Ultimately, data as a product fosters innovation and can lead to a thriving partner ecosystem.
What is an API?
Similar to a librarian, an API can assist patrons in finding and accessing information. In the case of LDP our patrons can be people or they can be other computer systems. Where a library holds well curated books, LDP holds well curated data sets. The API is the gateway to those data sets.
Engagement
To process the special category data in the platform we will need Section 251 approval from CAG (the Confidentiality Advisory Group). To submit the application we need evidence that we have engaged our communities on our approach to data sharing, helping them understand their rights around opting out, our approach to ethical decision making, and how they can get involved. Engagement started at the end of September. We will submit for Section 251 approval in January but will continue community engagement until March.
We have asked for help from Healthwatch Somerset, who have created an online survey and have sent it out to their contact list.