
SPREAD's Technology: How We Build Trusted, Scalable Engineering AI on Product Data and Semantics

by SPREAD Team

At SPREAD, our mission is to make product data accessible, intuitive, and actionable for engineering teams, empowering them to lead the next era of software-defined product innovations through creativity and Engineering Intelligence.

If you're interested in a quick deep-dive into how SPREAD turns data chaos into clarity, please watch this video - it only takes two minutes.

Engineering Intelligence in 3 steps

'Garbage in, garbage out' isn't just a saying - it's a fundamental truth. Valuable (Gen)AI applications can only exist in manufacturing industries when built on a strong and consistent data foundation - structured and unstructured product data which is integrated across various systems and tools to give engineers a holistic understanding.

We enable engineers to gain deep insights into their products and make data-driven decisions throughout the product lifecycle by leveraging technologies like a knowledge graph, schema, graph federation, machine learning, AI agents, GraphQL, and a multi-tenant cloud architecture.

Our end-to-end solution is built on an architecture that ensures flexibility, scalability, security, and powerful data integration for our entire Engineering Intelligence solution suite. In addition, it offers multiple ways to integrate with your existing infrastructure.

Our Engineering Intelligence platform works in three steps:

  1. Rapid Data Ingestion
  2. Product Twin empowered by the Engineering Intelligence Network
  3. Action Cloud with Engineering Intelligence Solutions

The following article provides a detailed overview of these steps (1-3) and further foundations of our platform (4).


1 Rapid Data Ingestion

First step: At the foundation of SPREAD’s architecture is the Data Ingestion & Mapping layer, which integrates diverse data sources via out-of-the-box Smart Connectors for specific data formats (such as spreadsheets, databases, 3D formats, CAD formats), a configurable Data Mapper and a Data Sourcing Layer. These harmonize both structured and unstructured data into a knowledge graph called SPREAD’s Engineering Intelligence Network (EIN).

Efficient data ingestion and mapping are key to integrating diverse data sources:

  • Files, systems, experts, API calls: Data from various origins, such as system databases, expert inputs, and API calls, are brought into a unified ecosystem.
  • DB, services, streams: Supports databases, web services, and real-time data streams, allowing SPREAD to handle a wide variety of data types.
  • Declarative mapping: Simplifies data mapping by translating diverse data formats into our cloud environment.
  • Systems data provenance: Ensures reliability by tracking the origin and history of data.
  • Quality gate: Maintains data quality by ensuring that only high-integrity data enters the system.
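To make the quality gate idea concrete, here is a minimal sketch (not SPREAD's actual implementation; field names are illustrative) of admitting only high-integrity records into the system:

```python
# Hypothetical quality gate: only records that pass all integrity
# checks enter the system. Required field names are illustrative.

REQUIRED_FIELDS = {"part_id", "name", "source_system"}

def passes_quality_gate(record: dict) -> bool:
    """Admit a record only if every required field carries a value."""
    return all(record.get(f) not in (None, "") for f in REQUIRED_FIELDS)

def ingest(records: list[dict]) -> list[dict]:
    """Return only the high-integrity records; reject the rest."""
    return [r for r in records if passes_quality_gate(r)]

batch = [
    {"part_id": "P-100", "name": "ECU bracket", "source_system": "PLM"},
    {"part_id": "", "name": "Unknown", "source_system": "ERP"},  # rejected
]
accepted = ingest(batch)
```

In a real pipeline, rejected records would typically be routed to a remediation queue rather than silently dropped.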

1.1 Out-of-the-box Smart Connectors

Connectors are the bridges that link various data sources to the knowledge graph. They ensure seamless data flow from different systems and tools into our graph. For instance, a connector might pull design data from a CAD system and engineering data from a PLM system, feeding both into the graph for a unified view.

PLM systems are integral to managing this data. However, without proper integration into a broader data ecosystem, even the best PLM tools fall short in delivering the visibility needed for today’s fast-paced vehicle projects.

By offering pre-configured connectors to these systems, SPREAD eliminates the barriers to integrating crucial PLM data with other engineering and enterprise systems. It aggregates this data automatically, ensuring accurate, synchronized insights into product maturity and development progress.

Integrating PLM systems with SPREAD brings several key advantages:

  • Source integration: Connectors integrate data from CAD systems, PLM systems, ERP systems, system databases, and more.
  • Data normalization: They normalize data into a common format, making it compatible with our schema.
  • Real-time syncing: Connectors can sync data in real-time, ensuring that the graph always has the most up-to-date information.
  • Unified product view: Data from multiple sources — PLM, ERP, and custom engineering databases — is unified into a single, accessible dashboard.
  • Reduced integration costs and complexity: The out-of-the-box connectors are designed for quick deployment, allowing teams to benefit from seamless integrations without the need for significant technical resources.
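The extract-and-normalize pattern behind such connectors can be sketched as follows (an illustrative design, not SPREAD's actual API; class and field names are assumptions):

```python
# Illustrative connector sketch: each connector extracts raw records
# from a source system and normalizes them into a common format.
from abc import ABC, abstractmethod

class Connector(ABC):
    source: str  # name of the source system, set by subclasses

    @abstractmethod
    def extract(self) -> list[dict]:
        """Pull raw records from the source system."""

    def normalize(self, raw: dict) -> dict:
        """Map source-specific fields onto a shared record format."""
        return {"id": raw["id"], "name": raw["name"], "source": self.source}

class PLMConnector(Connector):
    source = "PLM"

    def extract(self) -> list[dict]:
        # In practice this would call the PLM system's API.
        return [{"id": "A1", "name": "Door module"}]

records = [c.normalize(r) for c in [PLMConnector()] for r in c.extract()]
```

New source systems then only require a new subclass, which keeps integration cost low.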

1.2 Data Source Connecting Layer

The data source connecting layer is a middleware layer that facilitates the connection and integration of multiple data sources into a unified system. This layer acts as an intermediary between the data sources and the target system, ensuring that data from disparate origins can be aggregated, harmonized, and made accessible for analysis and processing. It typically handles tasks such as data extraction, transformation, loading (ETL), and data synchronization.

1.3 Data Mapper

The Data Mapper translates data from one format or structure to another. In the context of the knowledge graph, it allows data from various sources, such as PLM systems, ERP systems, or IoT devices, to be harmonized with our schema and imported to be accessible by the knowledge graph. This uniquely powerful harmonization process is crucial for enabling seamless data interoperability and analysis across different systems and domains. In detail, harmonization in this context means understanding and transforming complex hierarchical source data formats into the schema of the knowledge graph - a capability that sets our solution apart from other mapping approaches.
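A toy version of declarative mapping, assuming a simple dotted-path syntax (the real Data Mapper handles far richer hierarchical formats), might look like this:

```python
# Declarative mapping sketch: a mapping table declares which source
# path feeds which target schema field; generic code applies it.

MAPPING = {                       # target schema field <- source path
    "component.id":   "Part.PartNumber",
    "component.name": "Part.Description",
}

def get_path(record: dict, path: str):
    """Follow a dotted path into a nested dict."""
    for key in path.split("."):
        record = record[key]
    return record

def set_path(target: dict, path: str, value) -> None:
    """Create nested dicts as needed and set the leaf value."""
    keys = path.split(".")
    for key in keys[:-1]:
        target = target.setdefault(key, {})
    target[keys[-1]] = value

def apply_mapping(source: dict) -> dict:
    """Harmonize a source record into the target schema shape."""
    target: dict = {}
    for dst, src in MAPPING.items():
        set_path(target, dst, get_path(source, src))
    return target

plm_record = {"Part": {"PartNumber": "P-42", "Description": "Wiring harness"}}
harmonized = apply_mapping(plm_record)
```

The point of the declarative style is that the mapping lives in data, so it can be configured (or AI-suggested) without changing the transformation code.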

1.4 AI-assisted Mapping

AI-assisted mapping involves using artificial intelligence to co-pilot and thus enhance the process of mapping data from various sources to a unified data model. This can include identifying and matching data entities, detecting patterns, and making recommendations for data transformations. In the context of Engineering Intelligence, AI-assisted mapping helps in efficiently integrating large volumes of complex data by reducing the manual effort required and improving the accuracy and consistency of the data mapping process.

This layer supports data ingestion and transformation from CAD, PLM, ERP, and IoT sources, enabling seamless data integration with SPREAD’s Engineering Intelligence Network. It lays the groundwork for SPREAD’s AI-driven ecosystem, ensuring that all data is ready to fuel the Product Twin.

2 Product Twin powered by the Engineering Intelligence Network

The second step and the heart of SPREAD’s stack is the Product Twin. This is a sophisticated, dynamic digital representation of each product that captures and contextualizes data across the product lifecycle. The Engineering Intelligence Network serves as the backbone of the Product Twin, organizing and interconnecting structured and unstructured data—such as component specifications, error reports, release plans—into a cohesive, actionable model.

2.1 Product Twin

The Product Twin consists of two critical layers:

  • Engineering Intelligence Network: Our federated supergraph is the core of our solution and unifies domain- and task-specific subgraphs into one uniform API. The subgraphs can be optimized for the data they serve and the problems they solve without adding complexity to the applications consuming the data.
  • Schema: Our schema, also known as the Information Model, is the glue between the domains and their data. It defines the meaning of entities and their context in terms of attributes and relationships. This results in our ever-evolving Engineering Intelligence Network. Automated processes guarantee the integrity of your data, ensuring that engineers have a single source of truth for all product-related data.

By mapping intricate relationships across components, lifecycles, and engineering domains, the Product Twin simplifies complex analyses and enhances decision-making. SPREAD’s Product Twin provides engineers with a holistic view of dependencies and system functionality, making it a powerful foundation for all Engineering Intelligence solutions.

The following section will take a detailed look at the fundamental concepts underlying the Product Twin.


2.2 Engineering Intelligence Network

The Engineering Intelligence Network (a knowledge graph for Engineering) is our backbone: it interlinks and stores structured product data and relationships. It is a GraphQL-based system that uses graph structures with nodes, edges, and properties to represent and store the data.

Think of it like this: complex mechatronic systems have become so interconnected that humans can no longer understand and manage them on their own. At best, a graph-like representation of mechatronic systems can be found in the siloed systems of an R&D department, but the data usually does not follow a schema, the schema is not documented, or it is not defined in a manufacturer- or product-agnostic way. Moreover, the data is not linked to related data in other corporate systems. This forces practitioners to manually cross-reference exports of this data, which is not maintainable.

Our Engineering Intelligence Network solves this problem with the schema and its two core layers:

  • The information layer is an evolving data schema combining multiple contexts (R&D, Production, Aftermarket, etc.) into one system with interrelationships. Our schema is proprietary and the heart of our Engineering Intelligence IP, built over the last five years.
  • The data layer comprises all the instances of data following the schema in our system. It can be thought of as a network of predefined objects across all domains, hosting all product-related data in canonical representation – our single source of truth for all downstream applications.

The Engineering Intelligence Network allows for a holistic view of complex systems across the whole product lifecycle and across all domains. This enables better decision-making and more efficient problem-solving. It further empowers technologies like (Gen)AI by providing the necessary knowledge and ground truth to understand system functionality. This boosts AI efficiency and performance, driving greater productivity, quality, and faster product innovation.

2.3 Schema

A schema, also known as the Information Model, is essentially the blueprint for organizing and structuring data within the Engineering Intelligence Network. Here’s a closer look at its role:

  • Structure definition: The schema defines entities (e.g., components of a vehicle) and their attributes (e.g., engine specifications, software versions).
  • Relationship mapping: It maps how these entities relate to each other (e.g., how the engine connects to the transmission system).
  • Integration foundation: By providing a common structure, the schema ensures that disparate data from various sources can be integrated seamlessly.

For example, in an automotive context, our schema defines relationships between CAD design elements, features, variants, cables, or hardware, ensuring they all fit together cohesively within the graph. This makes synergies in the data manifest and enables teams to easily build dashboards or other applications on top of it. To make this scale, we have many mechanisms in place to derive service implementations, application maintenance, and data migrations from the content of the schema itself.
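To illustrate how a schema pins down entities, attributes, and relationships, here is a hedged sketch using typed records (entity and field names are examples, not SPREAD's actual Information Model):

```python
# Schema sketch: entities are typed objects, attributes are typed
# fields, and relationships are references between entity types.
from dataclasses import dataclass, field

@dataclass
class Engine:
    power_kw: float            # attribute: engine specification
    software_version: str      # attribute: installed software

@dataclass
class Transmission:
    gears: int

@dataclass
class Vehicle:
    vin: str
    engine: Engine             # relationship: Vehicle -> Engine
    transmission: Transmission # relationship: Vehicle -> Transmission
    variants: list[str] = field(default_factory=list)

car = Vehicle(
    vin="WVW0001",
    engine=Engine(power_kw=150.0, software_version="2.3.1"),
    transmission=Transmission(gears=7),
)
```

Because every instance must conform to the declared types, disparate source data lands in one predictable shape, which is what makes seamless integration possible.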


2.4 Graph technology

A graph is a formal structure used to represent relationships between entities. It consists of two main components: nodes and edges. Nodes represent entities, while edges represent the relationships or connections between these entities.

This structure is incredibly versatile and can model various scenarios, such as social networks or mechatronic systems. For instance, in a mechatronic system, nodes can represent ECUs and software, while edges can model all feature dependencies. The power of graphs lies in their ability to visually and structurally map out complex, interconnected, and interdependent systems, making it easier to analyze and understand the underlying relationships and patterns.
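A minimal sketch of such a feature-dependency graph, with node names chosen purely for illustration, shows how a traversal surfaces every transitive dependency of a feature:

```python
# Nodes are ECUs/software features; edges mean "depends on".
from collections import deque

edges = {                          # adjacency list: node -> dependencies
    "AdaptiveCruise": ["RadarECU", "BrakeSW"],
    "BrakeSW": ["BrakeECU"],
    "RadarECU": [],
    "BrakeECU": [],
}

def all_dependencies(node: str) -> set[str]:
    """Breadth-first traversal collecting every transitive dependency."""
    seen, queue = set(), deque(edges.get(node, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(edges.get(current, []))
    return seen

deps = all_dependencies("AdaptiveCruise")
```

This kind of reachability query is exactly what becomes awkward in tables but trivial in a graph.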

2.5 Graph database

A graph database is a type of database management system designed to store, manage, and query graph-structured data. Unlike traditional relational databases that use tables and rows, graph databases use nodes and edges and properties on either of them to represent, store, and link data.

These databases excel at handling data with complex relationships, as they inherently model the connections between data points. They are optimized for tasks involving extensive data traversal and complex relationship queries. Graph technology plays a crucial role in managing highly complex mechatronic systems by providing a robust framework for representing, analyzing, and optimizing the intricate relationships and interactions within these systems. Consider, for example, the components of a car’s wiring harness: their connections via cables, down to the pins of the components’ connectors and their connections via the single wire strands of those cables.

2.6 Supergraph and federated subgraphs

Subgraphs and supergraphs are integral concepts in scaling graph systems that help to manage and analyze complex datasets. A subgraph is a subset of a larger graph (supergraph). Subgraphs are typically used to represent and manage specific domains or areas of interest within the broader dataset.

The supergraph, on the other hand, is the overarching graph that contains all the subgraphs. It provides unified querying capabilities to the entire system, integrating data from various subgraphs. This integration allows for holistic analysis and insights across different domains. The relationships and data within subgraphs are often interconnected, enabling the supergraph to facilitate cross-domain queries and analyses. The supergraph and all its federated subgraphs together form the Engineering Intelligence Network.

For instance, while the supergraph represents the entire company's product data, subgraphs might represent specific domains such as R&D, Production, Aftermarket, and Procurement, along with further customizable domains.

The supergraph is the culmination of our efforts, representing the entire product ecosystem:

  • Unified view: These subgraphs are integrated into a supergraph, providing a unified view of the entire system.
  • Cross-domain insights: By integrating data from various subgraphs, the supergraph enables insights that span multiple domains.
  • Independent management: Subgraphs can be managed and updated independently, allowing for focused and efficient data handling.
  • Enhanced decision-making: This comprehensive view is crucial for informed decision-making, allowing engineers to see the impact of changes across the entire system.

It serves as a single source of truth, showing in an adaptable manner how the system works. This enables efficient queries, e.g. for error isolation and dependency tracing – providing significantly enhanced querying capabilities compared to traditional relational databases.

For instance, assuming the schema has a type dedicated to engine specifications in the subgraph governed by the R&D department, the addition of fields for estimated power output could immediately be consumed in the context of an overall vehicle performance analysis, thanks to the supergraph.
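The federation idea behind this example can be sketched as a gateway that fans a query's fields out to the subgraphs owning them and merges the results (a deliberate simplification of real GraphQL federation; all names are illustrative):

```python
# Federation sketch: each field is owned by one subgraph resolver;
# the gateway routes requests and merges the partial results.

def rnd_subgraph(part_id: str) -> dict:
    return {"estimated_power_kw": 150.0}       # field owned by R&D

def production_subgraph(part_id: str) -> dict:
    return {"assembly_line": "L3"}             # field owned by Production

SUBGRAPH_FOR_FIELD = {
    "estimated_power_kw": rnd_subgraph,
    "assembly_line": production_subgraph,
}

def resolve(part_id: str, fields: list[str]) -> dict:
    """Fan requested fields out to their owning subgraphs and merge."""
    result: dict = {}
    for f in fields:
        result[f] = SUBGRAPH_FOR_FIELD[f](part_id)[f]
    return result

merged = resolve("ENG-7", ["estimated_power_kw", "assembly_line"])
```

The consuming application sees one answer and never needs to know which subgraph served which field.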

2.7 GraphQL

GraphQL is the query language used for interacting with the Engineering Intelligence Network. It provides a flexible and efficient way to request data:

  • Selective data retrieval: Users can specify exactly what data they need, reducing over- and under-fetching. For instance, a user can query for specific attributes like engine temperature and software version without retrieving unnecessary information.
  • Single endpoint access: All queries are sent to a single GraphQL endpoint, simplifying the interaction process. If a query requests fields hosted on different subgraphs, federation takes care of distributing requests under the hood.
  • Nested queries: GraphQL allows querying related data in a single request. For example, you can fetch a car’s engine details along with its maintenance history in one go.

This precise and efficient data retrieval mechanism is crucial for engineering teams who need quick access to specific data sets. Put differently: the Engineering Intelligence Network provides unified access over a federated infrastructure of subgraphs, enabling advanced analytics and serving as the knowledge base for LLMs.
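The selective-retrieval idea can be illustrated with a toy selection-set evaluator, mimicking how a GraphQL query returns exactly the requested fields and nothing more (this is a conceptual sketch, not a real GraphQL executor):

```python
# GraphQL-style selective retrieval: a selection set picks exactly
# the requested (possibly nested) fields from a record.

data = {
    "car": {
        "engine": {"temperature": 92.5, "software_version": "2.3.1",
                   "mass_kg": 180},
        "maintenance": [{"date": "2024-05-01", "work": "oil change"}],
    }
}

def select(source, selection):
    """Return only the fields named in the nested selection set."""
    if isinstance(selection, dict):
        return {k: select(source[k], v) for k, v in selection.items()}
    return {k: source[k] for k in selection}   # leaf: list of field names

# Analogous to: query { car { engine { temperature software_version } } }
query = {"car": {"engine": ["temperature", "software_version"]}}
result = select(data, query)
```

Note that `mass_kg` and the maintenance history are never transferred, which is the point of avoiding over-fetching.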

2.8 Flexible platform access and usage

As our system’s architecture is highly modular and all interactions pass through the federated graph layer, our clients’ use cases can be solved in diverse scenarios:

  • They can use our API as a product, building their own applications on top (in our Action Cloud layer or with any other framework that provides a GraphQL client).
  • They can contribute to a common modelling initiative aiming for standardization.
  • In the future, they can provide their own subgraphs to be federated with our pre-built domains, where such subgraphs can be proxies to already existing REST APIs or databases. Further, they can use our schema-editing and code generation tools to build their own subgraphs for the supergraph.

To give complete interaction capabilities with any number of external systems, our graph layer can be configured to process external events or calls, as well as emit events to external systems, and be connected with other graphs.

3 Action Cloud with Engineering Intelligence

Third step: The Action Cloud is the representation and interaction layer, where the structured and contextualized data in the Product Twin is leveraged to enable access, analysis, and authoring of product data. This enables the use of pre-configured Engineering Intelligence Solutions, as well as the low-code building of customized applications in the Studio. The Action Cloud is further enhanced by Engineering Intelligence Agents that operate to streamline workflows and automate engineering tasks.

3.1 Engineering Intelligence Solutions & Studio Customizations

The Action Cloud provides users with multiple pre-configured Engineering Intelligence Solutions, while also enabling users to easily build custom applications in a low-code environment within the SPREAD Studio. The solutions and applications enable users to visualize and interact with their product data, but also to analyze and author the data within the Engineering Intelligence Network. Furthermore, users can easily share the applications and visualizations with selected external partners, while ensuring compliance with regulatory standards.

3.2 Engineering Intelligence Solutions

The Engineering Intelligence Solutions offer user-friendly and specialized applications in which engineers can easily interact with their data across the complete product lifecycle, from R&D and Production to Aftersales. These out-of-the-box solutions are designed to support engineers at every phase of the product lifecycle, improving collaboration and enabling faster, data-driven decisions.

SPREAD currently offers the following out-of-the-box solutions, which are based on our experience with our clients in the past 5 years:

  • Requirements Manager: Structure and connect requirements across domains to enable fast quoting, reuse, and comparison.
  • Product Explorer: Gain full transparency into product logic and functional chains across logical and physical architectures in one connected view, across variants.
  • Error Inspector: Identify and resolve errors early by linking errors and tickets to system dependencies and uncovering recurring patterns.
  • Action Tower: Receive clear insights into product maturity and error curves to assess impact on function readiness and take targeted action.


3.3 SPREAD Studio Customizations

The SPREAD Studio provides engineers with a low-code environment where they can intuitively build custom applications and dashboards, automate complex tasks and retrieve data. The user-friendly drag-and-drop interface and low-code environment empowers engineers and users with minimal or no programming knowledge to build and deploy applications. This enables a broad set of users to generate value in a short amount of time. Here are some examples of use cases:

  • Business process automation: Non-technical staff can automate workflows, improving efficiency and productivity.
  • Data analysis and visualization: Users can generate reports and dashboards by simply describing their data and desired insights.
  • Prototype development: Rapidly create and test new ideas without the need for extensive coding resources.


How does it work in practice? Users intuitively build the application within a simple visual interface, dragging and dropping pre-built widgets (e.g. 2D/3D visualizations, graphs, tables) to assemble their application or dashboard. The function, reference, or logic of these widgets can then be customized and specified through low-code adjustments within the same interface. All of this is supported by real-time visual feedback, so users directly see the result of their changes.

Additionally, users can build Flows within the visual drag-and-drop and low-code environment of the SPREAD Studio. Flows let users process data, execute data tasks, and define data rules within the SPREAD ecosystem: they can fetch data from external sources, push transformed data from the platform to an external application, or improve data handling within applications and solutions. The Flows themselves can be triggered by an API call, a schedule, or other initiators.
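Conceptually, a Flow is a pipeline of steps wired together visually; the sketch below (illustrative step names and data, not SPREAD's Flow engine) shows the fetch-transform-rule pattern:

```python
# Flow sketch: each step receives the previous step's output.

def fetch_step(_):
    """Fetch records, e.g. from an external ticketing system."""
    return [{"part": "P-1", "errors": 3}, {"part": "P-2", "errors": 0}]

def transform_step(records):
    """Derive a status field from the error count."""
    return [{**r, "status": "review" if r["errors"] > 0 else "ok"}
            for r in records]

def rule_step(records):
    """Data rule: keep only the parts that need review."""
    return [r for r in records if r["status"] == "review"]

FLOW = [fetch_step, transform_step, rule_step]

def run_flow(flow, payload=None):
    """Execute the flow's steps in order, piping each result onward."""
    for step in flow:
        payload = step(payload)
    return payload

flagged = run_flow(FLOW)
```

A trigger (API call, schedule, or other initiator) would simply invoke `run_flow` on the configured pipeline.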


3.4 Engineering AI

Engineering AI only works when it is grounded in explicit product logic. SPREAD provides this foundation through the Engineering Intelligence Network (EIN) and Functional Product Twins, which make the structure, dependencies, and variant logic of a product fully traceable across systems. Within this environment, Engineering AI Agents operate as governed executors on top of product truth.

Engineering AI Agents act directly on this structured, validated product context. They traverse dependency chains, evaluate variant scope, analyze change propagation, and execute tasks within clearly defined permissions and traceability boundaries. Every action is constrained by product logic. Every result can be traced back to underlying engineering objects and relationships. This governed execution model is what differentiates SPREAD from generic AI approaches in engineering environments. SPREAD provides specialized agents built for concrete engineering workflows. These agents operate natively on the Product Twin and Engineering Intelligence Network. 

Requirements Assistant

Validates requirement consistency across domains, flags structural gaps, and identifies reusable patterns from historical product data. It connects new requirements to existing functions, components, and configurations, ensuring traceable coverage and preventing redundant variants.

Issues Analyzer Assistant

Summarizes and structures tickets across systems, detects recurring issue patterns, and links anomalies directly to affected components and variants. It reduces manual cross-system investigation by grounding issue signals in the Product Twin.

Root Cause Assistant

Traces anomalies across signals, ECUs, software modules, and hardware configurations. It evaluates dependency chains in the Engineering Intelligence Network to isolate probable failure sources and recommend targeted corrective actions.

Steering Assistant

Prioritizes alerts based on maturity signals, error density, variant exposure, and lifecycle impact. It ranks actions and provides structured decision support for program leadership to steer development toward SOP without blind escalation.

Upskilling Assistant

Surfaces deltas between product versions, explains architectural changes, and provides contextual learning paths based on real engineering artifacts. It reduces onboarding time and preserves domain knowledge across programs.

Each agent is scoped to a clearly defined engineering responsibility and acts on structured product logic, cross-domain dependencies, and variant-aware context. These agents use engineering-grade reasoning systems constrained by structured product logic and access control. All SPREAD Domain-Specific Engineering Agents leverage core LLM capabilities such as summarization, translation, similarity and clustering, and forecasting. These LLM capabilities are tuned and continuously evaluated against real engineering workflows and function as reasoning modules inside a governed engineering system.

Platform Agents ensure that AI operates on validated product logic.

Knowledge Graph Agent

Executes graph queries, traverses dependencies, and retrieves cross-domain product context from the Engineering Intelligence Network.

Data Ingestion Agent

Connects structured and unstructured data from PLM, ALM, ERP, MES, service systems, and documents. Performs AI-assisted mapping into the engineering ontology.

Data Quality & Sanity Checker

Validates structural consistency, detects conflicts, and enforces engineering logic rules inside the Product Twin.

These agents maintain integrity of the product model and enable reliable AI reasoning.

3.5 MCP (Model Context Protocol)

The Model Context Protocol defines how the Engineering Intelligence Network becomes accessible beyond SPREAD’s own applications. It provides a standardized, governed interface that allows external tools and AI systems to query structured product context without replicating the underlying product graph.

Engineering organizations operate across multiple systems. PLM manages product structures and configurations. ALM manages requirements and validation. ERP holds cost and sourcing data. Service systems track field issues and warranty cases. Each system contains only a partial view of the product. MCP enables these systems, and the AI features embedded within them, to retrieve cross-domain context from the Engineering Intelligence Network at the point of decision.

When a PLM workflow evaluates a change request, MCP allows it to retrieve requirement coverage, variant exposure, and software dependencies from the Product Twin. When an ALM tool analyzes validation gaps, it can query real configuration logic instead of relying on isolated requirement hierarchies. When a service analytics platform investigates recurring field failures, it can resolve tickets against exact hardware and software configurations.

MCP also supports organizations building their own AI solutions. Internal LLM or RAG pipelines can use the Engineering Intelligence Network as a structured retrieval layer. Instead of indexing disconnected PDFs, logs, or exported BOMs, these systems retrieve engineering objects with their dependency relationships, variant scope, and lifecycle state intact.

Access through MCP is controlled and auditable. Queries are permission-scoped and resolved against the governed product graph rather than against raw source systems. External consumers do not hold a persistent copy of the Engineering Intelligence Network. They retrieve only the context required for a defined workflow under defined access rules.
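The permission-scoped resolution described here can be sketched as follows (a simplified allow-list model with invented consumer, domain, and object names; MCP's actual access model is richer):

```python
# Permission-scoped retrieval sketch: a consumer may only query
# domains its scope allows; nothing else leaves the governed graph.

PERMISSIONS = {"plm_workflow": {"requirements", "variants"}}

GRAPH = {
    "requirements": {"REQ-12": "Braking distance < 36 m"},
    "variants": {"V-EU": ["REQ-12"]},
    "cost": {"REQ-12": 1200},   # not exposed to this consumer
}

def query(consumer: str, domain: str, key: str):
    """Resolve a request only if the consumer's scope allows the domain."""
    if domain not in PERMISSIONS.get(consumer, set()):
        raise PermissionError(f"{consumer} may not access {domain}")
    return GRAPH[domain][key]

value = query("plm_workflow", "requirements", "REQ-12")
```

An audit log of such calls would give the controlled, traceable access the protocol aims for: consumers retrieve only the context a defined workflow needs.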

This architecture positions the Engineering Intelligence Network as shared engineering infrastructure. Domain assistants operate inside SPREAD. MCP extends the same structured product intelligence into the broader engineering tool ecosystem without compromising traceability or governance.

4 Foundations of our platform

Our platform foundations support secure data hosting, sharing and multi-tenant management. These layers ensure user authentication, data integrity, and compliance across global teams and partners, allowing SPREAD’s solutions to be securely shared within organizations and with external stakeholders.

4.1 Hosting

Supporting SPREAD’s core layers is the Hosting layer, which provides scalable, secure deployment options, including customer private cloud, on-premises, and SPREAD’s multi-tenant cloud.

Key features include:

  • Data isolation and security: Each client’s data is securely isolated with strict access controls.
  • Scalability: Resources are dynamically allocated, adapting to changing demand.
  • Flexible deployment: Options for private cloud, on-premises, or multi-tenant cloud, ensuring each client’s security and infrastructure needs are met.

4.2 User Management

Our User Management application enables the seamless management of user authentication, authorization, and access control within the platform. This is essential in maintaining a secure and efficient environment. Administrators can easily define user roles, assign permissions, and ensure that individuals have appropriate access to data and tools based on their responsibilities.

4.3 Data Management

The Data Management application in SPREAD’s ecosystem oversees the governance, storage, and retrieval of product data. It provides accessible tools for organizing, querying, and managing the lifecycle of engineering and product-related information. This ensures data integrity and compliance with both internal and external requirements.

4.4 Tenant Management

The layer enables the management of tenants on a multi-tenant SPREAD deployment, maintaining isolation and flexibility. It allows administrators to easily segregate data, configurations, and user environments for different clients or teams. This ensures secure access and tailored functionality for each tenant.

Get to see it for yourself

It's one thing to read about it; it's a completely different thing to see it in action. If you'd like to know more about our solution and how it works, please don't hesitate to contact one of our experts. Just book an individual session - with no strings attached.