This approach shifts the perception of data from a byproduct to an independent asset capable of creating value. Treating data as an asset in its own right introduces the need to manage it as if it were a product, which means you can apply product management principles to data products. Product management is defined as "the strategic process of an organization to manage every step of the product lifecycle, taking into account both business and consumer needs."
Data Product Attributes
When applying product management to data products, you're essentially focusing on data management and governance at every stage of the product lifecycle, with the ultimate goal of meeting the needs of data users.
Dehghani outlines the following attributes for data products to ensure optimal use:
- Findable: The data product must be easy to find by both humans and machines via a well-structured catalog or metadata register.
- Addressable: The data product must have a unique, stable address through which users can access it, with clear instructions for doing so.
- Reliable: The data product must be accurate and consistent, with mechanisms in place to ensure quality and integrity.
- Self-describing: The data product should include clear metadata and documentation that explains its content and use, making it reusable.
- Interoperable: The data product must use standardized formats and interfaces to work seamlessly with other systems and data.
- Secure: The data product must be protected through measures like encryption or access control to prevent unauthorized access.
The ownership of data products and the responsibility for meeting the above attributes should rest with the areas where they are created. That is where the domain knowledge needed to make well-informed decisions is concentrated. This aligns well with a federated decision-making strategy. Formalizing this ownership allows for better responses to user needs and maximizes the intrinsic value of the data product.
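As an illustration, the six attributes and the owning domain could be captured in a small descriptor that reports which attributes still lack evidence. This is a minimal sketch, not a standard or Dehghani's specification: the `DataProduct` class, its field names, the `missing_attributes` helper, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor; field names are illustrative, not a standard."""
    name: str
    owner_domain: str            # the area that creates and owns the product
    catalog_entry: str = ""      # findable: where the product is registered
    address: str = ""            # addressable: location users can reach it at
    quality_checks: list[str] = field(default_factory=list)  # reliable
    description: str = ""        # self-describing: metadata and documentation
    data_format: str = ""        # interoperable: standardized format
    access_policy: str = ""      # secure: e.g. encryption or access control


def missing_attributes(product: DataProduct) -> list[str]:
    """Return the attributes for which the descriptor holds no evidence."""
    evidence = {
        "findable": product.catalog_entry,
        "addressable": product.address,
        "reliable": product.quality_checks,
        "self-describing": product.description,
        "interoperable": product.data_format,
        "secure": product.access_policy,
    }
    # Empty strings and empty lists count as "not yet provided".
    return [attribute for attribute, value in evidence.items() if not value]


# Made-up example: quality checks and an access policy are still missing.
sales = DataProduct(
    name="monthly_sales",
    owner_domain="sales",
    catalog_entry="catalog/sales/monthly_sales",
    address="warehouse.sales.monthly_sales",
    description="Revenue per product per month",
    data_format="parquet",
)
print(missing_attributes(sales))  # ['reliable', 'secure']
```

A check like this only shows whether something has been filled in for each attribute; judging whether it is good enough remains a decision for the owning domain.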
Using a Data Product
The use of a data product can vary greatly and, like any other product, depends on the needs of different customers. This makes the "optimized for use" aspect (see Dehghani's definition) particularly complex. For example, an internal user may use the dataset for a dashboard, while an external user may want the same dataset to benchmark organizations. Both users, and possibly many others, may have different requirements and expectations for the optimal use of the dataset.
To manage this complexity, you can compare your data product to a manufactured car. Do you produce a standard car that meets general requirements and expectations, or do you offer a customized solution? Both approaches have their pros and cons: a standard solution is more efficient, while customization requires more effort for each customer, justifying a higher price. This is a matter of cost calculation, which can then be translated into a price for external customers. For internal customers, the data product can be treated as a "service," with agreements made about what is realistic and achievable.
The Data Process of a Data Product
To treat data as a product, the process of creating it must be transparent. A process is defined as "a series of actions or steps taken to achieve a specific goal" (Tempelman & Schildmeijer, authors of Lean & Six Sigma in Practice). The data product process comprises the steps needed to achieve the ultimate goal: a dataset optimized for use. We previously established that this is achieved when a data product is findable, addressable, reliable, self-describing, interoperable, and secure.
To realize this, you follow either an operational process or a development process. If the product already exists in the catalog (a central storage point with metadata about the datasets), similar to an existing product in a store, you're dealing with operational process steps such as generating, editing, and making datasets available.
If it's a new product, you go through different steps, such as defining and modeling datasets to meet new needs. This falls under product development, where you must carefully gather requirements and make agreements on product attributes like latency, security, privacy, and quality. These agreements can be recorded in a Service Level Agreement (SLA), a contract outlining the terms between provider and customer. The result is a new product in the catalog, which is then made available to the customer through the operational process.
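The choice between the two processes can be sketched as a catalog lookup. The example below is illustrative only: the `ServiceLevelAgreement` fields, the catalog contents, the step names, and the `process_steps` function are assumptions made for this sketch, and the SLA values are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevelAgreement:
    """Agreements on the product attributes named in the text (values are made up)."""
    max_latency_hours: int
    security: str
    privacy: str
    min_quality_pct: float


# Hypothetical catalog: product name -> agreed SLA.
catalog: dict[str, ServiceLevelAgreement] = {
    "monthly_sales": ServiceLevelAgreement(
        max_latency_hours=24,
        security="role-based access",
        privacy="no personal data",
        min_quality_pct=99.0,
    ),
}

OPERATIONAL_STEPS = ["generate", "edit", "make available"]
DEVELOPMENT_STEPS = [
    "gather requirements",
    "define",
    "model",
    "agree on SLA",
    "add to catalog",
]


def process_steps(product_name: str) -> list[str]:
    """Existing products follow the operational process only;
    new products go through development first, then operations."""
    if product_name in catalog:
        return list(OPERATIONAL_STEPS)
    return DEVELOPMENT_STEPS + OPERATIONAL_STEPS


print(process_steps("monthly_sales"))    # operational steps only
print(process_steps("churn_benchmark"))  # development steps, then operational steps
```

Keeping the SLA next to the catalog entry makes the agreements on latency, security, privacy, and quality visible to both provider and customer once the product moves into the operational process.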