Data Quality with Business Objects Data Services and SAP NetWeaver BI
- A strong team -
Poor Data Quality can impact an organization in many ways. It can, for example, result in misguided marketing promotions being sent to the wrong address with incorrect information. Surveys have revealed that up to 75% of wrong business decisions are made due to flawed data. Hence, investing in Data Quality not only improves the quality of decision making, but also significantly lowers the Total Cost of Ownership (TCO).
Data Quality is also becoming an increasingly important topic for Enterprise Data Warehousing (EDW). Data for reporting is retrieved from all types of sources, including SAP and Non-SAP sources. Some of the data, especially from Non-SAP sources, needs various ETL (Extraction, Transformation and Loading) processing steps and Data Quality measures before it can be considered trusted data.
Business Objects Data Services provides a broad set of tools in the area of ETL and Data Quality. For Data Quality in particular, Data Services goes well beyond the capabilities available in SAP NetWeaver BI. Hence, using it in conjunction with SAP NetWeaver BI can greatly improve the quality of the data in the enterprise.
We analyzed the potential of Business Objects Data Services features to improve Data Quality in the extraction and loading process of SAP NetWeaver BI and identified the following scenarios as the most suitable.
Scenario 1 - Profiling and Cleansing (on Non-SAP data to be loaded into SAP NetWeaver BI)
- DataSource - Flat File (or source application) with customer data
- Use Data Services for profiling + cleansing
o Domain values (occurrence of specific values)
o Plausibility check (e.g. reasonable date range, existing region)
o String function (Wildcard search)
o Pattern recognition (for the structure of phone numbers, postal codes, etc.)
o Addresses (based on country or parsing directories)
o Matching of duplicate records (like Smith, John and John Smith)
- Upload to SAP NetWeaver BI - Invalid data will be excluded for further correction
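The kinds of checks listed for Scenario 1 can be illustrated outside of any product. The following is a minimal sketch in plain Python, not Data Services code: the sample records, the field names, the list of known regions, the plausible date range and the expected phone pattern are all assumptions made for illustration. It profiles domain values, runs a plausibility check and a pattern check, and separates invalid records for later correction.

```python
import re
from collections import Counter
from datetime import date

# Hypothetical sample records; field names are illustrative only.
CUSTOMERS = [
    {"name": "John Smith", "region": "EMEA", "phone": "+49 6227 747474", "since": "2005-03-14"},
    {"name": "Jane Doe", "region": "XX", "phone": "06227-7", "since": "2004-11-02"},
    {"name": "Max Meier", "region": "APJ", "phone": "+81 3 1234 5678", "since": "1850-01-01"},
]

KNOWN_REGIONS = {"EMEA", "APJ", "AMERICAS"}               # assumed domain values
EARLIEST, LATEST = date(1970, 1, 1), date(2030, 12, 31)   # assumed plausible range


def shape(value):
    """Reduce a string to its pattern: digits become 9, letters become A."""
    value = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", value)


def profile(records, field):
    """Count how often each value of a field occurs (domain value profiling)."""
    return Counter(r[field] for r in records)


def problems(record):
    """Return the list of rule violations found in one record."""
    found = []
    if record["region"] not in KNOWN_REGIONS:
        found.append("unknown region")
    if not EARLIEST <= date.fromisoformat(record["since"]) <= LATEST:
        found.append("implausible date")
    # Assumed rule: a country prefix followed by space-separated digit groups.
    if not re.fullmatch(r"\+9{1,3}( 9+)+", shape(record["phone"])):
        found.append("unexpected phone pattern")
    return found


# Valid records would be uploaded; invalid ones are set aside for correction.
valid = [r for r in CUSTOMERS if not problems(r)]
invalid = [(r["name"], problems(r)) for r in CUSTOMERS if problems(r)]
```

Reducing values to a shape such as "+99 9999 999999" before comparing them is one simple way to recognize patterns; it makes the most common formats in a column visible before any rule is written.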
Scenario 2 - Address cleansing of already loaded SAP NetWeaver BI data (SAP)
- Download data via Open Hub Service (no license needed)
- Cleanse addresses / data via Data Services (or perform any other Data Quality measures)
- Upload cleansed data to SAP NetWeaver BI (closed-loop)
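The closed-loop idea of Scenario 2 (export, cleanse, reload) can be sketched as a flat-file round trip. This is a plain Python illustration under stated assumptions, not the Data Services implementation: the semicolon-delimited layout with a header row, the column names CUSTOMER, CITY and COUNTRY, and the small country lookup are all hypothetical.

```python
import csv
import io

# Assumed country spellings mapped to ISO codes; a real run would use a full directory.
COUNTRY_CODES = {
    "germany": "DE", "deutschland": "DE", "de": "DE",
    "usa": "US", "united states": "US", "us": "US",
}


def cleanse_row(row):
    """Return a cleansed copy of one address row."""
    # Collapse repeated whitespace and trim every field.
    cleaned = {key: " ".join(value.split()) for key, value in row.items()}
    cleaned["CITY"] = cleaned["CITY"].title()
    # Unknown country spellings are left unchanged rather than guessed.
    cleaned["COUNTRY"] = COUNTRY_CODES.get(cleaned["COUNTRY"].lower(), cleaned["COUNTRY"])
    return cleaned


def cleanse_file(source, target):
    """Read an exported flat file, cleanse each row, and write a file for reloading."""
    reader = csv.DictReader(source, delimiter=";")
    writer = csv.DictWriter(target, fieldnames=reader.fieldnames,
                            delimiter=";", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(cleanse_row(row))


# In-memory stand-ins for the exported file and the file to be reloaded.
exported = io.StringIO("CUSTOMER;CITY;COUNTRY\n1000;  walldorf ;Germany\n1001;NEW  YORK;usa\n")
reloaded = io.StringIO()
cleanse_file(exported, reloaded)
```

Because the output keeps the same columns and order as the input, the cleansed file can be loaded back with the same structure it was exported in.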
Scenario 3 - Incorporate Data Services features in SAP NetWeaver BI Transformation (ETL)
- WebService call to Data Services (e.g. to Universal Data Cleanse)
Based on these investigations and the scenarios described above, we decided to provide two publications. We focused on using Data Services as a standalone solution (and uploading the cleansed data set into SAP NetWeaver BI), and therefore do not cover Scenario 3 for the time being.
The publications are HowTo guides which provide an introduction to the topic and to the main features of Data Services, as well as an assessment of which product (SAP NetWeaver BI or Data Services) can provide a solution for a specific requirement. In addition, the main Data Services features are explained step by step with easy-to-understand examples. The HowTo documents are targeted at people working with SAP NetWeaver BI who want to learn about Business Objects Data Services and look into using its features to improve the BI data. They do not provide an introduction to SAP NetWeaver BI.
The publications can be found in the HowTo area of the SDN (SDN alias 'howtoguides' or SAP NetWeaver Capabilities --> SAP How-to Guides --> Business Information Management).
- How To Use Data Services I - Data Quality Made Easy
Link to HowTo guide
This HowTo guide motivates the use of Data Quality measures for the data within an organization and for the decision making process. It introduces the Data Services architecture, its components and its features. The document also provides decision support on which product (SAP NetWeaver BI or Data Services) to use to fulfill a specific requirement.
The document describes the steps required to connect Data Services with SAP NetWeaver BI (and vice versa). It shows the use of basic Data Services features like Profiling, Domain Value (Plausibility) Check, Pattern Matching and String Matching on sample customer data. Finally, the cleansed data is loaded into the SAP NetWeaver BI system.
- How To Use Data Services II - Data Quality For Experts
Link to HowTo guide
The second HowTo guide takes Data Quality measures to the next level. The Data / Address Cleansing, Matching and Auditing features it introduces allow for powerful analysis and massaging of the data. Data Services delivers pre-defined versions (customizing) of these features, yet also allows users to define their own strategies based on custom-defined dictionaries and rules. The HowTo explains the main features and provides a step-by-step guide to using them in Data Services based on sample customer data.
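To give a feel for what matching with a custom dictionary and a custom rule means, here is a minimal sketch in plain Python. It is not how Data Services implements matching: the name-variant dictionary, the function names and the rule (ignore word order and commas, standardize known variants) are assumptions for illustration, and only exact key matches are found, with no fuzzy comparison.

```python
from collections import defaultdict

# Hypothetical custom dictionary mapping name variants to a standard form.
NAME_VARIANTS = {"johnny": "john", "jon": "john", "bill": "william"}


def match_key(name):
    """Build an order-independent key from a name such as 'Smith, John'."""
    tokens = name.replace(",", " ").lower().split()
    standardized = [NAME_VARIANTS.get(token, token) for token in tokens]
    return tuple(sorted(standardized))


def group_duplicates(names):
    """Group names sharing a match key; return only groups with several members."""
    groups = defaultdict(list)
    for name in names:
        groups[match_key(name)].append(name)
    return [members for members in groups.values() if len(members) > 1]


candidates = ["Smith, John", "John Smith", "Johnny Smith", "Jane Doe", "Bill Miller"]
duplicates = group_duplicates(candidates)
# duplicates == [["Smith, John", "John Smith", "Johnny Smith"]]
```

Sorting the standardized tokens is what lets "Smith, John" and "John Smith" fall into the same group, while the dictionary is what pulls in "Johnny Smith".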
In order to implement Scenario 2 (closed-loop approach of cleansing SAP BI data with Data Services), the SAP BI data can easily be extracted from the SAP BI system into a flat file with the OpenHub feature. After applying Data Quality measures according to the two HowTo guides above, the data is reloaded into the SAP BI system (also described in detail in the first HowTo guide).
In the next Business Objects Data Services release, the existing OpenHub service APIs (Application Programming Interfaces) will be called by Data Services. This makes it possible to initiate and process the OpenHub extraction directly from within Data Services. Hence, the process can be completely automated for a closed-loop scenario.