TECHNICAL DATA ANALYSIS
Note: The screen captures throughout this document are for demonstration purposes only.
The purpose of a BW technical data analysis is to provide an initial data analysis assessment of a BW environment to identify potential areas of concern and assist with managing the growth of BW productive data.
It is part of an overall data management strategy which involves the analysis of complex dependencies between the R/3 and BW systems. These dependencies need to be taken into account to avoid data inconsistencies and loss of data.
The BW technical data analysis should cover details of all the main BW components that are largest in volume and/or size.
The following general areas will be covered by the technical data analysis.
- Overview of Objects
- Largest Objects
- Top Growing Objects
- Vertical Data Flow
- Age of Requests
- Ratio between F- and E-Fact Table Sizes
- Content Analysis
Attention should be given to the following:
Sometimes the disk space allocated by data tables and indexes is only about 60% of the disk space the whole BW installation takes; large parts may be allocated by PSAPROLL, PSAPTEMP, and the file system used by BW.
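As a rough sketch of how this can be checked on an Oracle-based installation (using the standard dictionary views DBA_DATA_FILES and DBA_SEGMENTS; other databases need their equivalent views), the allocated space per tablespace can be set against the space actually used by table and index segments:

SELECT tablespace_name,
       ROUND(SUM(bytes) / 1024 / 1024 / 1024, 2) AS alloc_gb
  FROM dba_data_files
 GROUP BY tablespace_name
 ORDER BY alloc_gb DESC;

-- Space actually used by segments, per segment type
-- (temporary tablespaces such as PSAPTEMP may have to be
-- read from DBA_TEMP_FILES instead).
SELECT segment_type,
       ROUND(SUM(bytes) / 1024 / 1024 / 1024, 2) AS used_gb
  FROM dba_segments
 GROUP BY segment_type
 ORDER BY used_gb DESC;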
In addition to the aspect of DB size, the data model is checked for the top 5 largest data flows. Especially with regard to archiving, the DB model should be prepared, i.e. ODS objects and InfoCubes should be divided into smaller entities that are joined by a MultiProvider, so that data archiving can work smoothly on these smaller partitions.
Overview of Objects - Determination of allocated disk space for the most important object types and objects
The disk space occupied by the different BW objects should be identified.
In this step the data is looked at mainly from a technical point of view with regard to allocated disk space, and the most important object types and objects are identified.
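As a sketch, again assuming an Oracle-based installation, the allocated space per object type can be derived from the BW table naming convention. Note that aggregate fact tables follow the same F-/E-fact pattern as InfoCubes, so they are counted together here:

SELECT object_type,
       ROUND(SUM(bytes) / 1024 / 1024, 0) AS alloc_mb
  FROM (SELECT CASE
                 WHEN segment_name LIKE '/BIC/B%' THEN 'PSA'
                 WHEN segment_name LIKE '/BI_/A%' THEN 'ODS active'
                 WHEN segment_name LIKE '/BI_/F%' THEN 'F-fact (cubes and aggregates)'
                 WHEN segment_name LIKE '/BI_/E%' THEN 'E-fact (cubes and aggregates)'
                 WHEN segment_name LIKE '/BI_/D%' THEN 'Cube dimensions'
                 ELSE 'Other'
               END AS object_type,
               bytes
          FROM dba_segments
         -- '_' matches both the /BI0 and /BIC namespaces;
         -- 'TABLE%' also covers partitioned tables.
         WHERE segment_type LIKE 'TABLE%')
 GROUP BY object_type
 ORDER BY alloc_mb DESC;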
Largest Objects - Identifying the largest object types (PSA, ODS Active, Cube E-Fact, Cube F-Fact, and Aggregates)
The 10 largest objects for the following object types should be identified:
- PSA
- ODS – Active Tables
- Cube – E-Fact Tables
- Cube – F-Fact Tables
- Aggregates
Use Transaction ST14 to derive information.
Alternatively, download the data from DB02 and upload it into Excel. Sort by table name; the naming convention allows the object types to be separated, but keep in mind that both /BI0 and /BIC objects have to be considered. Separate the tables for the different object types onto different worksheets (copy & paste) and sort by size.
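As a further alternative, a sketch for Oracle that reads the top 10 directly from the database, shown here for PSA tables (prefix /BIC/B); substitute '/BI_/A%', '/BI_/F%' or '/BI_/E%' in the LIKE pattern for the other object types:

SELECT *
  FROM (SELECT segment_name AS table_name,
               ROUND(SUM(bytes) / 1024 / 1024, 0) AS alloc_mb
          FROM dba_segments
         WHERE segment_type LIKE 'TABLE%'
           AND segment_name LIKE '/BIC/B%'
         GROUP BY segment_name
         ORDER BY SUM(bytes) DESC)
 WHERE ROWNUM <= 10;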
[Screen captures: largest PSA, ODS, InfoCube (F-Fact and E-Fact), and Aggregate tables]
Top Growing Objects – Identifying the fastest growing objects
Besides the largest objects, the fastest growing objects should be identified. Schedule ST14 regularly and write the data to a transparent table so that a generic extractor can be created to pull it into a cube. BEx queries can then be created to report on DB growth, the fastest growing tables, etc.
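A minimal sketch of the reporting idea, assuming a hypothetical snapshot table ZTABSIZES (fields TABNAME, SNAP_DATE, SIZE_MB) filled by the scheduled ST14 runs; the growth between two snapshots is then a self-join:

-- ZTABSIZES, TABNAME, SNAP_DATE and SIZE_MB are hypothetical names;
-- the real transparent table has to be defined in the system.
SELECT cur.tabname,
       cur.size_mb - prev.size_mb AS growth_mb
  FROM ztabsizes cur
  JOIN ztabsizes prev
    ON prev.tabname = cur.tabname
 WHERE cur.snap_date  = DATE '2003-10-01'
   AND prev.snap_date = DATE '2003-09-01'
 ORDER BY growth_mb DESC;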
Vertical Data Flow - Identifying the largest data flows (PSA, ODS, InfoCubes) in relation to the largest object types
PSA tables, ODS objects, and InfoCubes are connected in a data flow. The 10 largest data flows in terms of data volume on disk should be identified. To do this, the total size of all objects belonging to one data flow should be determined.
As long as no tool support is available to identify the largest data flows, you can proceed as follows: for the 10 largest BW application objects of each object type (PSA, ODS, InfoCube), analyze and describe their relation to the other objects in the data flow. The goal is to understand the complete data volume caused by a certain application.
EXAMPLE:
The data volume that is caused by GL line items is the total of the size of the PSA table, the ODS table, the InfoCube and the aggregates. This total should be listed to identify the top application objects that should be concentrated on when reducing data growth and data volume.
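As a sketch of how such a total could be computed, assuming a hypothetical mapping table ZDATAFLOW (fields FLOWNAME, TABNAME) that lists the PSA, ODS, fact and aggregate tables belonging to each data flow:

-- ZDATAFLOW is a hypothetical, manually maintained mapping table.
SELECT f.flowname,
       ROUND(SUM(s.bytes) / 1024 / 1024, 0) AS total_mb
  FROM zdataflow f, dba_segments s
 WHERE s.segment_name = f.tabname
   AND s.segment_type LIKE 'TABLE%'
 GROUP BY f.flowname
 ORDER BY total_mb DESC;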
Age of Requests - The aging analysis of PSA requests
The age of the requests in the PSA tables can be analyzed by joining the tables RSTSODSPART and RSREQICODS.
Instructions to analyze the age of the requests:
Download the SE16 output for tables RSTSODSPART and RSREQICODS and import it into database software (e.g. Microsoft Access).
Join tables RSTSODSPART and RSREQICODS on the field "Request Number" (RSTSODSPART-REQUEST = RSREQICODS-RNR).
Conditions used:
RSTSODSPART-DELETE = " " (blank, i.e. not deleted yet)
RSREQICODS-TIMESTAMP < "20031001*" (i.e. older than October 1st, 2003)
RSREQICODS-TYP = "O" (this is necessary to avoid requests being listed twice).
EXAMPLE:
Generated SQL statement:

SELECT Rstsodspart.Delete, Rsreqicods.[UTC Time Stamp in Short Form (YYYYMMDDhhmmss)], Rsreqicods.[InfoCube (=I) or PSA (=O) of change log (C)], Rstsodspart.[Data record number], Rstsodspart.[Technical ODS name], Rstsodspart.[Request number], Rsreqicods.InfoCube
FROM Rsreqicods INNER JOIN Rstsodspart ON Rsreqicods.[Request number] = Rstsodspart.[Request number]
WHERE (((Rstsodspart.Delete) = " ") AND ((Rsreqicods.[UTC Time Stamp in Short Form (YYYYMMDDhhmmss)]) < "20031001*") AND ((Rsreqicods.[InfoCube (=I) or PSA (=O) of change log (C)]) = "O"))
ORDER BY Rstsodspart.[Data record number] DESC;
Ratio between F- and E-Fact Table Sizes – Identifying InfoCube candidates for compression
Cubes that are candidates for compression should be identified. Whether compression should be recommended depends on the one hand on the ratio between the F- and E-fact tables (i.e. a large F-fact table compared to a small E-fact table), but also on the definition of the InfoPackage: if the InfoPackage is set up so that former requests are deleted automatically, compression is of no use and will just consume system resources.
EXAMPLE:
The F-Fact tables listed below are larger than the corresponding E-Fact tables. This is an indication that compression needs to take place more frequently to consolidate records.
[Screen captures: F-Fact and E-Fact table sizes]
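As a sketch (Oracle), the candidates can be listed by matching each F-fact table with its E-fact counterpart via the naming convention (/BI_/F<cube> vs. /BI_/E<cube>); a high ratio marks a compression candidate:

SELECT SUBSTR(f.tab, 7)                  AS infocube,
       ROUND(f.mb, 0)                    AS f_fact_mb,
       ROUND(e.mb, 0)                    AS e_fact_mb,
       ROUND(f.mb / NULLIF(e.mb, 0), 1)  AS ratio
  FROM (SELECT segment_name AS tab, SUM(bytes) / 1024 / 1024 AS mb
          FROM dba_segments
         WHERE segment_name LIKE '/BI_/F%'
           AND segment_type LIKE 'TABLE%'   -- includes table partitions
         GROUP BY segment_name) f,
       (SELECT segment_name AS tab, SUM(bytes) / 1024 / 1024 AS mb
          FROM dba_segments
         WHERE segment_name LIKE '/BI_/E%'
           AND segment_type LIKE 'TABLE%'
         GROUP BY segment_name) e
 WHERE e.tab = SUBSTR(f.tab, 1, 5) || 'E' || SUBSTR(f.tab, 7)
 ORDER BY ratio DESC;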
Content Analysis – Analysis of the business content
For the most important objects in terms of size and/or growth, it might be helpful to prepare an analysis of the business content. It can be especially interesting to analyze the distribution of records over time. For ODS objects, transactions TAANA and SE16N can be used for this. For InfoCubes the analysis is much more complicated when no date values are available in the dimensions.
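As a sketch of such a time-distribution check for an ODS object, assuming a hypothetical active table /BIC/AZSALES00 with a calendar-day field CALDAY (format YYYYMMDD):

-- "/BIC/AZSALES00" and CALDAY are hypothetical names; the '/' in BW
-- table names requires the quoted identifier on Oracle.
SELECT SUBSTR(calday, 1, 6) AS calmonth,
       COUNT(*)             AS records
  FROM "/BIC/AZSALES00"
 GROUP BY SUBSTR(calday, 1, 6)
 ORDER BY calmonth;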
Comments