|
 |
 |
 |
 |
 |
WILLIAM G. SMITH & ASSOCIATES
INFORMATION RESOURCE MANAGEMENT SEMINARS AND CONSULTING
"CLEANING UP THE WORLD'S DATA MESSES"
|
DATA MODELING FOR DATA WAREHOUSE
Many enterprises today are building one or more "data warehouses" as a means of dealing with the underlying data mess resulting from their dis-integrated systems. Although the building of a data warehouse will not remedy the underlying systems and data mess, it can be an effective interim strategy to allow the appearance of consistent and integrated data, and provide a data store where different business processes can at least share the same data at the same time.
The principles of correct data organization are every bit as applicable (if not more so) to a data warehouse as they are to designing any shared database. This seminar combines the discipline of good data modeling practice with the additional concerns of identifying the best source data in existing systems, and planning the movement of data from the source system(s) to the data warehouse.
Data Modeling for Data Warehouse is a skill-building seminar which teaches the familiar three-level approach to data modeling (Conceptual, Logical and Physical) applied to the problem of designing a data warehouse. It integrates the disciplines of entity/relationship modeling, data element analysis and standardization, normalization, and physical database design to fully equip the student to design a stable, flexible, highly sharable, effective, non-redundant data warehouse. Also, the necessary metadata to support the data warehouse (where did this data come from?) is addressed in detail with models and discussion of a repository in which to hold the appropriate metadata. The seminar material is completely independent of any particular vendor's CASE tool or physical data base management system (DBMS).
Because of the large amount of material covered, this seminar is appropriate for experienced data modelers who are/will be engaged in the modeling and construction of data warehouse. It focuses on the special and often confusing issues and differences between a data model independent of the existing systems/data, and a data model for a data warehouse which is constrained by the legacy data/systems which already exist in the business and which are the sources of the data which will be imported into the data warehouse.
TOPICAL OUTLINE
- Background Concepts
- IRM Environment
- Traditional "Systems" Approach
- The IRM View
- Data Warehouse Definitions and Concepts
- Data Warehouse as Path Forward to IRM Environment
- The Eight Greatest Misconceptions about Data Warehousing
- Why Data Models are Needed
- Differences in Data Modeling for True Data Resource and Data Warehouse
- Three Data Model Continuity
- Data Modeling Sessions
- Defining the Scope of the Data Warehouse
- Ways of Expressing/Defining Scope
- Determinants of the Scope
- Steps to Define Scope
- Guidelines for Scope Definition
- Conceptual Data Modeling of Data Warehouse
- Purpose, Form and Content
- Conceptual Data Modeling Definitions
- Conceptual Data Modeling Steps
- 1. Detecting and Qualifying Entities
- 2. Diagramming Entities and Relationships
- Relationship Rules
- E/R Diagram Symbols
- Reading the E/R Diagram
- Instance Diagram
- Normal Forms in the E/R Model
- Best-Fit Decisions
- E/R Modeler's Useful Tips and Questions
- Special E/R Constructs
- N-ary Relationship
- Recursive Relationship
- Subtype/Supertype Construct
- Characteristic Entity
- Associative Entity
- Relationship Roles
- Role Entity
- 3. Analyzing and Defining E/R States
- Multiple Entity vs. State Decision
- Multiple Relationship vs. State Decision
- State/Transition Analysis
- Example State/Transition Diagram
- General State Rules
- State Variations
- True Data Resource vs. Data Warehouse Perspective
- Capturing State Information in the Dictionary
- Example Questions for State/Transition Analysis
- 4. Fully Defining Entities and Relationships
- Entity Definition Pro Forma
- Notes on Primary Keys
- Relationship Definition Pro Forma
- Recording In-Between Rules
- Example Questions for Writing Definitions
- 5. Matching Entities to Candidate Legacy Source Data Stores
- 6. Reviewing and Stabilizing the E/R Model
- Stability Review
- Example Questions for Reviewing Model
- Dictionary Meta-E/R Model to Support Conceptual Modeling and Source Data Mapping
- Logical Data Modeling of Data Warehouse
- Purpose, Form and Content
- Logical Data Modeling Definitions
- Alternative Approaches
- Logical Model Diagrams
- Logical Modeling Steps
- 1. Identifying Pertinent Dataviews/Transactions
- 2. Analyzing Dataviews/Transactions
- 3. Standardizing and Defining Data Elements
- Detecting and Understanding Need and Meaning
- Determining Entity or Relationship of Residence
- Choosing Best Representation
- Rules for Data Elements
- Handling Different Types of Data Elements
- Data Elements to Represent States
- Fully Defining the Data Element
- Data Element Definition Pro Forma
- Data Element Domains and Synonyms
- Naming the Data Element
- Standard Keyword/Abbreviation Glossary
- Checking the Dictionary for Redundancy
- Matching Data Elements to Candidate Legacy Source Fields/Columns
- Documenting New Data Elements in Dictionary
- Data Element Modeler's Useful Tips and Questions
- 4. Diagramming and Normalizing the Data for the Transaction
- Bubblecharting Symbols
- Bubblechart/Logical Structure Rules
- Building the Bubblechart
- The Normal Forms
- First Normal Form Examples
- Handling Time-Variant Attributes
- Bubblecharting Notes
- Recursive Relationship/Multiple Relationships
- N-ary Relationship
- Subtype/Supertype
- Characteristic Entity vs. Subordinate Data Group
- Relationship with Data vs. Associative Entity
- Relationship Roles
- Questions to Assist Normalizing Data Elements
- 5. Fully Defining New Entities, Relationships, Logical Data Groups, Associations
- 6. Verifying the Dataview Bubblechart
- 7. Synthesizing into Logical Model
- Synthesis Example
- Dictionary Meta-E/R Model to Support Logical Data Modeling
- 8. Reviewing and Stabilizing the Logical Data Model
- Example Questions for Reviewing/Stabilizing
- Physical Data Warehouse Modeling
- Purpose, Form and Content
- Physical Data Modeling Definitions
- Physical Design Issues
- Physical Modeling Steps
- 1. Formalizing and Weighting Design Objectives
- 2. Defining Physical and Technological Environment
- 3. Laying Out First Cut Physical Design(s)
- General Logical to Physical Transform
- General Foreign Key Rules
- Example Logical to Physical Transforms
- 4. Deciding Stored vs. Virtual Derived Data
- 5. Analyzing and Adjusting for Volume and Growth
- 6. Analyzing and Adjusting for Security Requirements
- 7. Analyzing and Adjusting for Transaction Performance
- Modeling Predominant Patterns of Use
- Focusing on Critical Transaction Performance
- Collecting or Separating
- Denormalizing
- Deciding Secondary Key Indices
- 8. Analyzing and Adjusting for Ease of Use
- 9. Assessing Design Objectives
- 10. Finalizing the Physical Model
- 11. Specifing the Design in DBMS DDL
- Completing Source to Data Warehouse Field/Column Transforms
- Dictionary Meta-E/R Model to Support Physical Design and Source DAta Mapping
- Summary
- Workshop
- Identifying Candidate Entities
- E/R Diagramming
- Defining Entity States/Transitions
- Defining Entities/Relationships
- Data Element Analysis and Bubblecharting
DURATION: 5 days
TARGET AUDIENCES: (recommended maximum number of attendees - 25)
- CIO/IS/IRM Management
- Conceptual, Logical and Physical Data Modelers
- Process Analysts/Modelers
- Development Project Managers
- Business persons who participate in data warehouse development projects
PREREQUISITES: (strongly recommended, not mandatory)