An Approach for Automatically Generating Star Schema from Natural Language

Abstract: The star schema is a form of data warehouse modelling that acts as the primary storage for dimensional data and enables efficient retrieval of business information for decision making. A star schema can be generated either from business needs, which we refer to as a user business key, or from the relational schema of the operational system. Many tools, such as BIRST and SAMSTAR, can automatically generate a star schema from a relational schema; however, no application can automatically generate one from a user business key expressed in natural language. In this paper, we propose an approach for automatically generating a star schema from one or more user business keys. The approach begins by processing the user business key with a syntactic parser to identify nouns. The identified nouns are then used to generate dimension table candidates and a fact table. The evaluation results indicate that the tool can generate a star schema from the supplied user business key(s), with one limitation: the star schema is not formed if the dimension tables do not have a direct relationship.
Keywords: star schema, user business key, parsing process, fact table, dimensional table
Author: Rosni Lumbantoruan, Elisa Margareth Sibarani, Monica Verawaty Sitorus, Ayunisa Mindari, Suhendrowan Putra Sinaga
Journal Code: jptkomputergg140063
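
Illustrative sketch: the pipeline summarized in the abstract (identify nouns in a user business key, then derive dimension table candidates and a fact table) might be outlined as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. A small hand-written noun lexicon (NOUN_LEXICON) stands in for the syntactic parser, and the names extract_nouns and build_star_schema, the measure argument, and the fact_/dim_ naming scheme are all hypothetical. The abstract does not say how the fact table is chosen among the nouns, so the caller supplies it here.

```python
# Assumption: a toy noun lexicon replaces the syntactic parser described in the abstract.
NOUN_LEXICON = {"sales", "product", "store", "customer", "month", "revenue"}


def extract_nouns(business_key: str) -> list[str]:
    """Return lexicon nouns in order of first appearance, lowercased."""
    nouns: list[str] = []
    for token in business_key.lower().split():
        word = token.strip(".,;:!?")  # drop surrounding punctuation
        if word in NOUN_LEXICON and word not in nouns:
            nouns.append(word)
    return nouns


def build_star_schema(business_key: str, measure: str) -> dict:
    """Build a star schema outline from one user business key.

    Assumption: the caller names the noun that becomes the fact table
    (`measure`); every other identified noun becomes a dimension candidate.
    """
    nouns = extract_nouns(business_key)
    if measure not in nouns:
        raise ValueError(f"measure {measure!r} was not found in the business key")
    dimensions = [noun for noun in nouns if noun != measure]
    return {
        "fact_table": {
            "name": f"fact_{measure}",
            "foreign_keys": [f"{d}_id" for d in dimensions],
            "measures": [measure],
        },
        "dimension_tables": [f"dim_{d}" for d in dimensions],
    }


schema = build_star_schema("Total sales per product and store each month.", "sales")
```

In this sketch every non-measure noun becomes a dimension linked directly to the fact table. The limitation noted in the abstract, where no schema is formed when the dimension tables lack a direct relationship, is not modelled.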

