Database Design and Management


6.1. CONCEPT OF DATABASE

Databases are vital to many firms. Without them, some companies simply cannot function. Poorly managed and maintained data can threaten the very existence of an organization; while well-managed data systems can provide a significant edge in the marketplace. Accurate and timely data are the backbone of good decisions, regardless of the business. A manager must decide on the price of a firm’s product, based on cost factors and market conditions. A stockbroker must decide, based on investment data, how and where to invest. A banker must decide, based on credit reports, whether to approve a loan.

Data are the driving force behind good decisions, and therefore the ability to gather, store, process, and retrieve data in a timely manner is vital to the health of an organization. Data that are well managed can –

  • Save an organization time and money
  • Increase market share
  • Capture new markets
  • Improve customer service
  • Increase productivity, and
  • Enhance decision making

6.2. Data Hierarchy

Data and information are stored in a computer files for processing, retrieval, and dissemination. If data files are not carefully organized and managed, decision makers will not be able to find data when they need it. While many organizations are good at collecting data, only few organizations are good at manufacturing their data and making it accessible to decision makers in a timely and useful manner.

The data in a computer system is organized in a hierarchy, referred to as the data hierarchy chain. The hierarchy, in ascending order, includes:

  • Bits
  • Bytes
  • Fields
  • Records
  • Files, and
  • Databases

i. Bits: A bit represents the smallest unit of data a computer can handle or a value that represents the presence or absence of an electronic signal. It represented as a 1 or 0 – an on-off switch.


ii. Byte: A group of bits, called a byte, represents a single character, which can be a letter, a number or another symbol. (Example: We need four bytes to represent the name Mary in a computer. Since each character, such as M, is represented using 8 bits, the word Mary requires 32 bits.)

iii. Field: A meaningful grouping of characters or bytes or a group of words or a complete number is referred to as a field.(Example: Mary’s last name, first name, and phone number are three fields.)

iv. Record: A group of interrelated fields is called a record. (Example: Mary’s last name, first name, social security number, phone number, and home address can make up a record because they constitute relevant data about Mary.)

v. File: A collection of records of the same type that are grouped together, such as the collection of employee records, is called a file.

vi. Database: A collection of interrelated files is a Database. The personnel data files, the employee benefits file, and the employees’ salary file are a group of interrelated files that provide information about employees, and hence can be brought together as a database. On the other hand, an organization’s inventory file, employee benefits file, and supplier’s address file would not make a good database because these files are not interrelated.

A simple definition of database given by James Martin,

    “A database is a collection of data that is shared and used for multiple purposes.”

Data are organized around file arranged so that duplication and redundancy are avoided. Information concerning ongoing activities is captured once, validated and entered in to the proper location in the database. The key element in a database is that each subsystem utilizes the same database in satisfying its information needs.

6.3. Methods of Organizing Data in Files

Computer system store files on secondary storage devices. There are a number of methods to organize data in files. The choice of method depends on factors such as storage media, access methods, processing techniques, and so on and the arrangement determine the manner in which individual records can be accessed or retrieved.


There are basically three ways to organize files:

  • Sequential file organization
  • Direct or Random file organization
  • Indexed sequential file organization

i. Sequential File Organization

A method of storing data records in which the records must be retrieved in the same physical sequence in which they are stored -Laudon and Laudon

In sequential file organization, records are written and stored on a secondary storage device (such as a magnetic tape or magnetic disk) in same sequence in which they were collected. The records are arranged in order using a unique key (such as Exam Roll) and are physically adjacent to one another. For example, student records can be arranged in ascending order by their examination roll number.

Read

Read                         Read and Retrieve

Record

1

Record

2

Record

3

Record

4

Record

5

…….

Record

199

Record

200

Figure 6.2. : Sequential File Organization

In order to retrieve or process a record, the file is sequentially processed from the beginning, using the record in which it was recorded, say examination roll number, until the desired record is located. This implies the data sequencing and ordering must be done before the file is created, because the ordering of the data cannot be changed when the file is processed. Further, when records are modified, the entire file must be rearranged. This can be a time-consuming task.

A typical application using sequential files is payroll, where all employees in a firm must be paid one by one and issued a check.

“Method of storing data records in a file so that they can access in any sequence without regard to their actual physical order on the storage media.”-Laudon and Laudon

Direct file organization allows data to be retrieved quickly in a random manner, regardless of the way in which the data was originally stored. In this method, the unique key that is used to organize files (such as roll number) is converted directly to a memory address, using a mathematical formula called a hashing algorithm.


Record

122

Record

372

Record

199

Record

412

…..

Figure 6.3. : Direct File Organization

The magnetic disk is ideally suited to this type of file organization and many applications today use some form of direct file organization. They are popularly well suited to quickly retrieve a single record, although sequential processing of direct files can be time-consuming.

ii. Indexed Sequential File Organization

“A file in which data are stored in a sequence (similar to the sequential method), but in addition an index is created that shows the memory address of each piece of data.”-Gupta

Like a book index, which shows the page location of each given topic, a file index shows the memory address or physical location of each piece of data, making it easier and quicker to access data.

INDEX

Record

Memory Address

.

.

.

199

200

.

.

.

03624

….

Record

1

Record

2

Record

3

Record

4

Record

5

…..

Record

199

Record

200

Figure6.4. : Indexed-Sequential File Organization

The difference between direct file organization and this method is that direct files use a hashing algorithm to directly retrieve the record, whereas the direct-sequential system refers to an index to determine the location of a record in memory.

Data Base Management Systems

The development of databases and database management software is the foundation of modern methods of managing organizational data. In the database management approach, data records are consolidated into databases that can be accessed by many different application programs. In addition, an important software package called a database management system (DBMS) serves as a software interface between users and databases. Thus database management involves the use of database management software to control how databases are created, interrogated, and maintained to provide information needed by end users and their organizations.

Some definitions of DBMS

“A DBMS is viewed as a system software package that controls the development, use and maintenance of the databases of computer-using organizations.”- James A. O’Brien

“A set of programs that act as an interface between application programs and the data in the database. Support programs that work with the operating system to create, process, and manage data.”-Gupta

“BMS is special software to create and maintain a database and enable individual business applications to extract the data they need without having to create separate files or data definitions in their computer programs.”- Laudon and Laudon


According to Robert and Mary

A DBMS is a collection of software programs that-

  • Stores data in a uniform way
  • Organizes the data into records in a uniform way
  • Allows access to the data in a uniform way

Figure 6.5 : The Contemporary database environment.

DBMS program helps organizations use their integrated collection of data records and files known as databases. It allows different user application programs to easily access the same database. A DBMS also simplifies the process of retrieving information from databases in the form of displays and reports. Instead of having to write computer programs to extract information, end users can ask simple questions in a query language.

AS illustrated in the figure, the DBMS acts as an interface between application programs and the physical data files. When the application program calls for a data item such as gross pay, the DBMS finds the item in the database and present it to the application program. Using, traditional data files, the programmer would have to define the data and then tell the computer where they are. A DBMS eliminates most of the data definition statements found in traditional programs.

6.5. Components of a Database Management Systems

A DBMS has three main components. They are:

  • Data Definition Language (DDI)
  • Data Manipulation Language (DML)
  • Data Dictionary

i. Data Definition Language

It defines the relationship between different data elements and serves as an interface for application programs that use the data. It is the formal language used by programmers to specify the content and structure of the database. For example, if a payroll program needs the net pay of an employee, the DDL defines the logical relationship between the net pay and the other data in the database and acts as an interface between the payroll program and the files that contain the net pay.

ii. Data Manipulation Language

Data is processed and updated using a language called the data manipulation language (DML), whose commands process, update, and retrieve data. It allows a user to query a database and receive summary reports and/or customized reports, for instant, a list of the top five salespeople in the company.

iii. Data Dictionary

The third component of a DBMS is the data dictionary, which describe the data and its characteristics, such as location, size, and type of data. It is an electronic document. The DBMS uses the data dictionary to address all questions pertaining to data, such as definitions, storage locations, use, and access privileges. Data is created, stored, and updated in the data dictionary using DDL.

Models of Database

A database model is a way of organizing data and its relationships. There are three such models. They are:

  • Hierarchical Model
  • Network Model, and
  • Relational Model

Before we look at the different types of models, let us look at three types of relationships.

A 1-1 (one-to-one) relationship indicates a unique relationship between two entities. For example, the relationship between name and class roll and vice versa. Other 1-1 relationships are that between a president and a country, and that a child and its biological mother.

In a 1-M (one-to-many) relationship, an entry can have multiple relationships with other entities in the database. For example, a mother can have many children, but each child can be traced to only one mother; in a given course, a teacher has many students, but each student has only one teacher for that course. These are examples of one-to-many relationship.

Finally, in an M-M (many-to-many) relationship, every entity can be related to a number of entities. Each airport has many airlines that use its facilities and each airline has access to many airports; each credit card company has many customers and each customer have many credit cards; each teacher has many students and each student may have many teachers.

i. Hierarchical Data Model

In a hierarchical model, the logical relationship among various data elements are represented as a hierarchy using 1-M relationships, similar to an organizational chart, and a given data element can be accessed only by going through the proper hierarchy. Each “Box” in the hierarchical model is a record, sometimes referred to as a node; the top most nodes are referred to as the root node. The relationship between different nodes is sometimes referred to as a parent-child relationship; in a hierarchical model, each node (except the root node) has exactly one parent. The data elements in a hierarchical model are well suited for a one-to-many (1-M) relationship with other data elements in the database, because each parent can have a number of children.

Figure shows a hierarchical structure that might be used for a human resources database. The root segment is employee, which contains basic employee information such as name, address and identification number. Immediately bellow it are three child segments: Compensation (containing salary and production data), Job Assignment (containing data about positional and departments), and Benefits (containing data about beneficiaries and various benefit options.) The Compensation segment has two children below it: Performance Ratings (containing data about employee’s job performance evaluations) and Salary History (containing historical data about employee’s past salaries). Below the benefits segment are child segments for Pension, Life Insurance, and Health Care, containing data about these various benefit plans.




How does the computer retrieve records from the hierarchy? This is done through pointers, which are pieces of data that identify the links between different records. The data nodes in a hierarchical database are linked to one another through a series of pointers, one of which is attached to the end of each record in the database.

ii. Network Model

In a network model, each record in a database can have multiple parents-that is, the relationship among data can be many-to-many (M-M relationship). For example, each student can attend many classes and each class can have many students, so that there is a M-M relationship between students and classes.



The network model is a variation of the hierarchical model, databases can be translated from hierarchical form to network form and vice versa. Like those in hierarchical data model, data elements in a network model are also linked through pointers. The main difference between the network model and the hierarchical model is that in a network model a child can have a number of parents, whereas in a hierarchical model a child can have only one parent.

iii. Relational Model

The third type of data base model is the relational model, in which data is represented using two-dimensional tables, called relations or flat files, which are made up of columns and rows represents a record, also referred to as a tuple. Figure shows an example of relational database with three tables: School, Faculty, and Course. This example will be used to discuss some basic concepts. These are three basic operations in a relational database:

  • select
  • project
  • join

The select operation selects all records in a table that meet a certain criterion, such as selecting core courses (Core Course = “Yes”). The join operation joins, or links, two or more tables, if the information required by the user is not found in one table.

School

Department

Dept. Chairman

Course No.

Core Course

Business Marketing Sheila Stuart MAR 4530

Yes

Business Management Angela Williams MAN 4140

No

Engineering Materials Peter Chan EMA 3110

No

Arts Music Michael Jones AMI 2140

Yes

Faculty Member Name

Social Security Number

Faculty Member ID

Course No.

Wheatly

940-00-1111

3624

MAR 4530

Jones

362-62-6222

1014

MAN 4140

Bailey

111-11-1111

4636

EMA 3110

Vanden

333-33-0000

1099

AMI 2140

Course No.

Teacher Name

Meeting Time

Place

MAR 4530

Wheatly

6:00 p.m.

BH 310

MAN 4140

Jones

9:30 a.m.

GCB 220

EMA 3110

Bailey

10:00 a.m.

ART 149

Figure 6.8: A relational database with three tables: School, Faculty, and Course. These tables are sometimes called relations. Each column (e.g. Course No.) is a field and each row (e.g. MAN 4140) is a record, or tuple.)

For example, in order to determine all core courses taught by a certain faculty member, the information has to be gleaned from two tables: School and Faculty. The third operation, project, creates a subset of columns (or a new table) design to meet the information needs of the user, such as the time and meeting place of every core course. This is one of the most powerful features of relational databases, since it allows data to be easily retrieved and analyzed from a number of tables at the same time.

6.7. Database Trends

Organizations are installing powerful data analysis tools and data warehouse to make better use of the information stored in their databases and are taking advantage of database technology linked to the World Wide Web. Let’s explore these developments.

Online Analytical Processing (OLAP)

Online analytical processing enables managers to interactively examine and manipulate large amounts of detailed and consolidated data from much perspective. OLAP involves analyzing complex relationships among millions of data items stored in multidimensional database to discover patterns, trends. An OLAP session takes place online in real time, with rapid response to manager’s queries, so that their analytical and decision making process is undisturbed.

The following figure illustrates that a sales manager made a complex query (by using OLAP interface) to the database and he wanted to compare sales of two products in three different regions.


            Figure 6.10: An OLAP report comparing sales in various regions.

Data Warehouses

A data warehouse is a database that stores current and historical data of potential interest to managers throughout the company. The data originate in many core operational systems and external sources, including Web site transactions, each with different data models. They may include legacy systems, relational o object-oriented DBMS applications, and systems based on Hypertext Markup Language (HTML) or Extensible Markup Language (XML) documents. The data from these diverse applications are copied into the data warehouse as often as needed-hourly, daily, weekly, monthly. The data are standardized into common data model and consolidated so that they can be used across the enterprise for management analysis and decision making. The data available for anyone to access as needed but cannot be altered.

Figure 6.11 illustrates the data warehouse concept. The data warehouse must be carefully designed by both business and technical specialists to ensure it can provide the right information for critical business decisions. The firm may need to change its business process to benefit from the information in the warehouse.


Figure 6.11: Data Warehouse

Data Mining

Data mining is a major use of data warehouse database and the static data they contain. In data mining, the data in a data warehouse are analyzed to reveal hidden patterns and trends in historical business activity. This can be used to help managers make decisions about strategic changes in business operations to gain competitive advantages in the marketplace (Figure 6.12).


Figure 6.1.: Data Mining

Data mining can discover new correlations, patterns, and trends in vast amounts of business data (frequently several terabytes of data) stored in data warehouse. Data mining software uses advanced pattern recognition algorithm as well as a variety of mathematical and statistical techniques to sift though mountains of data extract previously unknown strategic business information. For example, many companies use data mining to:

  • Perform “market-basket analysis” to identify new product bundles.
  • Find root causes to quality or manufacturing problems.
  • Prevent customer attrition and acquire new customers.
  • Cross-sell to existing customers.
  • Profile customer with more accuracy.


26 Responses

  1. Hello There. I found your blog using msn. This
    is a very well written article. I’ll be sure to bookmark it and return to read more of your useful info.
    Thanks for the post. I’ll definitely comeback.

    Like

  2. You should take part in a contest for one of the
    most useful sites on the web. I am going to recommend this site!

    Like

  3. Unquestionably believe that which you stated. Your favorite
    reason appeared to be on the internet the simplest thing to
    be aware of. I say to you, I definitely get
    annoyed while people consider worries that they just don’t
    know about. You managed to hit the nail upon the top as well as defined out the whole thing without having
    side effect , people can take a signal. Will probably be back to get more.
    Thanks

    Like

  4. certainly like your web site however you have to check the
    spelling on quite a few of your posts. Many of them are rife with spelling issues and I in finding it very bothersome to tell the reality
    however I’ll surely come again again.

    Like

  5. Thank you a bunch for sharing this with all of us you really recognise what you’re
    talking approximately! Bookmarked. Please additionally talk over with my
    web site =). We may have a hyperlink exchange arrangement between us

    Like

  6. I am genuinely pleased to glance at this weblog
    posts which carries lots of helpful data,
    thanks for providing such information.

    Like

Leave a comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: