TyroCity

Discussion on: Big Data: ethical implications & example

ncitujjwal

Database System:
A database is a collection of related data needed to manage an organization. It can also include transient data such as input documents, reports, and intermediate results produced during processing. More precisely, a database is a collection of logically interrelated data, together with a description of that data, designed to meet an organization's information needs. A database system is an integrated collection of related files along with the details of their definition, interpretation, manipulation, and maintenance. A database management system (DBMS) is a set of procedures that manages the database and provides access to it in the form each application program requires. It ensures that the necessary data is available, in the designed form, to the organization's different applications.

Characteristics of a Database

  • Structure: Data types, data behavior

  • Persistence: Store data on secondary storage

  • Retrieval: a declarative query language and a procedural database programming language

  • Performance: retrieve and store data quickly and correctly

  • Sharing: concurrency

  • Reliability and resilience

  • Large Volumes
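
To make the "declarative query language" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns, and the function name are made up for illustration. The query states what result is wanted (a total per customer) and leaves the how to the database engine.

```python
import sqlite3


def total_by_customer(rows):
    """Load (customer, amount) rows into an in-memory table and total them per customer."""
    conn = sqlite3.connect(":memory:")
    # Structure: the table definition fixes column names and types (hypothetical schema).
    conn.execute("CREATE TABLE orders (customer TEXT NOT NULL, amount REAL NOT NULL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    # Retrieval: a declarative query describes the result, not the steps to compute it.
    result = conn.execute(
        "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
    ).fetchall()
    conn.close()
    return result
```

An in-memory database is used only to keep the sketch self-contained; persistence would mean connecting to a file on disk instead of `":memory:"`.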

Data Warehouse:

A data warehouse is a single, complete, and consistent store of data obtained from a variety of different sources and made available to end users in a form they can understand and use in a business context. A data warehouse is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management's decision-making process (Wallance, 2015).

Subject-Oriented:

  • Data is arranged and optimized to provide answers to questions from diverse functional areas.

  • Data is organized and summarized by topic.
    – Sales / Marketing / Finance / Distribution / etc.

  • It focuses on modeling and analysis of data for decision makers.

  • It excludes data that is not useful in the decision support process.

Integrated:

  • A data warehouse is constructed by integrating multiple heterogeneous sources.

  • Data preprocessing is applied to ensure consistency.

  • The data warehouse is a centralized, consolidated database that integrates data derived from the entire organization.
    – Multiple Sources
    – Diverse Sources
    – Diverse Formats
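
As a rough illustration of integration, the sketch below maps records from two made-up source systems (a "CRM" and a "webshop", each with its own field names, date format, and currency unit) into one consistent row format. Every field name and format here is an assumption for the example, not something taken from a real system.

```python
from datetime import datetime


def from_crm(record):
    # Hypothetical source A: day/month/year dates, amounts in cents.
    return {
        "customer": record["cust"].strip().upper(),
        "sale_date": datetime.strptime(record["date"], "%d/%m/%Y").date().isoformat(),
        "amount": record["amount_cents"] / 100,
    }


def from_webshop(record):
    # Hypothetical source B: ISO dates, amounts as decimal strings.
    return {
        "customer": record["customer_name"].strip().upper(),
        "sale_date": record["sold_on"],
        "amount": float(record["total"]),
    }


def integrate(crm_records, webshop_records):
    """Preprocess both sources into one format and return rows ordered by date."""
    rows = [from_crm(r) for r in crm_records] + [from_webshop(r) for r in webshop_records]
    return sorted(rows, key=lambda r: (r["sale_date"], r["customer"]))
```

The preprocessing step is where consistency is enforced: after it, every row has the same keys, the same date format, and the same unit, whichever system it came from.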

Time-Variant:

  • The data warehouse represents the flow of data through time.

  • It can contain projected data from statistical models.

  • Data is periodically uploaded, and time-dependent data is then recomputed.

  • It provides information from a historical perspective, e.g. the past 5-10 years.

  • Every key structure contains, either implicitly or explicitly, an element of time.
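
One way to picture "every key contains an element of time" is a table whose primary key includes a snapshot date. This is only a sketch with a made-up `price_history` table, again using Python's sqlite3 module. Because old snapshots stay in place, a question like "what was the price as of a given date?" can still be answered.

```python
import sqlite3


def build_price_history(snapshots):
    """Create a hypothetical price_history table keyed by (product, snapshot_date)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE price_history (
               product TEXT NOT NULL,
               snapshot_date TEXT NOT NULL,  -- ISO dates sort correctly as text
               price REAL NOT NULL,
               PRIMARY KEY (product, snapshot_date)
           )"""
    )
    conn.executemany("INSERT INTO price_history VALUES (?, ?, ?)", snapshots)
    return conn


def price_as_of(conn, product, date):
    """Return the most recent price recorded on or before `date`, or None."""
    row = conn.execute(
        "SELECT price FROM price_history "
        "WHERE product = ? AND snapshot_date <= ? "
        "ORDER BY snapshot_date DESC LIMIT 1",
        (product, date),
    ).fetchone()
    return row[0] if row else None
```

An operational database would usually keep just the current price and overwrite it; here each periodic upload adds a new dated row instead.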

Non-Volatile:

  • Once entered, data is never removed.

  • Represents the company's entire history

    • Near-term history is continually added to it.
    • Always growing
    • Must support terabyte-scale databases and multiprocessor systems
  • Read-only database for data analysis and query processing

  • A data warehouse requires two operations in data accessing:

    • Initial loading of data
    • Accessing of data
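
The two operations above (loading and accessing) can be sketched as follows. This is a toy illustration, not how a production warehouse enforces non-volatility: it assumes a made-up `sales_fact` table in SQLite and uses triggers to reject any UPDATE or DELETE, so rows can only be added and read.

```python
import sqlite3


def open_warehouse():
    """Create a hypothetical sales_fact table that accepts inserts but no changes."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales_fact (sale_date TEXT NOT NULL, region TEXT NOT NULL, amount REAL NOT NULL)"
    )
    # Triggers abort any attempt to modify or remove rows that are already loaded.
    for action in ("UPDATE", "DELETE"):
        conn.execute(
            f"CREATE TRIGGER forbid_{action.lower()} BEFORE {action} ON sales_fact "
            "BEGIN SELECT RAISE(ABORT, 'warehouse rows are never changed or removed'); END"
        )
    return conn


def load(conn, rows):
    """Operation 1: load a batch of (sale_date, region, amount) rows."""
    with conn:  # commits the batch, or rolls it back if an error occurs
        conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)


def total_sales(conn):
    """Operation 2: access the data with a read-only query."""
    return conn.execute("SELECT SUM(amount) FROM sales_fact").fetchone()[0]
```

Calling `load` repeatedly models near-term history being added over time, while an attempted `DELETE FROM sales_fact` raises a database error and leaves the history intact.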

How are databases related to data warehouses?

"All data warehouses are databases, not all databases are data warehouses" (ANSI/X3/SPARC Database System Study Group, 2014). A data warehouse is a specially set up database designed to hold large amounts of data for reporting purposes, whereas a normal database is a structured collection of records or data stored in a computer system. A normal database is optimized for transactional activity and keeps only a small amount of history; a data warehouse is optimized for large-scale reporting. Within a data warehouse, data from several systems is typically merged to present a global enterprise view. Data warehouses also typically keep a very long history, from several years to the entire life of the company, so that long-term trends can be viewed. In short, a number of characteristics differentiate data warehouses and data marts from conventional operational databases.

References

ANSI/X3/SPARC Database System Study Group. (2014). Reference Model for DBMS Standardisation. ACM SIGMOD Record, Vol. 15, No. 1, 15-17.

Wallance, P. (2015). Introduction to Database Management System. UK: Pearson.