Big Data and Data Warehousing: Two Topics
Author: SandyK • September 9, 2014 • Research Paper • 1,906 Words (8 Pages) • 1,327 Views
Intrigued by the debate between Inmon and Kimball over data warehousing and Big Data, as well as by Cloudera’s approach to advertising its products, I have chosen these two topics in the hope of coming to a better understanding of the two terms.
Technological advancements have catapulted the world into the age of data. Once called simply ‘data,’ it now goes by many names: open source data, machine data, real-time data, multi-structured data, and analytic data. The most prevalent name today is big data. Estimated at about 5 exabytes at the start of 2003 (Schmidt, 2010, as cited in Lucchetti, 2010), the volume of data has been growing aggressively ever since. By the end of 2012, more than 500 times that amount was estimated to have flooded the systems. By the end of 2016, the volume of data is projected to exceed 6.6 zettabytes (Cisco, 2012). The Internet is regarded as the major source currently generating data at this pace. Data reaching the Net comes in the form of videos, photos, comments, and web reviews, to name a few. Much of it comes from unstructured sources: volumes of figures, text, numbers, facts, and dates that are free-form in nature and not held in tables (Williams, 2012). Intel refers to this complex data as Big Data, which is defined by some as “enormous data sets and the technologies available to help successfully deal with and use the data deluge” (Lucas, Lahl, Jonker, Bandey, Upchurch, Imhoff, Sinha, Tsai, Schitka, Claussen, Awadallah, Grimes, Das, Banks, Echerson, Khjan, Delahaye, Zeller, Rich, Dunn, & Hirsh, 2012, p. 60). The data sets are so large that commonly used software tools cannot “capture, manage, and process the data in a timely fashion” (Stone, Hudgins, Tokerud, & Burrill, 2014; PricewaterhouseCoopers, 2007). Gleaning the benefits of such huge volumes of data quickly and efficiently requires special technologies (Russom, 2012). Davenport and Dyche (n.d., p. 1) explained that Internet-based organizations such as “Google, eBay, LinkedIn, and Facebook were built around big data from the beginning.” The authors suggest that these companies understood the breadth of their undertaking and knew that “Big Data analytics could be the only focus of analytics, and Big Data technology architectures could be the only architecture” (Davenport & Dyche, n.d., p. 1).
A Gartner study (Forsyth, 2012; Intel, 2013) acknowledged the importance of big data analytics and noted that only about 10 to 15 percent of organizations will actually take advantage of the benefits that Big Data offers. Gartner Research (2013, as cited in Intel, 2013) estimated that those who do reap the rewards of Big Data will outperform competitors by 20 percent. “Big Data can bring about dramatic cost
...