Amazon Storms the Cloud with Move into Data Warehousing Services

Amazon announced a major expansion of its Amazon Web Services (AWS) division last week. The long-standing rumor turns out to be true: AWS is moving into data warehousing and analytics in a big way.

AWS, Amazon’s six-year-old Cloud infrastructure division, offers data storage, computing and other IT services from a number of locations worldwide, and has been one of the leaders in the commercialization of the Cloud.

Analysts have noted that Amazon has been spending heavily on infrastructure over the last few quarters, and this announcement of new ultra-low-cost data services seems to explain where the money went. AWS’s new Redshift relational data warehousing service competes directly with tech giants including Oracle Corp, International Business Machines Corp, Hewlett-Packard Co, and Teradata.


Andy Jassy, Senior VP and Head of AWS, introduced the new data warehousing service last week, and stunned the competition by announcing that Redshift will cost only about a tenth as much as existing solutions.

Speaking at the introductory conference in Las Vegas, Jassy was blunt in his criticism of competitors. "The old world of technology has a pricing model which is to charge as much as customers can pay. Customers are tired of it."

Jassy went on to say that AWS could be Amazon's biggest business, even larger than its original online retail operations, and that AWS will accomplish this using the same low-margin, high-volume strategy that has made Amazon the world's largest Internet retailer.

AWS Growth

AWS is growing rapidly because its services are inexpensive, simple to use, and quick to deploy and expand as a company’s needs change. Most analysts expect gross revenue at AWS to increase by 40 to 50 percent a year for the next four or five years, from around $2 billion in 2012 to more than $20 billion in 2018.
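As a rough check on these projections, the short sketch below compounds the 2012 figure forward at the quoted growth rates. It assumes six compounding years (2012 to 2018) and a constant annual rate; the function names are illustrative and do not come from any analyst's model.

```python
def project_revenue(start, annual_growth, years):
    """Compound a starting revenue figure forward at a fixed annual growth rate."""
    return start * (1 + annual_growth) ** years


def implied_growth_rate(start, end, years):
    """Constant annual growth rate needed to get from start to end in the given years."""
    return (end / start) ** (1 / years) - 1


START_REVENUE_BN = 2.0   # approximate 2012 revenue quoted in the article, in $ billions
YEARS = 2018 - 2012      # assumption: six full years of compounding

low = project_revenue(START_REVENUE_BN, 0.40, YEARS)
high = project_revenue(START_REVENUE_BN, 0.50, YEARS)
required = implied_growth_rate(START_REVENUE_BN, 20.0, YEARS)

print(f"40% a year: ${low:.1f}B by 2018")
print(f"50% a year: ${high:.1f}B by 2018")
print(f"Rate needed to reach $20B: {required:.1%}")
```

Under these assumptions, 40 percent annual growth yields roughly $15 billion and 50 percent roughly $23 billion, so the $20 billion figure corresponds to the upper part of the quoted range (about 47 percent a year).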

Aggressive Pricing and Slim Margins

Redshift on-demand pricing starts at $0.85 per hour for a 2-terabyte data warehouse, which can be scaled up to a petabyte. Customers who purchase in advance to guarantee access will pay only $0.228 per hour, which translates to under $1,000 per terabyte per year (compared to around $20,000 per terabyte per year for current relational data warehousing solutions).
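The per-terabyte figure can be reproduced with simple arithmetic. The sketch below assumes the warehouse runs around the clock for a 365-day year and that the advance-purchase rate applies to the same 2-terabyte configuration as the on-demand rate; the helper name is illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, assuming continuous operation


def cost_per_tb_year(hourly_rate, terabytes):
    """Annual cost per terabyte for a warehouse billed at a flat hourly rate."""
    return hourly_rate * HOURS_PER_YEAR / terabytes


# Rates quoted in the article; the 2 TB size for the advance-purchase rate is an assumption.
on_demand = cost_per_tb_year(0.85, 2)
reserved = cost_per_tb_year(0.228, 2)

print(f"On-demand: ${on_demand:,.2f} per TB per year")
print(f"Advance purchase: ${reserved:,.2f} per TB per year")
```

On these assumptions the advance-purchase rate works out to just under $1,000 per terabyte per year, consistent with the figure quoted above, while the on-demand rate comes to roughly $3,700.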

Amazon is counting on significant growth for Redshift, since the new service continues the company’s tradition of low margins. Analyst Ken Sena of Evercore is projecting net margins of 10% or less for Redshift, compared to 60% to 80% margins for the relational data warehouse services of current industry heavyweights such as IBM and Teradata.

IT Industry Implications

Redshift’s entry into the data warehousing industry will undoubtedly be “disruptive” in the classic business sense of the word, and it may also represent the leading wave of a sea change in the industry. AWS and other vendors currently provide outsourced solutions for data warehousing. Many of these solutions are managed services or infrastructure-as-a-service, which allow enterprises to deploy data warehouses with commercially available database management systems. Because Redshift operates through the AWS data management cloud, it will be particularly attractive to enterprises that are already working in the AWS Cloud.

Its use of the ParAccel Analytic Database (PADB) differentiates Redshift from traditional DBMSs hosted on infrastructure-as-a-service and from other large data storage and analysis alternatives. PADB allows AWS to apply cost-based optimization algorithms in the physical environment, and also allows Redshift to provide full compatibility with and support for current business intelligence products such as MicroStrategy and Jaspersoft.

Analysts Mark A. Beyer and Merv Adrian of Gartner emphasize that cloud data warehousing offers the advantage of scaling with the amount of data or with analytic processing demand. Furthermore, having a data warehouse in the cloud dramatically simplifies the movement of large amounts of data and reduces storage costs.

However, many large enterprises are committed to on-premises data management solutions for security reasons, and that is not going to change in the near future. Beyer and Adrian also point out that Redshift does not yet address a number of frustrating data warehouse-related issues, such as moving data between cloud providers and moving on-premises data to the cloud. Their overall conclusion is that while Redshift will cause significant disruption in industry pricing models, on a macro level it adds to, rather than supplants, current data warehouse deployment practices.