June 2008

CloudCamp: Cloud Definition, SLAs, Security and Others

Current results from Data Survey #1: Data Scientists. Thanks to everyone for helping the world understand Big Data better!

Please take the following 2-minute survey to help us understand your hadoop environment better.

Reuven Cohen, Dave Nielsen, Sam Charrington and a group of awesome volunteers organized a very successful CloudCamp event last night. It came together in 3.5 weeks, which is an amazing feat. The event probably attracted 200-300 people. You can see some pictures of the event on flickr. The format was an unconference. There were 20+ sessions proposed and they were all very interesting. The topics ranged from the definition of cloud computing to transaction processing.

Here are some of the topics that I gathered based on the sessions I attended and people I’ve talked to.

The definition is very cloudy!

There’s no agreement on the definition of cloud computing. Reuven Cohen held a very popular session on “What is Cloud Computing?” There were at least 40 people in a room that was supposed to hold only 20. There was a wide variety of definitions, ranging from Reuven’s very open definition (internet-centric software) to another person’s very restrictive definition (cloud computing must use web services, XML, SOAP, etc.).

There were also discussions (and disagreements) on whether Google App Engine (GAE) should be considered a cloud. Interestingly enough, some of the people there didn’t consider GAE a cloud. In one of the sessions, someone put an even more restrictive constraint on cloud computing: he said that a cloud MUST run any existing application without modification. By his definition, GAE would not be a cloud. I am definitely in the camp that considers GAE a cloud.

Some interesting questions were asked as well, such as this one from a Microsoft guy: “Does the operating system still matter if the application is running in the cloud?” My answer was that it depends on the type of application. If it’s a web-centric application that has a web front end, uses a database for storage, and doesn’t use any low-level file I/O, then there’s really no need to know what the OS is. In that case, the OS doesn’t matter.

The term that’s used most to describe cloud computing is elasticity: the ability to quickly provision and de-provision computing resources on demand. Almost everyone I’ve talked to or listened to agrees on that. Some of the enterprise attendees also noted this as one of the biggest benefits of the cloud. When business units come to IT with new application requirements, IT now has a way to quickly spin up resources without having to wait weeks or months to procure equipment. The other thing that everyone agrees on is the utility model: the ability to pay only for what you use.

Service level agreements

This topic was heavily discussed in the “No Cure for Cancer: Manage the Expectations of Cloud Computing” session. To summarize, almost no SLAs are offered by cloud providers today. Even Jeff Barr from Amazon said that AWS only provides an SLA for its S3 service. I haven’t researched the SLA issue, so I’m not sure how true that is. But if it’s true, I think this will be one of the biggest factors, if not the biggest factor, in enterprise adoption. Can you imagine enterprises signing cloud computing contracts without clearly defined SLAs? It’s like hosting their business-critical infrastructure in a data center that doesn’t have a clearly defined SLA.

We all know that SLAs really don’t buy you much. In most cases, enterprises get refunded for the amount of time that the network was down. No SLA will cover business loss. However, as one of the CSOs I met said, it’s about risk transfer. As long as there’s a defined SLA on paper, when the network/site goes down, they can go after somebody. If there’s no SLA, it will be the CIO’s or CSO’s head on the chopping block.
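To make the "refund for downtime" point concrete, here is a rough sketch of a pro-rated service credit. The fee, the outage length and the 30-day month are made-up numbers for illustration only, and the function name is hypothetical; real SLAs define their own credit formulas.

```python
# Hypothetical pro-rated SLA credit: the refund is proportional to downtime.
# All figures are illustrative assumptions, not taken from any real SLA.

MINUTES_IN_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes


def prorated_credit_cents(monthly_fee_cents: int, downtime_minutes: int,
                          minutes_in_month: int = MINUTES_IN_30_DAY_MONTH) -> int:
    """Return the credit for the downtime, rounded down to whole cents."""
    return monthly_fee_cents * downtime_minutes // minutes_in_month


# A 4-hour outage on a $10,000/month contract.
credit = prorated_credit_cents(monthly_fee_cents=1_000_000, downtime_minutes=240)
print(f"Credit: ${credit / 100:.2f}")
```

Under these assumptions the credit is a tiny fraction of the monthly fee, and it says nothing about the revenue lost during the outage, which is why the value of the SLA is mostly in the risk transfer.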

Security

Another topic, discussed in Sam Charrington’s “How Cloud Impacts Enterprise Computing” session, is security in the cloud. When Sam asked the group what factors prevent enterprises from adopting the cloud, Ben Charian from ServiceCloud emphatically said “security.” He argued that clouds must be certified or audited against standards or frameworks such as PCI. I’ve written about cloud security requirements here and here, so I won’t elaborate on this topic. Needless to say, I am in total agreement with Ben. Where I didn’t agree with Ben is on the need to rewrite these frameworks or standards specifically for the cloud. I believe many of the controls, such as identity management and segregation of duties, are the same in or out of the cloud.

Other observations and interesting tidbits

  • As enterprises use more cloud resources, there will be a point where it may make sense to bring things back in house rather than continuing to use the cloud.
  • The cloud computing discussions focused mainly on infrastructure/platform-in-the-cloud. Applications-in-the-cloud, or SaaS, was hardly discussed. I get the feeling that most of the attendees don’t consider SaaS to be cloud computing; rather, they see it as applications running on top of (or in) the clouds.
  • Cloud computing spending is opex instead of capex, allowing business units to make their own decisions.
  • Make sure you partner with someone you trust and who will work with you on deploying to the cloud.

Google I/O Session Videos and Slides

Google held its two-day developer gathering, Google I/O, in San Francisco on May 28-29, 2008. Google described it as “Two days of in-depth, technical sessions on how to build the next generation of web applications with Google and open technologies.”

All of the session slides and videos have been posted to Google sites. The sessions are separated into six categories:

  • AJAX & JavaScript
  • APIs & Tools
  • Maps & Geo
  • Mobile
  • Social
  • Tech Talks

Cloud-computing thread: Issues of data in the cloud

There is another very interesting and popular discussion thread in the cloud-computing Google group, on the issues of data in the cloud.

There are really two main topics in the discussion:

  • Security and privacy issues around data in the cloud, which I have written about in some detail here and here
  • Moving the data into the cloud or moving the programs to the data

JBoss in the Cloud

Red Hat announced this morning that they are partnering with Amazon to provide JBoss on EC2. Pricing is as follows:

JBoss Enterprise Application Platform: includes Red Hat Enterprise Linux and allows you to leverage the leading open source platform for Java applications in the cloud. Available at a starting price of $119/month per customer plus $1.21 per hour for every deployed server, plus additional bandwidth and storage fees.
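To get a feel for what that pricing means in practice, here is a small back-of-the-envelope calculation. It uses only the two figures quoted above ($119/month per customer, $1.21 per server-hour) and deliberately leaves out bandwidth and storage, since no rates for those are given. The function names and the sample workload (2 servers running 720 hours each) are my own assumptions.

```python
# Rough monthly cost estimate from the announced JBoss-on-EC2 pricing.
# Amounts are kept in integer cents to avoid floating-point rounding.
# Bandwidth and storage fees are NOT included (no rates are quoted).

BASE_FEE_CENTS = 119_00    # $119 per month per customer
HOURLY_RATE_CENTS = 1_21   # $1.21 per hour for every deployed server


def monthly_cost_cents(servers: int, hours_per_server: int) -> int:
    """Base subscription plus hourly charges for all deployed servers."""
    return BASE_FEE_CENTS + HOURLY_RATE_CENTS * servers * hours_per_server


def format_dollars(cents: int) -> str:
    """Render an integer number of cents as a dollar amount."""
    return f"${cents // 100:,}.{cents % 100:02d}"


# Assumed workload: 2 servers, each running 720 hours (a 30-day month).
print(format_dollars(monthly_cost_cents(servers=2, hours_per_server=720)))
# prints "$1,861.40"
```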

Salesforce Summer ’08 Now Live

A bit late in posting this since this was announced yesterday:

Salesforce.com (NYSE: CRM), the market and technology leader in Software-as-a-Service (SaaS) and Platform-as-a-Service (PaaS), today announced that Salesforce Summer ’08 is live to all 43,600 salesforce.com customers. Salesforce Summer ’08 represents salesforce.com’s 26th release in 9 years and delivers the power of cloud computing to the enterprise. Part of the Force.com platform, Visualforce is now live in every edition of Salesforce, enabling users to develop any interface entirely in the cloud. Salesforce Content and Salesforce Ideas are also delivering new levels of customer success in Salesforce Summer ’08 including the support of external communities, delivering true Web 2.0 collaboration capabilities to users globally. In total, more than 50 new CRM features are also live with Salesforce Summer ’08, raising the bar for industry innovation.

Tough Security Questions for SaaS Providers – Part 2

This is part 2 of the tough security questions for SaaS providers. In part 1 of the series, we asked the following questions:

1. Data Locality – Where’s my data?
2. Data Segregation – How is my data segregated from other customers, potentially my competitors?
3. Data Access – Who can access my data in your company?
4. Access Audit – Who has accessed my data and where are my access logs?

We are continuing this discussion with the following questions in part 2.

5. How are the users authenticated and authorized?
6. Web Application Security – How secure is the SaaS provider’s web application?
7. Data Breaches – How do you protect my data from insider breaches?
8. PCI DSS – Are you compliant with PCI DSS?

5. How are the users authenticated and authorized?

Companies have spent hundreds of man-years and millions of dollars trying to set up single sign-on systems inside the corporate firewall. Most companies, if not all, store their employee information in some type of LDAP server. In the case of SMBs, the segment with the highest SaaS adoption rate, Active Directory seems to be the most popular tool for managing users. In many cases, companies have designed their IT infrastructure so that all authentication, including VPN, web proxy, file server, and others, goes through this single infrastructure. The process of employee onboarding and termination is much easier this way.

Just as companies start to have some success, the advent of the SaaS model changes the scenario again. With SaaS, the software is hosted outside of the corporate firewall. User credentials are often stored in the SaaS providers’ databases and are not part of the corporate IT infrastructure. This means SaaS customers must remember to remove/disable accounts as employees leave the company and create/enable accounts as new employees come onboard. In essence, having multiple SaaS products will increase IT management overhead.
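The kind of housekeeping this implies can be sketched as a simple reconciliation between the corporate directory and each SaaS product's account list. This is only an illustration: the function name, the product names and the usernames are invented, and in practice the account lists would come from whatever export or API each provider offers.

```python
# Hypothetical reconciliation: find SaaS accounts that no longer belong to
# an active employee. All names and data below are made up for illustration.


def find_orphaned_accounts(active_employees: set, accounts_by_product: dict) -> dict:
    """Map each SaaS product to the sorted accounts with no active employee."""
    orphaned = {}
    for product, accounts in accounts_by_product.items():
        leftover = accounts - active_employees
        if leftover:
            orphaned[product] = sorted(leftover)
    return orphaned


active = {"alice", "bob"}
saas_accounts = {
    "crm": {"alice", "carol"},   # carol has left the company
    "wiki": {"bob"},
}
print(find_orphaned_accounts(active, saas_accounts))
```

Every additional SaaS product adds another account list to check, which is exactly the overhead described above.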

SaaS customers will start asking questions about identity and access integration, and providers would be wise to design such features in early on. For example, SaaS providers can delegate the authentication process to the customer’s internal LDAP/AD server so that companies retain control over the management of users.
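Here is a minimal sketch of what delegated authentication could look like from the SaaS provider's side. Everything in it is an assumption made for illustration: `DirectoryAuthenticator`, `InMemoryDirectory` and `SaaSApp` are hypothetical names, and the in-memory directory is only a stand-in for the customer's real LDAP/AD server (a real integration would perform an LDAP bind or use a federation protocol instead of comparing stored passwords).

```python
import hmac
from typing import Dict, Protocol


class DirectoryAuthenticator(Protocol):
    """Anything that can check a credential against the customer's directory."""

    def verify(self, username: str, password: str) -> bool: ...


class InMemoryDirectory:
    """Stand-in for the customer's LDAP/AD server (illustration only)."""

    def __init__(self) -> None:
        self._users: Dict[str, dict] = {}

    def add_user(self, username: str, password: str) -> None:
        self._users[username] = {"password": password, "enabled": True}

    def disable_user(self, username: str) -> None:
        self._users[username]["enabled"] = False

    def verify(self, username: str, password: str) -> bool:
        user = self._users.get(username)
        if user is None or not user["enabled"]:
            return False
        return hmac.compare_digest(user["password"].encode(), password.encode())


class SaaSApp:
    """The SaaS side stores no passwords; it asks each tenant's directory."""

    def __init__(self) -> None:
        self._tenants: Dict[str, DirectoryAuthenticator] = {}

    def register_tenant(self, tenant_id: str, directory: DirectoryAuthenticator) -> None:
        self._tenants[tenant_id] = directory

    def login(self, tenant_id: str, username: str, password: str) -> bool:
        directory = self._tenants.get(tenant_id)
        if directory is None:
            return False
        return directory.verify(username, password)


directory = InMemoryDirectory()
directory.add_user("alice", "s3cret")
app = SaaSApp()
app.register_tenant("acme", directory)

print(app.login("acme", "alice", "s3cret"))  # account is active
directory.disable_user("alice")              # employee leaves the company
print(app.login("acme", "alice", "s3cret"))  # access is gone immediately
```

The point of the design is that disabling the account in one place, the customer's own directory, is enough to cut off access to the SaaS application.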

6. Web Application Security – How secure is the SaaS provider’s web application?

One of the “must-have” requirements for a SaaS application is that it has to be used and managed over the web (in a browser). This creates an interesting scenario. In the on-premise scenario, when a vulnerability is found, at least you have your firewall protecting the application, so you may get a bit more time to patch it (assuming the application vendor provides the patch in a timely fashion). In the SaaS world, however, there is no such luxury. Any vulnerability identified can potentially have a detrimental impact on all of the customers. Even leading security companies aren’t immune to security holes in their web applications.

Web application security is quite a hot topic these days and it’s discussed by many security researchers such as rmogull and RSnake. Here’s an interesting article on “What web application security really is”.

Verizon Business recently released its 2008 Data Breach Investigations Report. Of all the breaches, 59% involved hacking, with the following breakdown:

  • Application/Service layer – 39%
  • OS/Platform layer – 23%
  • Exploit known vulnerability – 18%
  • Exploit unknown vulnerability – 5%
  • Use of back door – 15%

Attacks targeting applications, software, and services were by far the most common technique, representing 39 percent of all hacking activity leading to data compromise. This follows a trend in recent years of attacks moving up the stack. Far from passé, operating system, platform, and server-level attacks accounted for a sizable portion of breaches. Eighteen percent of hacks exploited a specific known vulnerability while 5 percent exploited unknown vulnerabilities for which a patch was not available at the time of the attack. Evidence of re-entry via backdoors, which enable prolonged access to and control of compromised systems, was found in 15 percent of hacking-related breaches. The attractiveness of this to criminals desiring large quantities of information is obvious.

Currently there’s really no mandate or requirement for SaaS providers to provide detailed security analysis of the SaaS application. However, it would be wise for the SaaS providers to start considering something similar to what PCI DSS has required of the merchants:

  6.5 Develop all web applications based on secure coding guidelines such as the Open Web Application Security Project guidelines. Review custom application code to identify coding vulnerabilities. Cover prevention of common coding vulnerabilities in software development processes, to include the following:
    6.5.1 Unvalidated input
    6.5.2 Broken access control (for example, malicious use of user IDs)
    6.5.3 Broken authentication and session management (use of account credentials and session cookies)
    6.5.4 Cross-site scripting (XSS) attacks
    6.5.5 Buffer overflows
    6.5.6 Injection flaws (for example, structured query language (SQL) injection)
    6.5.7 Improper error handling
    6.5.8 Insecure storage
    6.5.9 Denial of service
    6.5.10 Insecure configuration management
  6.6 Ensure that all web-facing applications are protected against known attacks by applying either of the following methods:
    • Having all custom application code reviewed for common vulnerabilities by an organization that specializes in application security
    • Installing an application layer firewall in front of web-facing applications.
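As a small illustration of two of the items above, unvalidated input (6.5.1) and injection flaws (6.5.6), here is a sketch using Python's built-in sqlite3 module. The table layout, the username rule and the function names are assumptions made up for this example; the idea is simply to validate input against an allow-list and to pass values as bound parameters instead of building SQL strings.

```python
import re
import sqlite3

# Assumed rule for this example: usernames are 1-64 letters, digits, "_", "." or "-".
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_.-]{1,64}")


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with one user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL)")
    conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))
    return conn


def find_user(conn: sqlite3.Connection, username: str):
    """Look up a user, rejecting malformed input before it reaches the database."""
    if not USERNAME_PATTERN.fullmatch(username):
        raise ValueError("invalid username")
    # The "?" placeholder keeps the value out of the SQL text entirely.
    cursor = conn.execute("SELECT id, username FROM users WHERE username = ?", (username,))
    return cursor.fetchone()


conn = make_demo_db()
print(find_user(conn, "alice"))
try:
    find_user(conn, "alice' OR '1'='1")
except ValueError as exc:
    print("rejected:", exc)
```

Either measure alone would stop this particular injection attempt; using both is the usual defense in depth.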

Additional sources of information that can serve as a starting point on web application security include:

  • OWASP Top Ten
  • OWASP Countermeasures Reference
  • OWASP Application Security FAQ
  • Build Security In (Dept. of Homeland Security, National Cyber Security Division)
  • Web Application Vulnerability Scanners (National Institute of Standards and Technology)
  • Web Application Firewall Evaluation Criteria (Web Application Security Consortium)

Trey Ford of Security Spin Control has a fairly good explanation of the recently released PCI information supplement on requirement 6.6.

SC Magazine also has an article on Deconstructing PCI 6.6 for the management folks.

7. Data Breaches – How do you protect my data from insider breaches?

In the Verizon Business breach report blog, Verizon Business stated that

While criminals more often came from external sources, and insider attacks result in the greatest losses, criminals at, or via partner connections actually represent the greatest risk. This is due to our risk equation: Threat X Impact = Risk

  • External criminals pose the greatest threat (73%), but achieve the least impact (30,000 compromised records), resulting in a Pseudo Risk Score of 21,900
  • Insiders pose the least threat (18%), and achieve the greatest impact (375,000 compromised records), resulting in a Pseudo Risk Score of 67,500
  • Partners are middle in both (39% and 187,500), resulting in a Pseudo Risk Score of 73,125
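The arithmetic behind these scores is easy to reproduce. The sketch below applies Threat x Impact = Risk to the figures quoted above, taking the partner threat as 39%, which is the value consistent with the stated score of 73,125. The variable and function names are mine.

```python
# Reproduce the Pseudo Risk Scores: threat (percent of breaches) x impact (records).
# Threat is kept as an integer percentage so the arithmetic stays exact.

SOURCES = {
    "External": (73, 30_000),
    "Insider": (18, 375_000),
    "Partner": (39, 187_500),
}


def pseudo_risk_score(threat_percent: int, impact_records: int) -> int:
    """Threat x Impact, with threat expressed as a percentage."""
    return threat_percent * impact_records // 100


scores = {name: pseudo_risk_score(t, i) for name, (t, i) in SOURCES.items()}
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:,}")
```

Sorted this way, partners come out on top, followed by insiders and then external attackers, which matches the ranking in the quote.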

Many SaaS advocates claim that SaaS providers can do a better job of protecting customers’ data. Unfortunately, moving the data into the cloud does not by itself reduce the risk of insider breaches. Insiders still have access to the data; they are just accessing it in a different way. Even with the data in the cloud, the responsibility for segregation of duties and access authorization still falls on the customers, not the SaaS or cloud computing providers. So yes, while it may reduce the chance of insiders getting direct access to, say, a database, it does not in any way reduce the risk of insider breaches. In fact, it may even increase the possibility, as you now have to take the cloud or SaaS providers’ employees into consideration. They have access to a lot more information, and a single incident could expose information from many customers.

SaaS providers should be prepared to answer questions on what tools and processes are utilized to ensure segregation of duties and protect against insider breaches. Remember, in the case of the multi-billion dollar insider incident at Société Générale, IT management had implemented all of the controls recommended by auditors, but nobody was monitoring them. So it’s extremely critical to be able to show the processes around these security controls.

8. PCI DSS – Are you compliant with PCI DSS?

PCI DSS has a specific section for hosting providers (including SaaS providers):

Requirement A.1: Hosting providers protect cardholder data environment

As referenced in Requirement 12.8, all service providers with access to cardholder data (including hosting providers) must adhere to the PCI DSS. In addition, Requirement 2.4 states that hosting providers must protect each entity’s hosted environment and data. Therefore, hosting providers must give special consideration to the following:

A.1 Protect each entity’s (that is merchant, service provider, or other entity) hosted environment and data, as in A.1.1 through A.1.4:

  A.1.1 Ensure that each entity only has access to own cardholder data environment
  A.1.2 Restrict each entity’s access and privileges to own cardholder data environment only
  A.1.3 Ensure logging and audit trails are enabled and unique to each entity’s cardholder data environment and consistent with PCI DSS Requirement 10
  A.1.4 Enable processes to provide for timely forensic investigation in the event of a compromise to any hosted merchant or service provider.
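To illustrate the spirit of A.1.1 through A.1.3, here is a toy sketch of a hosted data store in which each entity can only touch its own records and every access attempt lands in that entity's own audit trail. It is not taken from the PCI DSS text or from any real product; the class, the method names and the sample identifiers are all hypothetical, and a real implementation would need far more (authentication, tamper-resistant logs, and so on).

```python
from collections import defaultdict
from datetime import datetime, timezone


class HostedDataStore:
    """Toy multi-tenant store with per-entity isolation and audit trails."""

    def __init__(self) -> None:
        self._records = defaultdict(dict)  # entity_id -> {record_id: value}
        self._audit = defaultdict(list)    # entity_id -> that entity's own trail

    def _check(self, caller_id: str, owner_id: str, action: str, record_id: str) -> None:
        """Log the attempt in the owner's trail, then enforce isolation."""
        allowed = caller_id == owner_id
        self._audit[owner_id].append({
            "time": datetime.now(timezone.utc).isoformat(),
            "caller": caller_id,
            "action": action,
            "record": record_id,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{caller_id} may not {action} data owned by {owner_id}")

    def write(self, caller_id: str, owner_id: str, record_id: str, value: str) -> None:
        self._check(caller_id, owner_id, "write", record_id)
        self._records[owner_id][record_id] = value

    def read(self, caller_id: str, owner_id: str, record_id: str) -> str:
        self._check(caller_id, owner_id, "read", record_id)
        return self._records[owner_id][record_id]

    def audit_trail(self, entity_id: str) -> list:
        """Return a copy of one entity's audit trail."""
        return list(self._audit.get(entity_id, []))


store = HostedDataStore()
store.write("merchant-a", "merchant-a", "card-1", "tok_123")
try:
    store.read("merchant-b", "merchant-a", "card-1")
except PermissionError as exc:
    print("denied:", exc)
print(len(store.audit_trail("merchant-a")), "entries in merchant-a's trail")
```

Keeping a separate trail per entity also helps with A.1.4, since a forensic investigation for one merchant does not require handing over another merchant's logs.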

A hosting provider must fulfill these requirements as well as all other relevant sections of the PCI DSS. Note: Even though a hosting provider may meet these requirements, the compliance of the entity that uses the hosting provider is not necessarily guaranteed. Each entity must comply with the PCI DSS and validate compliance as applicable.

Simply put, SaaS providers must be compliant with PCI DSS in order to host merchants that must comply with PCI DSS.

We will continue with our tough security questions in part 3 of this series.