Showing posts with label cloud. Show all posts

25 August 2013

Replacing Big SaaS - How to cut the Google, Apple, Dropbox, Microsoft, ... cords

With a PRISM- and Snowden-inspired kick in the backside, I finally got around to establishing some autonomy from the Big Boys with respect to email, contacts, calendar, network storage/sync and other common personal-use SaaS products.  No rocket science here: just a consolidation of lots of "which one is best for me" research, "follow the tutorial" efforts, and Google and log-file troubleshooting, to explain how to install, configure and maintain the types of services you get "for free" from Google, Apple, Dropbox and the rest.

This article is an overview of how to replace the important Big SaaS services; it is not a detailed step-by-step with every command listed.  I reference a number of other web pages and tutorials to help with the harder parts.

Overview

Here is a basic overview of the substitutions:

Service                   Before                                After
Hosting and OS            Google, Apple, Microsoft, Yahoo, ...  Digital Ocean "Droplets" / Linux
Email                     Google, Apple, Microsoft, Yahoo, ...  postfix, dovecot
Contacts                  Google, Apple                         davical
Calendar                  Google, Apple                         davical
Network storage and sync  Dropbox, Copy, Google Drive           ownCloud

The aspirational criteria I had for the substitutions were:
  • Open source
  • Supported with apt-get or similar installer with an up-to-date stable version available
  • At least some recent community activity and support
  • Positive reviews, particularly versus their popular commercial alternatives
  • Free or close to it
  • Targeted solutions, not one package providing many services (e.g., MS Exchange vs Postfix)
It's also important to keep in mind that these solutions generally won't be as good as their popular commercial alternatives, where armies of developers and systems administrators support the services while taking advantage of big economies of scale and underpricing.  To take this path you're going to forfeit convenience, better usability, rock-solid systems and uptime, macro-level security, and "free" pricing in exchange for greater privacy and control.

Lastly, there are many more areas that could be substituted which I've not done or written up yet - I note at least some of them at the bottom of the article.

What's Required From You

You'll need the following to get this working:
  • Basic Unix shell commands and configuration-file editing
  • Willingness to read various tutorials and how-tos, and to google for the rest
  • Willingness to pay $5 per month for hosting and another $1 per month for backups
  • Acceptance of a total data footprint of 15GB or less (or willingness to pay for more storage)
  • A basic understanding of SSL certificates (useful, not essential)

1. Create an SSH key

Follow Digital Ocean's tutorial to create your own key.
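The generation itself is one command; a minimal sketch, assuming an RSA key (the file path and comment are placeholders, and -N "" creates an unencrypted key just for the sketch - use a real passphrase in practice):

```shell
# Generate a 4096-bit RSA keypair for logging in to the droplet.
mkdir -p ~/.ssh
ssh-keygen -t rsa -b 4096 -C "you@yourdomain.example" -f ~/.ssh/id_rsa_droplet -N ""

# The public half is what you paste into Digital Ocean's control panel:
cat ~/.ssh/id_rsa_droplet.pub
```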

2. Have a domain name ready to use

There are many companies that offer domain registration.

3. Hosting

Set up an account with Digital Ocean (digitalocean.com).  Their basic IaaS virtual server ("Droplet") is cheap and plenty performant for our uses here, and their management and provisioning interface is pleasantly usable.

Buy the cheapest droplet at $5 per month (1 CPU, 512MB RAM, 20GB Disk, 1TB transfer).  This will provide plenty of horsepower and space for the average user.

You might select "Amsterdam" as your region if you think that might provide a safer environment for your data than hosting based in the USA (Digital Ocean's other sites are in New York and San Francisco).

Select OS "Ubuntu 12.04 x64".  You could probably safely use the newer versions, I've just not moved up to them yet.

Install the SSH key you created in step 1.

Enable "VirtIO" if it's offered - it's a set of paravirtualized disk and network drivers that improve I/O performance in the VM.

After your new virtual server is created, activate automatic backups for it.  They may only be taken about once per week but they're a bargain at $1 per month.

Set up your new domain name to point to your new droplet's IP address.  Digital Ocean's DNS interface is easier than GoDaddy's.  Configure your domain to use Digital Ocean's DNS.

NOTE: The only thing I don't like about Digital Ocean for hosting is there is no apparent way to cost effectively scale just disk size.  I'd like to keep the memory and CPU of the smallest instance but then easily scale up disk space.  Replacing network storage and big IMAP email archives will exceed the 20GB limit for "power" users.  There are plenty of other providers and some allow a low-performance-high-disk-space specification.  However, among the usual suspects like Amazon and Rackspace along with a number of others I found googling around, I didn't find any in the same price range as Digital Ocean.  Maybe Digital Ocean will add the feature of cost effectively adding disk space only in the future.

4. Basics

Verify you can log in as root using ssh and the SSH key you created.

Restrict root login to allow key-based logins only.
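In /etc/ssh/sshd_config, two directives do the work (on Ubuntu 12.04, "without-password" is the spelling that means key-only root logins; restart ssh after editing):

```shell
# /etc/ssh/sshd_config - relevant lines only
PermitRootLogin without-password    # root can still log in, but only with a key
PasswordAuthentication no           # no password logins for anyone
# then: service ssh restart
```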

Create a new user that you'll use to do most work from here forward.

Enable new user for sudo use.
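On Ubuntu both steps are one command each ("alice" is a placeholder username; run as root):

```shell
adduser alice            # prompts for a password and user details
usermod -aG sudo alice   # Ubuntu's "sudo" group grants sudo rights
```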

Install zsh (or your preferred shell, if it's not already present) and make it your default login shell.

Create/deploy another SSH key for the new user you've created.

Install ntp.

Set up iptables as your firewall.  Digital Ocean has a good tutorial.
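As a sketch of where that tutorial ends up, the ruleset only needs to open the ports used later in this article (adapt before using; run as root):

```shell
# Default-deny inbound; allow loopback and established traffic back in.
iptables -F
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
# Only the ports the services below need:
iptables -A INPUT -p tcp --dport 22 -j ACCEPT    # ssh
iptables -A INPUT -p tcp --dport 80 -j ACCEPT    # http (admin UIs, ownCloud)
iptables -A INPUT -p tcp --dport 443 -j ACCEPT   # https
iptables -A INPUT -p tcp --dport 25 -j ACCEPT    # smtp (postfix)
iptables -A INPUT -p tcp --dport 587 -j ACCEPT   # smtp submission
iptables -A INPUT -p tcp --dport 993 -j ACCEPT   # imaps (dovecot)
iptables -P INPUT DROP
```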

5. Supporting applications

Before we get to the applications we want, we have to install their supporting applications.

Install postgres - used by davical

Install MySQL - used by ownCloud

Install Apache and PHP - used by almost everything

Install phppgadmin - used to administer the Postgres / davical database

Install phpmyadmin - used to administer the MySQL / ownCloud database
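On Ubuntu 12.04 all of these come from the standard repositories; a sketch (package names can vary slightly between releases):

```shell
apt-get update
apt-get install postgresql                         # used by davical
apt-get install mysql-server                       # used by ownCloud
apt-get install apache2 php5 libapache2-mod-php5   # used by almost everything
apt-get install phppgadmin phpmyadmin              # database admin front-ends
```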

6. Create a free SSL Certificate and install it

The certificate will be used by a number of services we install.

Use this tutorial at arstechnica to create a free Class 1 SSL certificate with startssl.com.

Tips:
  • startssl.com creates an S/MIME and authentication certificate and automatically installs it in your browser.  You might want to save the authentication certificate someplace secure.
  • The certificate is only good for one year - remember to renew it annually (all services dependent on a valid SSL cert will stop working when the cert expires)

7. Email

Note: I don't typically use webmail, so I didn't bother installing a webmail service.

Install postfix - see Digital Ocean tutorial

Install dovecot - also see Digital Ocean tutorial, my user comments on dovecot

Update DNS MX record.
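Once the record is in, verify it from any machine with dig (yourdomain.example is a placeholder):

```shell
dig +short MX yourdomain.example
# should print your droplet's mail host, e.g. "10 mail.yourdomain.example."
```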

Adjust iptables firewall settings - see Digital Ocean tutorial

Tips:
  • I found "apt-get install mail-stack-delivery" did the heavy lifting for me here.
  • Make sure you un/comment out exactly what you want in /etc/postfix/master.cf
  • I increased mail_max_userip_connections from 10 to 30 in /etc/dovecot/conf.d/01-mail-stack-delivery.conf after an IMAP connection-limit error kept popping up in OS X Mail.
  • Digital Ocean has subsequently created a tutorial for iRedMail - looks easier to set up and includes a webmail interface
Note: I've not added spam filtering yet.

8. Contacts and Calendar

Install davical.

I looked at and discounted the following:
  • calendarserver - depends on extended file attributes; apt-get exists but doesn't appear to be maintained
  • radicale - no backoffice, feels too barebones
  • baikal - no apt-get; Synology's choice for their sync app
  • ownCloud - ownCloud already looks bloated

9. Network storage and sync

Install ownCloud.

The goal here is secure and pervasively available files, like Dropbox and the paid version of BoxCryptor - both of which are closed source and therefore non-starters against my stated criteria.

You can create an encrypted filesystem on your main OS (ideally one that can be used by several OSs) and place it in ownCloud's network-synced storage.  When choosing a filesystem, it's important that the encrypted filesystem stores data as separate files or some type of chunks, not one big blob (like TrueCrypt), as big blobs don't sync well when you have concurrent clients syncing.  Ideally you want a filesystem that encrypts file names, content, and inode structures separately in small efficient pieces.  While interesting, I'm seeing enough limitations and sync problems with OS X's encrypted sparse bundle approach that I don't recommend it (use EncFS if you can; else use BoxCryptor even though it's closed source).
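As a sketch of the EncFS route (paths are placeholders): the encrypted directory lives inside the ownCloud-synced folder, the clear-text view is mounted locally, and only the encrypted per-file pieces ever sync.

```shell
apt-get install encfs            # Ubuntu; OS X builds are also available

# First run creates the encrypted store and asks for a passphrase.
# ~/ownCloud/vault holds the encrypted files (one per plaintext file);
# ~/private is the decrypted view - keep it out of synced storage.
encfs ~/ownCloud/vault ~/private

# ...work in ~/private as usual...

fusermount -u ~/private          # unmount when done (umount on OS X)
```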

iOS and Android Support

The above approach is fully supported by iOS and Android devices using standard protocols:
  • Managing email via secure IMAP
  • Sending mail via secure SMTP
  • Calendar via CalDAV over https
  • Contacts via CardDAV over https
  • Network storage and sync via the ownCloud iOS/Android apps, over https
This probably goes without saying, but assume you'll lose your device at some point.  Think about what is on the device and how easy it is to access it.  Do you use a PIN with a self-destruct after so many incorrect entries?  Do you have logins and passwords in Contacts or Notes files?

Maintenance Notes

You will have to renew your startssl.com security certificate each year.
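A small cron-able check can warn you before the cert lapses; a sketch using openssl (the cert path and addresses are placeholders):

```shell
# cert_ok_for_days CERTFILE DAYS
# Succeeds (exit 0) if the certificate is still valid for at least DAYS more
# days, fails otherwise - suitable for a cron job that mails you a warning.
cert_ok_for_days() {
    openssl x509 -checkend $(( $2 * 24 * 3600 )) -noout -in "$1"
}

# Example cron usage:
# cert_ok_for_days /etc/ssl/private/yourdomain.pem 30 \
#     || echo "renew the startssl.com cert" | mail -s "cert expiring" you@yourdomain.example
```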

Spin up the occasional backup on another droplet to verify that the backups and the restore process work.

Security Notes

Nothing is 100% secure.  The approach I've presented here has two big problems:
  • Hordes of security specialists at the big companies will collectively know more about security than you or I ever will.  Security exploits of fairly new and not widely used applications like ownCloud and davical are possible.  You're therefore trading thousands of staff at the big SaaS providers (and the government) having access to your data for reliance on common-sense security basics to stay safe.  In this case, we've done the basics:
    • We're running the iptables firewall with only the bare minimum of ports open
    • All communications run over SSL
  • We're not storing the actual data on the server in an encrypted format.  Ideally we'd use an encrypted filesystem on the server so that the hosting provider couldn't snoop disk data.  Of course, decrypting "on the fly" as applications access the encrypted disk is also a risk, but without using your own secured physical server you are stuck with that problem.
I've not yet installed openvpn.  It could restrict access to potentially vulnerable apps like davical's backoffice, phpmyadmin and phppgadmin to VPN-only access.  I did add .htaccess/.htpasswd files across the backoffices for slightly better security.

Lastly, this is pretty obvious, but use long passwords with lots of variation between passwords and a mix of letters (upper/lower), numbers, and symbols.
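openssl, already installed for the certificate work, can generate passwords of exactly this kind:

```shell
# 18 random bytes base64-encode to 24 characters mixing upper/lower-case
# letters, digits, and the symbols + and /.
openssl rand -base64 18
```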

Conclusion 

Google, Apple, Dropbox and others provide a great no/low cost option for services like email, personal information management and network storage.  Signing up for an account with Google is a lot easier and cheaper than the approach outlined above.  You get most of these services "for free".  So if the thought of Google, Apple, Dropbox and others reading your emails and documents and enabling governments to do likewise doesn't bother you at all, then by all means use their free services.

However, if you think you have a right to personal information privacy without business and governments having the ability to read it then you might want to consider implementation of the approach in this tutorial.

What have I missed and what has worked well for you?

19 March 2011

Tightening the Definition of SaaS and Cloud

I've recently been exposed to two vendors offering "cloud" and "SaaS" options to replace two in-house legacy enterprise/corporate (not customer facing production) systems.

In this process, I connected some mental dots that there are really a few flavors of SaaS, and the distinction is quite important with respect to enterprise architecture.

The two service offerings can be roughly thought of in this way:
  • The offerings were touted as SaaS and cloud
  • New software that is better than our current in-house legacy systems (regardless of whether we host or they are "in the cloud")
  • The software is hosted by the software provider; it's unknown what type of "cloud" IaaS, if any, sits under that provider (perhaps just virtualization in their own DC).
  • The software instance is spun up by the provider specifically for us.  It is a copy of the software, dedicated to us.
  • The software can be extended a lot - add-on modules can be activated through configuration changes, bespoke modules/code can be added.  Kinda-sorta like a pick-and-mix or evolving PaaS model
  • Software upgrades must be rolled out with associated consideration of any bespoke changes that have been made.
  • Security restricted to only be available within your corporate intranet
  • Flat monthly rate per user charging model with volume (# of users) price breaks
As the two service reviews went on, the dots finally connected, and I realized I had been *marketed* to more effectively than I'd like to admit.

The above isn't "cloud" or SaaS, at least not with the definition I'm going to take here.  It is actually a hosted managed service offering (MSP or ASP).  At best it's a halfway-house to cloud and SaaS.  All you've really done with this approach is shift some techops and infrastructure responsibilities from in-house to the service provider and reduced your in-house economies of scale (assuming you have to maintain those skills).

For something to be a cloud/SaaS offering in my terms, here is what it needs to be:
  • Public Internet facing
  • One centralized installation shared by many customers
    • Powering the service is an IaaS
    • Can quickly scale up/down with virtually no cost to make the change (costs changing proportional to increased/decreased usage)
    • Horizontal fault tolerance design (HW redundancy becomes irrelevant)
  • Focused offering
    • Service addresses a specific functional requirement, it isn't an omnibus offering
    • Vibrant user community making suggestions of how to improve the product
    • Quick time to market for new features
    • Strong product management and vision
  • Product improvements put live appear immediately for all customers
    • One exception: a "beta" version may be opted in to by the customer - but under the customer's control, not the vendor's
    • No rolling upgrades for each customer once a new release is ready
  • A complete set of APIs ("API as a storefront")
    • Almost all functionality available via the application is available via API
    • Well documented
    • Hardened (API security, rate limits, et al)
    • Ready for mash-up integration with other focused offerings
  • Usage based billing
    • Proportional to amount of computation, storage, and connectivity you use (IaaS transparency)
    • Additionally factoring in the value of the SaaS itself
    • No billing related to seats, users, or CPU cores
In noting the difference between the two, I'm not advocating one or the other.  The choice of course depends on circumstances and strategy.  I'm also making no effort to address the common enterprise concerns of cloud such as security, data ownership, and business continuity.  However, I do have a very strong view which way the IT world is going and given the choice, I know which I'd select.

17 March 2010

QCon London 2010 - Cloud Computing

Cloud computing and virtualization was a popular topic at QCon London 2010.

Background/primer/proposition:
  • Cloud marketing suggests that hardware and/or systems administration is now a commodity that you shouldn't have to think about too much and can safely outsource. 
  • Just like TDD (Test Driven Development) decreases the need for QA, CI (Continuous Integration) with direct deployments into an operational environment will decrease the need for systems administration.
  • Outsourced pay-as-you-use cloud propositions will likely cause budgeting for computing capacity to shift from capex to opex (traditionally HW and SW sat in capex)
  • Grossly simplifying, there are four interesting cloud propositions available:
    • In-house hardware virtualization - cloud under your control, in your data centre (e.g., VMware, Xen, Solaris Zones)
    • Outsourced hardware virtualization (IaaS - Infrastructure as a Service) - cloud as an "infinite capacity" of generic computing where you define the systems from the OS up (e.g., Amazon's AWS EC2)
    • Outsourced compute capacity (PaaS - Platform as a Service) - cloud as a place to deploy software components into a fairly tightly defined (constrained) operating environment (e.g., Google's App Engine)
    • Pure services (SaaS - Software as a Service) - cloud as a source of "commoditized" services to be used when you construct an application (e.g., Google's web analytics, Facebook OAuth API for user credential management, AWS's S3 for storage)
  • Cloud means that you can cost effectively create and delete computing resources as needed for parts of your IT environment that don't require regular use.  For example testing and in particular load testing.
  • Non-tech business types get excited by cloud because:
    • If you're an entrepreneur type, you get bonus points for running your infrastructure from the cloud when looking for funding (more so in the last two years; this is declining some now)
    • Finance and P&L owners get excited any time they can commoditize something to drive down costs.  Tech has mixed feelings about this, as "drive down costs" tends to imply redundancies.
    • Easier to justify upfront costs for a new business case if you only pay for what you use (a failure is easy to delete, no sunk capex expenditures)
  • Both tech and non-tech types get excited about not having to generate a lot of paperwork and then wait for authorizations and shipping times to get new kit.  Assuming company bureaucracy doesn't shackle down cloud controls too vigorously, a new virtual platform can be made available very quickly and at low cost.
  • If you can maximize utilization of HW you buy, then it's no different than buying cloud resources (likely cheaper)
General Observations on Cloud and Virtualization

Virtualization enables us to achieve that solutions-architecture ideal of "one box, one purpose" - it's just that it's become "one virtual box, one purpose".

Virtualization also lets applications that lack a good threading model take advantage of boxes with many cores: run one application per VM and add VMs until all the cores are utilized.

Cloud does imply a lack of control over your core infrastructure.  Do you need this control?

The cloud is still just a bunch of hardware systems in a data centre.  There is no magic.  Their DC and systems admins will have their share of problems as well.  If the cloud sysadmins can provide more uptime than your own techops can provide at a similar cost point, the argument for cloud increases.

Similarly, there is debate over how good the SLAs are for cloud.  But really, how enforceable are the SLAs you have anyway?

Your choice of virtualization or cloud will enforce a way of creating applications and handling services.  You may not like it.  Conversely, it may force you to be disciplined in a new way otherwise missing when you create applications.

You will make an investment to learn the systems and make your applications work in the cloud environment.  This will cost and create some lock-in.  This is more true for PaaS than IaaS.

The cloud is being used to "long tail" a number of services.  Service "particles" are appearing you can use to provide an aspect of functionality in your overall solution.  The more of these partners you use that are in the same cloud with you, the greater the efficiencies and hence lower costs.  Combined with first mover advantage and vendor lock-ins, this is a network effect that should drive toward having just a few cloud suppliers in a few years.

Relating Cloud to Internet Gambling Business

The use of an in-house cloud like VMware makes good sense.  We're regularly adding new products that need to undergo development and test, yet we don't need permanent capacity to service these requirements.  While a VMware setup can't fully proxy a production environment (unless you use VMware in production as well), it is very suitable for most types of functional verification other than load and low-level device compatibility.

Being able to hand over the keys to a set of virtualized servers enables more entrepreneurial behavior.  For example, if you have a larger business with a heavy layer of process, you can still work effectively with start-up partners.  Give them the keys to their own set of systems and they can do whatever they want with them without impacting your core systems.  Once they've proven successful, their revenue stream can justify improved risk management.

Handling flash crowds with cloud probably isn't possible for our industry today.  In-house clouds don't really handle flash crowds (why not just have the capacity there anyway? what do you want to cripple to support that big marketing campaign?).  Outsourced cloud generally isn't possible, as the bigger cloud providers may not allow internet gambling to be run within their clouds (an AWS restriction anyway; and yes, this will likely ease up at some point - just look at Akamai's behavior on Internet gambling).  Also, a CDN (Content Distribution Network; a SaaS of sorts) will take care of a lot of the flash-crowd load we experience.

Using an outsourced cloud PaaS for data analytics doesn't seem likely.  Data analytics crunching benefits from close proximity to the data set being crunched.  Paying for the bandwidth to upload big data sets into the cloud from locations with high connectivity costs (lots of internet gambling runs from offshore locations with expensive ISP costs) doesn't make sense.

SaaS however is quite interesting.  Services like Google Analytics that enable almost real-time data analysis are clearly the way to go for an Internet gambling site.  Highly bespoke business analytics will likely stay inside the business or use a SaaS for commodity analytics.  

Depending on who you ask, the following may be real risks or just FUD:
  • Taxation - as services are sourced from someplace other than the tax advantaged place you have your business in, you are at risk of emerging taxation implications
  • Centralized point for governments to enforce legal compliance.  By hosting in the cloud (which is actually going to be one or more physical data centres), you've given the governments that have oversight of those data centres a good choke point to use against you.  They could use taxation, inappropriate content, or services not in compliance with regulation.
Conclusion

Virtualization makes complete sense for Internet gambling companies, all the way from development through to production.  That's not news, most in our sector have been using virtualization for a few years now.

On Cloud/IaaS provisions, AWS (a clear IaaS market leader) have flatly disallowed any internet gambling related operations inside their service.  While it is likely you could get away with internal use (dev, test) of cloud in these services, do you want to create a dependency and then have it suddenly shut off on you?  AWS of course isn't the only show in town for IaaS.  There are other providers - you would have to evaluate them against related risk factors and the re-development costs of integrating their use into your environment.

There is no clear use yet of Cloud/PaaS for standard Internet gambling products.

There are plenty of emergent opportunities to use Cloud/SaaS for Internet gambling.

(Index of emergent technologies applied to Internet Gambling)