
01 April 2018

How to extract a contact list from BambooHR and import it into Google Contacts

BambooHR is an HR SaaS used to manage employees.  It is typically used as a system of record for employee details.  Unfortunately, BambooHR doesn't make it easy for your digital address book to use the contact information it contains.  Two ways the product could be improved to help its users in this respect:
  • Offer a CardDAV service.  This is by far the best option: it would let you connect iOS or OS X Contacts directly to BambooHR, leaving Bamboo in charge of employee data with minimal messing about.
  • Export contacts in a friendly format:
    • Google Contacts' CSV format (which implies very specific column requirements: inclusion, naming and order)
    • Apple's VCF (contact card) format
As neither option exists, we'll fake our way to the second one by using BambooHR's ad-hoc reporting tool to create a CSV file that we'll eventually import (with changes) into Google Contacts.

1. Create a BambooHR report that includes the employees and fields you want.  Download it in CSV format.

Here is an example of some of the fields I selected using BambooHR's custom reporting tool:



2. Enter Google Contacts.  Create a single contact that has sample data in all the fields you want to have available for your contacts.  Data that doesn't map directly from BambooHR can be stored in custom fields in Contacts.  For example, I created a custom Contacts field for employee start date:


I used other custom fields for line manager and office location.

Now export a CSV version of the one contact record you've created.

3. Open the Contacts CSV file in OS X Numbers (Excel, at least on OS X, doesn't cope with UTF-8 encoding and trashes special characters).  You'll note many unused column headers.  Their inclusion, names and order are important - don't change any of them.  Google Contacts is really picky when it imports data and generally won't work if you don't have the exact columns it expects.

Some of the columns will have static values.  For instance, I ended up with a "Relation 1 - Type" column in which I had to repeat "Line Manager" for every row.

4. Open the BambooHR CSV file in OS X Numbers.  Remove the one test record from the Google Contacts file.  Drag the columns from the BambooHR CSV file into the Google Contacts CSV file, making sure you preserve column names and order.

Note: You MUST preserve column inclusion, names and order for importing to work.  Most of the problems and unexpected results I had with this process were because I hadn't done so.

If you've exported any dates (like employee start date) from BambooHR, you'll need to change them from the default MM/DD/YYYY to YYYY-MM-DD, which is what Google Contacts import requires.
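
If you'd rather not fix dates by hand in Numbers, a small script can do the conversion.  This is a minimal sketch - it rewrites any MM/DD/YYYY string it finds and leaves everything else alone:

```python
import re

def fix_dates(value):
    """Rewrite any MM/DD/YYYY date as YYYY-MM-DD; other values pass through."""
    return re.sub(r"\b(\d{2})/(\d{2})/(\d{4})\b", r"\3-\1-\2", value)

print(fix_dates("04/01/2018"))  # -> 2018-04-01
```

Apply it to the date cells (or pipe the whole CSV through it) before importing.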

5. Export the results from Numbers as a new CSV file.  At this point you should have a new CSV file that is a merge of Google Contacts columns and all contact data from BambooHR.  Many of the columns will be empty.
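
The column-preserving merge in steps 3-5 can also be scripted.  The sketch below assumes a hand-maintained mapping from BambooHR report headers to Google's headers - the names in COLUMN_MAP are illustrative, not BambooHR's or Google's definitive naming:

```python
import csv
import io

# Illustrative mapping from BambooHR report headers to Google Contacts headers
COLUMN_MAP = {
    "First Name": "Given Name",
    "Last Name": "Family Name",
    "Work Email": "E-mail 1 - Value",
    "Mobile Phone": "Phone 1 - Value",
}

def merge(template_csv, bamboo_csv):
    """Emit rows holding BambooHR data but with exactly the Google template's
    columns, in the template's order; unmapped template columns stay empty."""
    template_cols = next(csv.reader(io.StringIO(template_csv)))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=template_cols)
    writer.writeheader()
    for row in csv.DictReader(io.StringIO(bamboo_csv)):
        mapped = {COLUMN_MAP[k]: v for k, v in row.items()
                  if k in COLUMN_MAP and COLUMN_MAP[k] in template_cols}
        writer.writerow(mapped)
    return out.getvalue()
```

Feed it the header row exported from your single Google test contact plus the BambooHR report, and it keeps column inclusion and order intact by construction.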

6. Before you import anything, take a backup of your existing contacts.  You may also decide to delete the contacts you already have in place before importing.  Google Contacts provides a merge function which works pretty well, but I've found it's better to back up and delete my current contact set and start with an empty upload area.  In particular, we might hand a mobile phone number from a previous employee to a new one, meaning two employees would share the same number if I didn't clear out the old entries.  If you've made changes to the original contacts or added notes to them, though, deleting isn't an option - you'll need to rely on a merge.

At this point you have a backup (if you want one) and you've decided whether you're starting with a clean slate or will merge.

7. Import your new CSV contacts list into Google Contacts.

Run Google Contacts merge if you need to.

8. Mark the newly imported set to be included in “My Contacts” (by default it isn’t).

9. Assuming you've connected your Google Contacts to the OS X and iOS Contacts apps, the new contacts may take a few minutes to appear.

And that's it.  Depending on how often you add/change/delete employees in BambooHR, you'll have to repeat this every few months to pick up the changes.  You can help colleagues who would benefit from your fancy new contact list by exporting VCF (for OS X, iOS) or Google CSV format files from Google Contacts so others can import them.

---

Postnote: BambooHR has updated some of their default report column header names to match Google Contacts' header naming convention.  If they also allowed user-defined column names and an optional static value for custom columns in their ad-hoc reporting tool, that would be really useful: you could build the CSV file in exactly the format Google Contacts import needs.

25 August 2013

Replacing Big SaaS - How to cut the Google, Apple, Dropbox, Microsoft, ... cords

With a PRISM- and Snowden-inspired kick in the backside, I finally got around to establishing some autonomy from the Big Boys for email, contacts, calendar, network storage/sync and other common personal-use SaaS products.  No rocket science here - just a consolidation of lots of "which one is best for me" research, "follow the tutorial" efforts, and Google and log-file problem solving to explain how to install, configure and maintain the types of services you get "for free" from Google, Apple, Dropbox and the rest.

This article is an overview of how to replace the important Big SaaS services; it is not a detailed step-by-step with every command listed.  I reference a number of other web pages and tutorials to help with the harder parts.

Overview

Here is a basic overview of the substitutions:

Service | Before | After
Hosting and OS | Google, Apple, Microsoft, Yahoo, ... | Digital Ocean "Droplets" running Linux
Email | Google, Apple, Microsoft, Yahoo, ... | postfix, dovecot
Contacts | Google, Apple | davical
Calendar | Google, Apple | davical
Network storage and sync | Dropbox, Copy, Google Drive | ownCloud

The aspirational criteria I had for the substitutions were:
  • Open source
  • Supported with apt-get or similar installer with an up-to-date stable version available
  • At least some recent community activity and support
  • Positive reviews, particularly versus their popular commercial alternatives
  • Free or close to it
  • Targeted solutions, not one package that is providing many services (e.g., MS Exchange vs Postfix)
It's also important to keep in mind that these solutions generally won't be as good as their popular commercial alternatives, which are backed by armies of developers and systems administrators and take advantage of big economies of scale and underpricing.  To take this path you're going to forfeit convenience, better usability, rock-solid systems and uptime, macro-level security, and "free" pricing in exchange for greater privacy and control.

Lastly, there are many more areas that could be substituted which I've not done or written up yet - I note at least some of them at the bottom of the article.

What's Required From You

You have to be able to do the following to get this working:
  • Basic Unix shell commands and configuration file editing
  • Willingness to read various tutorials and how-tos and be able to google for the rest
  • Willingness to pay $5 per month for hosting and another $1 per month for backups
  • Accept having a total data footprint of 15GB or less (or be willing to pay for more storage)
  • A basic understanding of SSL certificates is useful

1. Create an SSH key

Follow Digital Ocean's tutorial to create your own key.

2. Have a domain name ready to use

There are many companies that offer domain registration.

3. Hosting

Set up an account with Digital Ocean (digitalocean.com).  Their basic IaaS virtual server ("Droplet") is cheap, plenty performant for our purposes, and their management and provisioning interface is pleasantly usable.

Buy the cheapest droplet at $5 per month (1 CPU, 512MB RAM, 20GB disk, 1TB transfer).  This provides plenty of horsepower and space for the average user.

You might select "Amsterdam" as your region if you think it provides a safer environment for your data than hosting based in the USA (Digital Ocean's other sites are in New York and San Francisco).

Select OS "Ubuntu 12.04 x64".  You could probably safely use a newer version; I've just not moved up yet.

Install the SSH key you created in step 1.

Enable "VirtIO" if you want.  It's a paravirtualised I/O interface that improves disk and network performance.

After your new virtual server is created, activate automatic backups for it.  They may only be taken about once per week, but they're a bargain at $1 per month.

Set up your new domain name to point to your new droplet IP address.  Digital Ocean's DNS interface is easier than godaddy's.  Configure your domain to use Digital Ocean's DNS.

NOTE: The only thing I don't like about Digital Ocean for hosting is there is no apparent way to cost effectively scale just disk size.  I'd like to keep the memory and CPU of the smallest instance but then easily scale up disk space.  Replacing network storage and big IMAP email archives will exceed the 20GB limit for "power" users.  There are plenty of other providers and some allow a low-performance-high-disk-space specification.  However, among the usual suspects like Amazon and Rackspace along with a number of others I found googling around, I didn't find any in the same price range as Digital Ocean.  Maybe Digital Ocean will add the feature of cost effectively adding disk space only in the future.

4. Basics

Verify you can log in as root using ssh and the SSH key you created.

Restrict root login to key-based authentication only.

Create a new user that you'll use to do most work from here forward.

Enable new user for sudo use.

Install zsh (or your preferred shell, if it's not already present) and make it your default login shell.

Create and deploy another SSH key for the new user you've created.

Install ntp.

Install iptables as your firewall.  Digital Ocean has a good tutorial.

5. Supporting applications

Before we get to the applications we want, we have to install their supporting applications.

Install postgres - used by davical

Install MySQL - used by ownCloud

Install Apache and PHP - used by almost everything

Install phppgadmin - used to administer the Postgres / davical database

Install phpmyadmin - used to administer the MySQL / ownCloud database

6. Create a free SSL Certificate and install it

The certificate will be used by a number of services we install.

Use this tutorial at arstechnica to create a free Class 1 SSL certificate with startssl.com.

Tips:
  • startssl.com creates an S/MIME and authentication certificate and automatically installs it in your browser.  You might want to save the authentication certificate someplace secure.
  • The certificate is only good for one year - remember to renew it each year, as all your services dependent on a valid SSL cert will stop working when it expires.

7. Email

Note: I don't typically use webmail, so I didn't bother installing a webmail service.

Install postfix - see Digital Ocean tutorial

Install dovecot - also see Digital Ocean tutorial, my user comments on dovecot

Update DNS MX record.

Adjust iptables firewall settings - see Digital Ocean tutorial

Tips:
  • I found "apt-get install mail-stack-delivery" did the heavy lifting for me here.
  • Make sure you un/comment out exactly what you want in /etc/postfix/master.cf
  • I increased mail_max_userip_connections from 10 to 30 in /etc/dovecot/conf.d/01-mail-stack-delivery.conf due to an IMAP connection-limit error popping up in OS X Mail.
  • Digital Ocean has subsequently created a tutorial for iRedMail - looks easier to set up and includes a webmail interface
Note: I've not added spam filtering yet.

8. Contacts and Calendar

Install davical.

I looked at and discounted the following:
  • calendarserver - depends on extended file attributes; an apt-get package exists but doesn't appear to be maintained
  • radicale - no backoffice, feels too barebones
  • baikal - no apt-get package; Synology's choice for their sync app
  • ownCloud - already looks bloated

9. Network storage and sync

Install ownCloud.

The goal here is secure and pervasively available files.  Like Dropbox and the paid version of BoxCryptor - both of which are closed source and therefore non-starters with my stated criteria.

You can create an encrypted filesystem on your main OS - ideally one that can be used by several OSs - and place it in ownCloud's network-synced storage.  When choosing a filesystem, it's important that the encrypted filesystem stores its data as separate files or some type of chunks, not one big blob (like TrueCrypt), as big blobs don't sync well when you have concurrent clients syncing.  Ideally you want a filesystem that encrypts file names, content and inode structures separately in small, efficient pieces.  While interesting, I'm seeing enough limitations and sync problems with OS X's encrypted sparse bundle approach that I don't recommend it (use EncFS if you can; else use BoxCryptor even though it's closed source).

iOS and Android Support

The above approach is fully supported by iOS and Android devices using standard protocols:
  • Managing email via secure IMAP
  • Sending mail via secure SMTP
  • Calendar via CalDAV over https
  • Contacts via CardDAV over https
  • Network storage and sync via the ownCloud iOS/Android apps, over https
This probably goes without saying, but assume you'll lose your device at some point.  Think about what is on the device and how easy it is to access it.  Do you use a PIN with a self-destruct after so many incorrect entries?  Do you have logins and passwords in Contacts or Notes files?

Maintenance Notes

You will have to renew your startssl.com security certificate each year.
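
A small script run from cron can warn you before the certificate lapses.  This is a sketch, not specific to startssl.com - it just reads the expiry date off any TLS endpoint (the hostname you pass in would be your own domain):

```python
import datetime
import socket
import ssl

def cert_not_after(host, port=443):
    """Fetch the notAfter expiry string from a server's TLS certificate."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port)) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]

def days_left(not_after, now=None):
    """Days until expiry, given a notAfter string like 'Apr 01 12:00:00 2026 GMT'."""
    expiry = datetime.datetime.strptime(not_after, "%b %d %H:%M:%S %Y %Z")
    return (expiry - (now or datetime.datetime.utcnow())).days
```

Email yourself when days_left(cert_not_after("yourdomain.example")) drops below 30 or so.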

Spin up the occasional backup on another droplet to verify that your backups and the restore process work.

Security Notes

Nothing is 100% secure.  The approach I've presented here has two big problems:
  • Hordes of security specialists at the big companies will collectively know more about security than you or I ever will.  Security exploits of fairly new and not widely used applications like ownCloud and davical are possible.  You're therefore effectively trading off thousands of staff at the big SaaS providers or the government having access to your data against relying on common-sense security basics to stay safe.  In this case, we've done the basics:
    • We're running the iptables firewall with only the bare minimum of ports open
    • All coms over SSL
  • We're not storing the actual data on the server in an encrypted format.  Ideally we'd use an encrypted filesystem on the server so that the hosting provider couldn't snoop disk data.  Of course, decrypting "on the fly" as applications access the encrypted disk is also a risk, but without using your own secured physical server you are stuck with that problem.
I've not yet installed openvpn.  With it, I could restrict access to potentially vulnerable apps like davical's backoffice, phpmyadmin and phppgadmin to VPN-only.  I did add .htaccess/.htpasswd files across the backoffices for slightly better security.

Lastly, this is pretty obvious, but use long passwords, with lots of variation between passwords and a mix of upper/lower-case letters, numbers, and symbols.

Conclusion 

Google, Apple, Dropbox and others provide a great no/low cost option for services like email, personal information management and network storage.  Signing up for an account with Google is a lot easier and cheaper than the approach outlined above.  You get most of these services "for free".  So if the thought of Google, Apple, Dropbox and others reading your emails and documents and enabling governments to do likewise doesn't bother you at all, then by all means use their free services.

However, if you think you have a right to personal information privacy without business and governments having the ability to read it then you might want to consider implementation of the approach in this tutorial.

What have I missed and what has worked well for you?

25 March 2013

Employee Contact, Profile and Directory Information (via Google Apps for Business)

After spending the usual unexpectedly long period of time to figure out the structure of Google Directory, Contact and Profile management for Google Apps for Business, I thought I'd share a summary of how it ties together.  I also provide references to help you figure out synchronising contact info and a few thoughts about the risks of using Google Plus Profiles.

(Please note this blog entry is about using Contacts, Profiles, and Directory in a business context, not for personal use.)

The Basics

If you just want to make Google Apps and Contacts work, read this section.  Basically, you will need to learn about Google Contacts and Google Plus (Profiles/About).

  1. Log in to your Google company account (YourName@yourcompany.com) to get started (e.g., via your Contacts page)
  2. If you're not already there, select Contacts from the top horizontal Google services menu.  You'll note "My Contacts" in the left-side navigation.  Use the "My Contacts" group to manage your own personal and private contact information.  By default you see a list of contacts you have sent email to or received email from (TBC).  Anything you add to default employee contact information is private to you and is not shared.  Google Contact information can be synchronised with other contact management software/platforms like Apple iOS, Google Android, and Microsoft Outlook.  The level/quality of sync is good and it's consistent across at least OS X, iOS and Android (see below for some helpful links on Contacts sync).
  3. Use Google Plus (select "Profile" in left-side navigation, then select "About" from horizontal menu) to set up your own Google Profile that can be shared with others.  The information in your profile should be work focused (use your own private Google account for personal information).  By default some of the less sensitive information you enter into your Google Plus Profile will be publicly shared - this can be limited through Profile settings.  You should take a look at the public version of your Google Plus Profile to make sure you are comfortable with the level of sharing.  Your Google Plus Profile is not well synchronised with other devices and contact software via standard sync mechanisms.  
  4. Unfortunately, there is no sharing of Google Contact information between people (employees of a company), i.e., a centralised and robust employee directory with employee details.  However, you can use your Google Plus Profile to add and share your contact information with others via Google Plus via web browser access or using Google Plus apps on phones/pads.

Further Details

Google's way of managing employee ("Contact") information is confusing.  Three fundamental collections of contact information (Directory, Contacts, Profile) are distributed across three separate yet partially-integrated access points (Control Panel, Contacts, Plus).  Here is how they tie together:

1. Google Contacts (aka "My Contacts").  Available to all employees.  Contacts are created (I think) in two ways.  First, in the "Directory" list of employees you change or add any field of information (except Circles).  Second, you send or receive an email to/from someone on the Directory list (TBC).  Putting someone from the Directory list into a Circle, I think, sends a "join circles" invite to the employee but doesn't create a Contact.  Contacts are used by employees to store their own private information about other Google Apps users in the company, or any contact information about anyone (employee or external) the employee wants to add.  Employee-created contact information is not seen by or shared with other employees.  The information can be synced with other devices through at least CardDAV (Apple iOS), Android, and Outlook.  Sync quality is good.

2. Google Plus Profile (select "Profile" from left navigation and "About" from horizontal navigation).  Employees can self-maintain information about themselves in their Google Plus Profile.  Profiles are connected to Google Directory and Contacts.  Particularly useful is the "Contact Information" section in the Profile to enter phone numbers.  Google Plus Profiles are not fully integrated with Google Apps.  It appears that Google is planning to use Google Plus Profile (rather than Google Contacts) to share contact information between people, including employees.  Unfortunately, profile information doesn't sync into your other non-Google contact management software or devices.  Instead you just get a link to the user's profile.

3. Directory (aka Directory Profile).  Accessed from the left-side navigation of Contacts.  Available to all employees.  Directory is a list of all users associated with the company/domain in the Google Apps.  The Directory shows the same user list managed by the Administrator in the Google Admin Control Panel.  By default a user's Directory form shows name, physical address, email address, Notes and attached user profiles although you can add any type of contact field you want.  If you enter or change any information on this form (other than a Circles addition/change), a personal private contact in the My Contact list is created as a derivative of the Directory entry.  Directory entries also connect company employees to their Google Plus Profile.

4. Google Admin Control Panel, User Management.  For Google Apps administrators only.  This is where employees are first added to Google Apps by administrators for the business.  The only useful contact information stored here is the employee's name, primary email address and email address aliases (aka "nicknames").  This function is primarily used to add, delete and otherwise administer employees.  Administrators also use the control panel to enable/disable contact sharing (which is enabled by default) and specify related permissions.  Note that "Contact Sharing" is really misleading - it should be called "Directory Sharing", because all it does is expose Directory entries (and related limited information) to your users via the Contacts function.

Using "Search" in Contacts makes this really clear.  You can see how employees and their information are split between the three areas: My Contacts (Contacts); Circles (Profile); Domain Contacts (aka Directory - more confusing naming!).

Tips on Sync

The following are references to instructions to set up sync on various platforms:

  • iOS/iPhone/iPad - Contacts syncing with CardDav.  Don't use the older MS-Exchange way of syncing.  If you have problems, see also:
    • http://support.google.com/mail/bin/answer.py?hl=en&answer=2753077
    • http://support.apple.com/kb/HT4872?viewlocale=en_US
  • OS X
    • Make sure you're running the latest version of OS X
    •  http://www.tuaw.com/2012/09/28/google-now-supports-carddav-making-it-easier-than-ever-to-import/
  • Android phones and tablets - as the others, works fine for contacts but not for Google Plus Profile information
  • Outlook and Blackberry - not tested, but I assume it works fine for Contacts and not for Plus profile information

Risks

Using Google Plus in the enterprise is somewhat risky for several reasons:

  1. Much of the profile and other information is publicly shared by default.  While this can be restricted by users, it can't be restricted at a global administrative level except by turning off all Google Plus access for all users in the domain.
  2. It appears that Google is using Plus and Apps 2013 to create an environment where you may be required to upgrade/buy proper management of the two in the future (so-called Google Plus "Premium" features).
  3. Google Plus itself and the integration of Google Plus into Google Apps is fairly new, not formally part of Google Apps for Business and therefore unsupported at the same level as Google Apps.  You can also expect it to change as the problems outlined here are sorted out over time.
  4. Managing contact information using typical tools (Google Contacts, MS Outlook, OS X and iOS Contacts) and Google Plus Profiles is clumsy, fragmented and requires users to learn a new tool (Google Plus).  Also users are being asked to self-maintain their shared contact information instead of having someone else do it for them.  As a result users may not see sufficient benefits to start using Google Plus.
However, short of a bolt-on Google Apps extension for contact sharing, there is no option other than using Google Plus profiles to share contact information between business employees.

You may also have other reasons to press your users toward Google Plus, such as enhanced employee collaboration, which will encourage the adoption of Google Plus Profiles.

Summary

Google Contacts does not provide a shared/centralised employee information management tool for your business.  CardDav and other types of contacts integration between Google and applications like OS X's Contacts doesn't propagate the Google Plus Profile information.

If your business can tolerate the risks, you can use Google Plus Profiles to have your employees self-manage, centralise and share their details.  You can use conventional browser access, Android, or the Google Plus app on phones/pads to access the resultant shared contact (profile) information.

Until Google has improved employee/contact management, products like MS-Exchange will continue to be the de facto choice for employee information management for larger businesses.

14 August 2012

Dropbox Security, From TrueCrypt to BoxCryptor and 1Password

(If you want to skip the below and just get the recommended answer, go buy Boxcryptor and 1Password on all your platforms.  Job done.)

When Dropbox had various security issues last year (the no passwords required for some hours was the kick I needed to sort my security out), I started using Truecrypt to contain all sensitive material I was keeping in Dropbox.  Truecrypt felt good as it was opensource, free, stable, secure, and reasonably usable on OS X and MS-Win.

While I felt a 1000x better about my security situation, I also lost a lot of the convenience of Dropbox by moving to Truecrypt:
  • File sync.  Truecrypt stores its filesystem in a single file.  While Dropbox is efficient at syncing big files at a block level, it doesn't cope well with changes to that file happening roughly concurrently from two or more locations.  If you mount your Truecrypt filesystem from two or more machines and make even vaguely concurrent changes (within a sync activity for example), you end up with two conflicted Truecrypt files.  One quickly learns to only open the Truecrypt volume on one machine at a time.
  • Multi-platform access.  One thing Dropbox did well was to have clients available on all major platforms.  I could access my Dropbox files from OS X, MS-Win, iOS, Android and Linux.  When I switched to TrueCrypt, I was limited to PC, Linux and Mac only (and one at a time at that), no mobile/tablet access.
  • Password management.  I won't say much about this other than it became harder using Truecrypt.
That was last year.  One of the great things about tech is that problems that need solving tend to get solved if you're patient enough.
Enter Boxcryptor for file security and improvements to 1Password for password management.
While there are a number of solutions available to encrypt what you store in Dropbox, I consolidated onto Boxcryptor:
  • Secure.  Uses AES-256.  There's no cloud aspect to Boxcryptor, so no third party has my master key and can take a peek at my data.
  • Plays nice with Dropbox.  Boxcryptor uses a folder+file structure (aka a "package" on OS X) with each file encrypted separately, enabling Dropbox to sync efficiently.
  • Multi-platform access.  Working clients on all major OSs.  At least read access on iOS and Android.
  • Stable.  I've not had a single crash or corruption yet (although I'm still backing up more frequently than I might otherwise).
  • No major delays in supporting the major OS upgrades.
  • It allows for up to 2GB for free and more if you license it.  2GB is a lot.  Once I got comfortable with it I bought a license to get rid of the 2GB restriction.  I feel the license is a nominal cost versus the upside of more user friendly security and vendor support.
I considered Datalocker, Cloudfogger, Hyperdrive, and encrypted zip files.  All of them failed in one or more of the above.
An aside on Dropbox and sharing files: using Boxcryptor, I don't retain Dropbox's easy sharing of (encrypted) files.  Encrypted zip files are still a perfectly acceptable and secure way to, e.g., share a single file in Dropbox with colleagues, so long as you unzip into a secure location and not into Dropbox.  You then have to zip+encrypt and move the result back into the shared folder in Dropbox.  Zip-file usability compared to regular Dropbox sharing and syncing is therefore poor.  Note that today Boxcryptor doesn't appear to (easily) support multiple concurrently-open Boxcryptor filesystems.  When it does, I could see having a Boxcryptor filesystem dedicated to sharing a set of folders/files with a specific workgroup - each group with its own Boxcryptor filesystem.  Still somewhat painful, but better than zip files.
Moving on to password management.  I have to admit my previous method wasn't overly secure, and TrueCrypt certainly decreased its usability.  As I was digging into secure storage, I also had a hunt around for how to improve password management.
Enter 1Password.  Yes, it's been around a while, but it used to be very OS X centric.  I don't know when they went multi-platform, but they have.  While they've been the premium (i.e. expensive!) choice for OS X password management for a while, the lack of support for other platforms had always been a showstopper for me.
Here is the thinking that led me to 1Password:
  • Multi-platform: MS-Win, OS X, iOS, Android.  It's not on Linux, but I don't use a Linux desktop for the 1Password primary use case anyway.
  • Secure.  While I can't keep 1Password's database in Boxcryptor's filesystem (I could, but I lose mobile/tablet access), the 1Password security approach is fine.  My passwords don't go to another third party password service to maintain them.  While Dropbox has my password files, they are encrypted.
  • Plays nice with Dropbox.  The 1Password DB is also a folder+file (package) structure, just like Boxcryptor.  As a result, Dropbox syncing works well.
  • Well supported browser plugins.  I use Chrome and Safari and both are well supported.  Support isn't quite so good on mobile/tablet platforms, but it's better than what I had before.
  • Widely used.  The tech community seems to widely use it.  While not a particularly scientific measure, it seems to be on its way to being a "best practice" solution in my peer group.
I've now deployed 1Password's database into Dropbox.  It'll take me awhile to load all my credentials into 1Password but I think it's a durable investment.
One downside is that 1Password isn't overly cheap.  You have to pay for licenses for each platform (Android still free).  However, just like with Boxcryptor, I think it's worth the cost for the stability, support, and commitment to keep up with OS changes.
I did have a serious look at and play with Keepass for password management.  I like that it's free and opensource.  I liked aspects of its design and usability.  However, there were a few factors that put me off:
  • Fiddly.  There are two different and somewhat competing database and application tracks, 1.x and 2.x.  Both are under active development.  There are various "unofficial" platform ports of each track to various OSs.  You have to pay attention to what version you use on e.g., OS X to make sure it's compatible with the version you use on iOS.  
  • Not keeping up with OS upgrades.  The main OS X port indicated support for OS X 10.6 as most recent and today OS X is at 10.8.  I don't want to be the beta tester for new Keepass releases - what I'm securing is too critical to mess about with.
  • The Keepass database is a single file, meaning that like with TrueCrypt you might have to deal with Dropbox sync collisions.
As a result, I'm an even happier Dropbox user now that I have secured files and passwords and reasonable usability to access both.  All in, the licenses across all the platforms for both Boxcryptor and 1Password cost me about $125 (£80).  Yes, this is a lot, but conversely I now feel like I have the best of both worlds - the convenience of Dropbox and the comfort of strong security where it's needed.

04 September 2011

iPhoto and durable photo management


When managing your digital photos, there are three things you really should do:
  • Make backups of your backups of your backups.  These are your photos, don't mess about here - make backups regularly and store one of your backups someplace remote.
  • Use JPG for your file formats.  If you end up with a camera saving in some goofy format either convert to JPG or get a new camera.  JPG is like mp3 is for music - it'll be a durable photo format that will be around for a very long time and is supported by lots of tools.
  • Don't use software to organize your photos for you.  Use a simple directory format.  Software and their proprietary organization approaches will come and go but simple directories and folders will be around for a long time to come.  The following blog article is on this last point.
In 1999 when I bought my HP Photosmart C30 I started organizing my photos in a simple directory structure: Photos/YYYY/YYYYMMDD.  Something like this:


Over the years I would try whatever photo management software came with new cameras or emerging opensource packages.  I always regretted it, as the photos would end up being hidden inside of some database and difficult to extract.  I would end up maintaining photos in two locations - the photo software and my simple archive.  That is, until I would get rid of the software and proprietary database files.

All that changed when I drank the Apple, iPhone and iPhoto koolaid when the iPhone came out.  The iPhoto software hits the magic "good enough" point Apple does so well and the integration with the iPhone Just Worked.  For the first few months I maintained iPhone photos in two locations - iPhoto and my simple archive.  But as I started using iPhoto more and more, using it to maintain photo albums, I just got lazy and one day stopped copying files from iPhoto into the archive.

I did continue to use my simple archive for regular camera photos, but of course over time I started using the phone more and more for photos and conventional cameras only for special occasions like holidays.  I would pull other camera pictures from my simple archive into iPhoto to create albums.

I kept telling myself I could extract the photos any time - you can enter the iPhoto database structure as a filesystem (~/Pictures/iPhoto Library - "Show Package Contents"), similar to my simple archive approach.  iPhoto's File->Export function was there of course, but it would not properly set the file modified date, meaning it was more difficult to get files into the right folder without looking at header data in the JPG.  Meanwhile, time goes on, and nothing lasts forever, maybe not even iPhoto.

That brings us to today and here's my strategy:  I've accepted that iPhoto will hold the "master" version of all iPhone (and, more recently, Samsung Galaxy S2) photos, and I'll occasionally extract these photos from iPhoto and fold them into my simple archives.

Here is how I did it:

1. Within iPhoto, Command-F search all your Photos for "iPhone".  Assuming you haven't otherwise marked your photos (title, tags, event names, ...) with "iPhone", iPhoto gives you a subset of your Photos created with the iPhone.

2. Within iPhoto, use File-Export to save these photos to the filesystem.  They'll be saved as one big set in whatever directory you specify.  I named the files with prefix "iphone" and selected the option to sequentially number them.  That way I can always go searching in my simple archive for iPhone photos by filename if I need to.

3. Use MacPorts' "port" command to install jhead.  Run "jhead -ft *.JPG *.jpg" at the shell to correct all the modified dates, so your files are date/time stamped with the date/time the picture was taken.  I also had a handful of .PNG (screen captures) and .MOV (movies) files, and I just left their dates as they were.  Probably wrong, but I only had a few.

4.  I wrote a small shell script to organize the extracted pictures by date to match my simple archive format.  Here it is:


# orgpix.sh - organize iPhoto exported photos by date 
# Files will be moved to into a directory structure like this: 
#
# YYYY 
#    YYYYMMDD
#    YYYYMMDD 
#    ... 
# YYYY 
# ... 

# Place this file in the directory containing all the files you want 
# to organize and run it from there. 


ls -1 *.jpg *.JPG 2> /dev/null | while read fn; do
   # use the modification time, which "jhead -ft" set to the date taken
   set -- `stat -f "%Sm" -t "%Y %m %d" "$fn"`
   year=$1
   month=$2
   day=$3
   echo "$fn : $year $month $day"
   mkdir -p "${year}/${year}${month}${day}"
   mv "$fn" "${year}/${year}${month}${day}"
done

5. Used cp to merge the newly organized files and directories into my main photo archive:

cp -Rvpn 2008/ /Path/to/Photo/Archives/2008

(I was tentative.  I ran it for each year by hand as I wanted to make sure it was working properly by checking a few file and directory creations as I went.  This could be scripted as well.)
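The per-year copies can indeed be sketched as a small loop; this is just one way to do it, and the archive path below is the same placeholder path as in the cp command above.

```shell
# Merge each newly organized year directory into the main archive.
# -R recurse, -v verbose, -p preserve attributes, -n never overwrite.
for year in [12][0-9][0-9][0-9]; do
   [ -d "$year" ] || continue
   cp -Rvpn "$year/" "/Path/to/Photo/Archives/$year"
done
```

Because of the -n flag, existing archive files are never overwritten, so re-running the loop is safe.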

So there you go.  All photos taken by the iPhone extracted from iPhoto and merged into my simple but very durable photo archives structure.

I repeated the above to extract my Samsung Galaxy s2 (Command-F search for "GT-I9100") photos as well.

Postnote: If you can't subset out the photos taken by the iPhone within iPhoto, you could just extract everything and use jhead's EXIF header reading to determine what camera took a photo.

07 June 2010

Enabling GNUPG (PGP) with Apple OS X mail.app

(Postnote 2011-03-05: Don't waste your time on the below.  Just go directly to gpgtools mail, read the instructions, and get on with it.  It's been updated to work with OS X 10.6 and Mail 4.4.  Just tested it, works great.)

I am so not an expert on PGP, GNUPG (GNU Privacy Guard) or OS X's mail.app.  But what I can do is explain how I got the basics of PGP working with Mac mail and some resources that helped.

If you don't know anything about PGP or want more detail, see "Learn More" section at the end of this post.

The following worked for Mac OS X 10.6.3 and mail.app 4.2.

1. Install GNU's Privacy Guard (gnupg).

You need to have Macports installed.  Install it if you don't have it.

sudo port install gnupg

2. Generate your encryption key.

gpg --gen-key

Here are the options I used:

1. Option 2: DSA and Elgamal
2. Keysize: 3072 (that was the biggest key size offered)
3. 0, key does not expire
4. Key identification
Real name: Jeff Blogs
email address: jeffblogs@dodgymail.com
No comment
5. Passphrase "something memorable yet complicated and long, don't share it with anyone, and don't forget it"


Your ~/.gnupg directory of configuration and databases gets set up.

3. Install the magic mail.app bundle

The bundle contains a version of GPGMail that works with OS X 10.6.3.

Exit mail.app.

mkdir ~/Library/Mail/Bundles  # if it doesn't exist already - mine didn't

Be thankful for clever, helpful and giving people, and download the bundle.

Extract from zip download and deposit GPGMail.mailbundle into ~/Library/Mail/Bundles

From the command line as the user you run mail with (not root!):

defaults write com.apple.mail EnableBundles -bool true
defaults write com.apple.mail BundleCompatibilityVersion 3


Start mail.app.

You should now have a PGP option in your mail menu (Message->PGP).

Mail.app menu with new PGP option

You should also see a PGP toolbar when you create a new email:

New PGP toolbar appears when composing a new email

(This step was the silver bullet from macrumors.com forum with an updated GPGMail from Lukas Pitschl - thank you!)

4. Create your public key.

From command line:

gpg --armor --output "Jeff Blogs.asc" --export jeffblogs@dodgymail.com

You'll need to send people your public key if you want them to send encrypted email back to you.

5. Add other people's public keys

gpg --import "Ronald McDonald.asc"

At this point you should now be able to send and receive PGP encrypted emails and mail.app will be reasonably supportive of you.

I found regularly restarting mail.app is useful when fiddling with gpg at the command line.

6. Set yourself up with a verified key service.  This will decrease warnings from mail and GNUPG.

Set yourself up with pgp.com.

Use the name and email address you used to generate your key in step 2 above.

Add the verified key service key (the Global Directory Key, downloaded from keyserver1.pgp.com):
gpg --import GlobalDirectoryKey.asc

Let GNUPG know about the pgp.com key server.  Edit ~/.gnupg/gpg.conf and uncomment "keyserver ldap://keyserver.pgp.com" line.

(You're restarting mail.app between these steps right?)

7. Learn more!

These were helpful to the above:
These might have been helpful if they weren't so long, complicated, and out of date, if they had actually worked, and if I hadn't already had a basic idea of how PGP was supposed to work:
And of course GPGMail itself, which doesn't work with current versions of Snow Leopard and mail.app.

-----

2010-06-19 Postnote: The latest OS X upgrade to Mail 4.3 disabled gpgmail.  Two things to fix this:

1. Copy GPGMail.mailbundle from "~/Library/Mail/Bundles (Disabled)" to ~/Library/Mail/Bundles

2. Enter the GPGMail.mailbundle directory and add two new UUIDs to Info.plist in the "SupportedPluginCompatibilityUUIDs" section:


E71BD599-351A-42C5-9B63-EA5C47F7CE8E
B842F7D0-4D81-4DDF-A672-129CA5B32D57
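For orientation, the two UUID lines end up inside the existing array in Info.plist, so the edited section looks roughly like this (the existing entries above the new ones are left untouched):

```xml
<key>SupportedPluginCompatibilityUUIDs</key>
<array>
    ...
    <string>E71BD599-351A-42C5-9B63-EA5C47F7CE8E</string>
    <string>B842F7D0-4D81-4DDF-A672-129CA5B32D57</string>
</array>
```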

And gpgmail is working again.

(As outlined by user Bytes_U on the Apple support forums.)

03 May 2010

IT Hotsite Best Practices

Introduction

A "hotsite" is a general term for unplanned downtime - a failing site, product, or feature that is having significant impact on revenue generation.  A problem is escalated to hotsite level when significant numbers of (potential) customers are affected and the business's ability to earn money is significantly impaired.  Hotsite handling may or may not be used if the problem is not under the direct control of the team controlling a set of systems (e.g., a critical feature the systems depend on is provided by a remote supplier, such as a web service being used by a mashup).

Hotsites happen.  Costs increase without bound as you push your system design and management toward 100% uptime.  You can aspire to 100% uptime, but it's foolish to guarantee it (e.g., in an SLA).  Change can also cause service disruptions.  In general, the less change, the less downtime.  However, it's rarely commercially viable to strongly limit change.

This article isn't about reducing planned or unplanned downtime; it's a collection of tips, tricks, and best practices for managing an unplanned downtime after it has been discovered by someone who can do (or start to do) something about it.  I'll also focus on a new type of downtime, one that the people involved haven't seen before.

General strategy - the management envelope

Early on in a major problem, it's important to separate technically solving the problem from managing the problem into the wider business.  Because an unplanned downtime can be extremely disruptive to a business, keeping people informed about the event is often almost as important as solving it.

Although that may feel like an odd statement, as a business grows there are people throughout the business that are trying to manage risk and mitigate damage caused by the downtime.  Damage control must be managed in parallel with damage elimination.

You want to shelter those that are able to technically solve the problem from those that are hungry for status and are slowing down the problem solving process by "bugging" critical staff for information.  Technical problem solving tends to require deep concentration that is slowed by interruptions.

It is the management envelope's responsibility to:
  • Agree periods of "no interruption" time with the technical staff to work on the problem
  • Shelter the team from people who are asking for updates but not helping to solve the problem
  • Keep the rest of the business updated on a regular basis
  • Set and manage expectations of concerned parties
  • Recognize if no progress is being made and escalate
  • Make sure the escalation procedure (particularly to senior management) is being followed
  • Make sure that problems (not necessarily root cause related) discovered along the way make it into appropriate backlogs and "to-do" lists
General strategy - the shotgun or pass-the-baton

Throughout the event, you have to strike a balance between consuming every possible resource that *might* have a chance to contribute (the "shotgun") versus completely serializing the problem solving to maximize resource efficiency ("pass-the-baton").

Some technologists, particularly suppliers who might have many customers like yourself, may not consider your downtime as critical as you do.  They will only want to be brought in when the problem has been narrowed down to their area and not "waste" their time on helping to collaboratively solve a problem that isn't "their problem".

There is a valid argument here.  It is ultimately better to engage only the "right" staff to solve a problem so that you minimize impact on other deliverables.  Your judgment about who to engage will improve over time as you learn the capabilities of the people you can call on and the nature of your problems.

However, my general belief is that for a 24x7 service like an Internet gambling site that is losing money every second it is down, calling whoever you think you might need to solve the problem is fully justified.  And if you're not sure, err on the shotgun side rather than passing the baton from one person to the next.

General strategy - the information flows and formats

Chat.  We use Skype chat with everyone pulled into a single chat.  Skype's chat is time stamped and allows some large number of participants (25+) in a single chat group.  We spin out side chats and small groups to focus on specific areas as the big group chat can become too "noisy", although it's still useful to log information.  It gives us a version history to help make sure change management doesn't spin out of control.  We paste in output from commands and note events and discoveries.  Everything is time threaded together.

The management envelope or technical lead should maintain a separate summary of the problem (e.g., in a text editor) that evolves as understanding of the problem/solution evolves.  This summary can be easily copy/pasted into chat to bring new chat joiners up to speed, keep the wider problem solving team synchronized, and be used as source material for periodic business communications.

Extract event highlights as you go.  It's a lot easier to extract key points as you go than to comb through hours of chat dialogue afterwards.

Make sure to copy/paste all chat dialogues into an archive.

Email.   Email is used to keep a wider audience updated about the event so they can better manage into partners and (potential) customers.  Send out an email to an internal email distribution list at least every hour or when a breakthrough is made.  Manage email recipients' expectations - note whether there will be further emails on the event or whether this is the last email of the event.

The emails should always lead off with a non-technical summary/update.  Technical details are fine, but put them at the end of the message.

At a minimum, send out a broad distribution email when:
  • The problem is first identified as a likely systemic and real problem (not just a one-off for a specific customer or fluke event). Send out whatever you know about the problem at that time to give the business as much notice as possible. Don't delay sending this message while research is conducted or a solution is created.
  • Significant information is discovered or fixes created over the course of the event
  • Any changes are made in production to address the problem that may affect users or customers
  • More than an hour goes by since the last update and nothing has otherwise progressed (anxiety control)
  • At the end of a hotsite event, covering the non-tech details on root cause, solution, and impact (downtime duration, affected systems, customer-facing effects)
Chain related emails together over time.  Each time you send out a broad email update, send it as a Reply-All to your previous email on the event.  This gives newcomers a connected high-level view of what has happened without having to wade through a number of separate emails.

Phone.  Agree a management escalation process.  Key stakeholders ("The Boss") may warrant a phone call to update them.  If anyone can't be reached quickly by email and help is needed, they get called.  Keep key phone numbers with you in a format that doesn't require a network/internet connection.  A runbook with supplier support numbers on the share drive with a down network or power failure isn't very useful.

The early stage

Potential hotsite problems typically come from a monitor/alert system or customer services reporting customer problems. Product owners/operators or members of a QA team (those with deep user-level systems knowledge) may be brought in to make a further assessment on the scope and magnitude of the problem to see if hotsite escalation is warranted.

Regardless, at some point the first line of IT support is contacted.  These people tend to be more junior and make the best call they can on whether the problem is a Big Deal or not.  This is a triage process, and it is critical in determining how much impact the problem is going to make on a group of people.  Sometimes, a manager is engaged to make the call on whether to escalate an issue to hotsite status. Escalating a problem to this level is expensive as it engages a lot of resources around the business and takes away from on-going work. Therefore, a fair amount of certainty that an issue is critical should be reached before the problem is escalated to hotsite level.  The first line gets better at this escalation with practice and retrospective consideration of how the event was handled.

Once the event is determined to be a hotsite, a hotsite "management envelope" is identified.  The first line IT support may very well hand all problem management and communications off to the management envelope while the support person joins the technology team trying to solve the problem.

All relevant communications now shift to the management envelope.  The envelope is responsible for all non-technical decisions that are made.  Depending on their skills, they may also pick up responsibility for making technical decisions as well (e.g., approving a change proposal that will/should fix the problem). The envelope may change over time, and who the current owner and decision maker is should be kept clear with all parties involved.

The technical leader working to solve the problem may shift over time as possible technical causes and proposed solutions are investigated.  Depending on the size and complexity of the problem, the technical leader and management envelope will likely be two different people.

Holding pages.  Most companies have a way to at least put up "maintenance" pages ("sorry server") to hide failing services/pages/sites.  Sometimes these blanket holding pages can be activated by your upstream ISP - ideal if the edge of your network or web server layer is down.  Even better is being able to "turn off" functional areas of your site/service (e.g., specific games, specific payment gateways) in a graceful way such that the overall system can be kept available to customers while only the affected parts of the site/service are hidden behind the holding pages.

Holding pages are a good way to give yourself "breathing room" to work on a problem without exposing the customer to HTTP 404 errors or (intermittently) failing pages/services.

Towards a solution

Don't get caught up in what systemic improvements you need to do in the future.  When the hotsite is happening, focus on bringing production back online and just note/table the "what we need to do in the future" on the side.  Do not dwell on these underlying issues and definitely no recriminations.  Focus on solving the problem.

Be very careful of losing version/configuration control.  Any in-flight changes to stop/start services or anything created at a filesystem level (e.g., log extract) should be captured in the chat.  Changes of state and configuration should be approved in the chat by the hotsite owner (either the hotsite tech lead or the management envelope).  Generally agree within the team where in-flight artifacts can be created (e.g., /tmp) and naming conventions (e.g., name-date directory under /tmp as a scratchpad for an engineer).

All service changes up/down and all config file changes or deployment of new files/codes should be debated, then documented, communicated, reviewed, tested and/or agreed before execution.

Solving the problem

At some point there will be an "ah-ha" moment where a problem is found or a "things are looking good now" observation - you've got a workable solution and there is light at the end of the tunnel.

Maintaining production systems configuration control is critical during a hotsite. It can be tempting to whack changes into production to "quickly" solve a problem without fully understanding the impact of the change or testing it in staging.  Don't do it.  Losing control of configuration in a complex 24x7 environment is the surest way to lead to full and potentially unrecoverable system failure.

While it may seem painful at the time, quickly document the change and communicate it in the chat or email to the parties that can intelligently contribute to it or at least review it.  This peer review is critical in helping to prevent making a problem worse, especially if it's late at night trying to problem solve on little or no sleep.

Ideally you'll be able to test the change out in a staging environment prior to live application.  You may want to invoke your QA team to health check around the change area on staging prior to live application.

Regardless, you're then ready to apply the change to production.  It's appropriate to have the management envelope sign off on the fix - certainly someone other than the person who discovered and/or created the fix must consider overall risk management.

You might decide to briefly hold off on the fix in order to gather more information to help really find a root cause.  It is sometimes the case that a restart will likely "solve" the problem in the immediate term, even though the server may fail again in a few days.  For recurring problems the time you spend working behind the scenes to identify a more systemic long term fix should increase with each failure.

In some circumstances (tired team, over a weekend) it might be better to shut down aspects of the system rather than fix it (apply changes) to avoid the risk of increasing systems problems.

Regardless, the step taken to "solve" the problem and when to apply it should be a management decision, taking revenue, risk, and short/long term thinking into account.

Tidying up the hotsite event

The change documentation should be wrapped up inside your normal change process and put in your common change documentation archive.  It's important you do this before you end the hotsite event in case there are knock-on problems a few hours later.  A potentially new group of people may get involved, and they need to know what you've done and where they can find the changes made.

Some time later

While it may be a day or two later, any time you have an unplanned event, as IT you owe the business a follow-up summary of the problem, effects and solution.

When putting together the root cause analysis, keep asking "Why?" until you bottom out.  The answers may become non-technical in nature and become commercial, and that's ok.  Regardless, don't be like the airlines - "This flight was late departing because the aircraft arrived late.".  That's a pretty weak excuse for why the flight is running late.

Sometimes a root cause is never found.  Maybe during the event you eventually just restarted services or systems and everything came back up normally.  You can't find any smoking gun in any of the logs.  You have to make a judgment call on how much you invest in root cause analysis before you let go and close the event.

Other times the solution simply isn't commercially viable.  Your revenues may not warrant a super-resilient architecture or highly expensive consultants to significantly improve your products and services.  Such a cost-benefit review should be in your final summary as well.

At minimum, if you've not solved the problem hopefully you've found a new condition or KPI to monitor/alert on, you've started graphing it, and you're in a better position to react next time it triggers.

A few more tips

Often a problem is found that is the direct responsibility of one of your staff.  They messed up.  Under no circumstances should criticism be delivered during the hotsite event.  You have to create an environment where people are freely talking about their mistakes in order to effectively get the problem solved.  Tackle sustained performance problems at a different time.

As more and more systems and owners/suppliers are interconnected, the shotgun approach struggles to scale as the "noise" in the common chat increases proportional to the number of people involved.  Although it creates more coordination work, side chats are useful to limit the noise, bringing in just those you need to work on a sub-problem.

Google Wave looks like a promising way to partition discussions while still maintaining an overall problem collaboration document.  Unfortunately, while it's easy to insist all participants use Skype (many use it anyway), it's harder with Wave, which few people have used and for which many don't even have an account or invite available.

Senior leadership should reinforce that anyone (Anyone!  Not just Tech) in the business may be called in to help out with a hotsite event.  This makes the intact team working on the hotsite fearless about who they're willing to call for help at 3am.

Depending on the nature of your problem, don't hesitate to call your ISP.  This is especially true if you have a product that is sensitive to transient upstream interruptions or changes in the network.  A wave of TCP resets may cause all kinds of seemingly unrelated problems with your application.

Conclusion

Sooner or later your technical operation is going to deal with unplanned downtime.  Data centres aren't immune to natural disasters and regardless, their fault tolerance and verification may be no more regular than yours.

When a hotsite event does happen, chances are you're not prepared to deal with it.  By definition, a hotsite is not "business as usual" so you're not very "practiced" in dealing with them.  Although planning and regular failover and backup verification is a very good idea, no amount of planning and dry runs will enable you to deal with all possible events.

When a hotsite kicks off, pull in whoever you might need to solve the problem.  While you may be putting a spanner into tomorrow's delivery plans, it's better to err on the shotgun (versus pass-the-baton) side of resource allocation to reduce downtime and really solve the underlying problems.

And throughout the whole event, remember that talking about the event is almost as important as solving the event, especially for bigger businesses.  The wider team wants to know what's going on and how they can help - make sure they're enabled to do so.

Using MobileMe's iDisk as an interim backup while traveling

Introduction

I use an Apple laptop hard disk as my primary (master) data storage device.  To provide interim backups while traveling, I use Apple's MobileMe iDisk for network backups to supplement primary backups only available to me when I'm at home.

Having dabbled with iDisk for a few years, I have two key constraints for using iDisk:
  • I don't always have a lot of bandwidth available (e.g., a mobile phone GPRS connection) and I don't want a frequent automatic sync to hog a limited connection.
  • I don't trust MobileMe with primary ownership of data or files.  Several years ago I switched to using the iDisk Documents folder (with local cache) for primary storage but then had several files magically disappear.
I've now evolved to using iDisk as a secondary backup medium.  I manually run these steps when I have plenty of bandwidth available.  There are two steps to this:
  • rsync files/folders from specific primary locations to a named directory under iDisk
  • Sync the iDisk
How to do it

The rsync command I use looks like this:


for fn in Desktop dev Documents Sites; do
   du -sk "/Users/my_username/$fn" | tee -a ~/logs/laptop_name-idisk.rsync.log
   rsync -avE --stats --delete "/Users/my_username/$fn" "/Volumes/my_mobileme_name/laptop_name/Users/my_username" | tee -a ~/logs/laptop_name-idisk.rsync.log
done

The rsync flags in use:

-a         archive (-rlptgoD no -H)
           -r    recursive
           -l    copy symlinks as symlinks
           -p    preserve permissions
           -t    preserve times
           -g    preserve group
           -o    preserve owner
           -D    same as "--devices --specials" (preserve device and special files)
-v         verbose
-E         preserve extended attributes
--stats    detailed info on sync
--delete   remove destination files not in source


Explanation:
  • I'm targeting specific locations that I want to backup that aren't overly big but tend to change frequently (in this case several folders from my home directory: Desktop, dev, Documents, Sites)
  • A basic log is maintained, including the size of what is being backed up (the "du" command)
  • I use rsync rather than copy because rsync is quite efficient - it generally only copies the differences, not the whole fileset.
  • The naming approach on the iDisk organizes backups by laptop name, allowing me to keep discrete backup collections over time.  My old laptop's backups sit beside my current laptop's backups.
  • The naming approach also means I don't use any of the default directories supplied by iDisk as I'm not confident that Apple won't monkey with them.
  • ~/Library/Mail is a high change area but not backed up here (see below for why)
The rsync updates the local iDisk cache.  Once the rsync is complete (after the first run, I find subsequent rsyncs take less than 10 seconds), manually kick off an iDisk network sync (e.g., via a Finder window, clicking the icon next to iDisk).

An additional benefit to having a network backup of my important files and folders is that I can view and/or edit these files from the web, iPhone, or PC.  I find that being able to access email/IMAP from alternative locations is the most useful feature, but I have had minor benefit from accessing files as well when my laptop was unavailable or inconvenient to access (e.g., a quick check of a contract term in the back of a taxi on an iPhone).

Other Backups

I have two other forms of backups:
  • Irregular use of Time Machine to a Time Capsule, typically once a week if my travel schedule permits.
  • MobileMe's IMAP for all email filing (and IMAP generally for all email).
Basically, if I'm traveling, I rely on rsync/iDisk and IMAP for backups.  I also have the ability to recover a whole machine from a fairly recent Time Machine backup.

Success Story

In June 2009 I lost my laptop HDD on a return flight home after 2 weeks of travel.  I had a Time Machine backup from right before I'd left on travel, and occasional iDisk rsyncs while traveling.

Once I got home I found an older HDD of sufficient size and restored from the Time Machine image from the Time Capsule.  This gave me a system that was just over 2 weeks "behind".  Once IMAP synchronized my mailboxes, that only left a few documents missing that I'd created while traveling.  Luckily I'd run an rsync and iDisk sync right before my return flight, so once I'd restored those, I'd recovered everything I'd worked on over the two weeks of travel, missing only some IMAP filing I'd done on the plane.

Weakness

The primary flaw in my approach is that you have to have the discipline to remember to manually kick off the rsync and iDisk sync after you've made changes you don't want to lose.  I certainly don't always remember to run it, nor do I always have a good Internet connection available to enable it.  However, remembering sometimes is still better than having no recent backup at all.

Alternative Approaches

An obvious alternative is to use the MobileMe Backup program that is preloaded onto your iDisk under the Software/Backup directory.  Using this tool, you should be able to perform a similar type of backup to what I've done here.  I've not tried it, as it was considered buggy back when I first started using iDisk for network backups.  I'll likely try it eventually and may switch to it if it works well.

A viable alternative approach is to carry around a portable external hard drive, and make Time Machine backups to it more frequently than you would otherwise do over the network via iDisk.  You could basically keep a complete system image relatively up-to-date if you do this.  More hassle, but lower risk and easier recovery if your primary HDD fails.  However, if you get your laptop bag and external HDD stolen, you'll be worse off.

While on holiday recently, I was clearing images off of camera SD card memory as it filled up.  I put these images both on the laptop HDD and an external HDD.  This protects me from laptop HDD failure, but wouldn't help if both the laptop and external HDD were stolen.

iDisk Comparison to DropBox

DropBox is a popular alternative to iDisk.  I find DropBox to be better at quickly and selectively sharing files, it has better cross-platform support (particularly with a basic Android client), and its sync algorithm seems to work better than the iDisk equivalent.  You could certainly do everything described here with DropBox.

The downside with DropBox is having to pay $120 per year for 50GB of storage versus $60-100 per year ($60 on promotion, e.g., with a new Apple laptop; otherwise $100) for 20GB of storage with MobileMe.  I find 20GB to be plenty for IMAP, iDisk and photos providing I filter out big auto-generated emailed business reports (store on laptop disk not in IMAP), and only upload small edited sets of photos.  I'll probably exhaust the 20GB in 2-3 more years at my current pace, but I'd expect Apple to increase the minimum by the time I would otherwise be running out of space.

MobileMe is of course more than just iDisk, so if you use more of its features, it increases in value relative to DropBox.

Both iDisk and DropBox are usable choices; the differences are not sufficiently material to strongly argue for one over the other.  I have seen iDisk improve over the last few years and I'd expect Apple to eventually catch up with DropBox.

Conclusion

While I'm not confident in using MobileMe's iDisk as a primary storage location, I have found it useful as a network backup.  Combined with normal backups using Time Machine and Time Capsule, it provides a high-confidence recovery from damaged or lost primary use laptops.

21 March 2010

Using wget to ask jspwiki to re-index its search DB

For whatever reason, our installation of jspwiki (v2.8.2) decides to ignore or lose pages out of its index (hey, what do you want for free?!).  With our jspwiki hitting 2000 pages, search is the main tool to find pages.  Unfortunately, I've taken to keeping my own links page to important pages just so I don't lose them, as the search indexing seems to break regularly.  A re-index solves the problem, but it requires going into the site, authenticating, and clicking a button - way too much work.

Here is a quick trick that uses wget to log in to jspwiki and force a re-indexing of pages:

# POST to log in and get login and session cookies
wget --verbose --save-cookies=cookie --keep-session-cookies --post-data="j_username=myuid&j_password=mypw&redirect=Main&submitlogin=Login" "http://wiki.mydomain.com/JSPWiki/Login.jsp" --output-document "MainPostLogin.html"

# POST to kick off reindexing using cookies
wget --verbose --load-cookies=cookie --post-data="tab-admin=core&tab-core=Search+manager&bean=com.ecyrd.jspwiki.ui.admin.beans.SearchManagerBean&searchmanagerbean-reload=Force+index+reload" --output-document "PostFromForceIndexReload.html"  "http://wiki.mydomain.com/JSPWiki/admin/Admin.jsp"


Tweak myuid, mypw, and wiki.mydomain.com in the above to have them be what you need.  Drop the output once you're comfortable it's working (I was saving it in the above to make sure I could see artifacts of being authenticated in the output).

Put the above into a cron'ed script and run it hourly.
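For example, a crontab entry along these lines would do it (the script path and log location here are hypothetical):

```shell
# Hypothetical crontab entry: run the jspwiki reindex script at the top of every hour
0 * * * * /home/myuid/bin/jspwiki-reindex.sh >> /home/myuid/jspwiki-reindex.log 2>&1
```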

Note that not all versions of wget are created equal: 1.10 didn't seem to work, but 1.10.2 and 1.12 worked fine for the above.

21 February 2010

Internet Gambling Jobs in Gibraltar

Gibraltar has plenty of on-line gambling companies and there is almost always some form of related recruitment going on.

Whether you want to make a fresh start in Southern Spain or just got made redundant, the following are some good starting points if you're interested in working in Gib.  While I've got an IT bias, none of the companies below specialize only in IT.

(Please note I'm not affiliated with any of the companies listed below although I've talked with all of them over the years.)

Gibraltar Local Recruiters

Quad has been in Gibraltar for a long time now (at least in dog years), lists Gibraltar online gambling jobs, and has plenty of information about Gib and Spain on their site.

Ambient has been around for a fair amount of time as well.  They are not Gibraltar based, but close enough (up the Costa).

SRG has just opened up their office next door in Europort in Gib.  Their website covers some basics about Gib like living in Gib/Spain and local Income Tax.

Other Recruiters Operating into Gibraltar

There are plenty of other recruiting companies outside of Gib (typically UK) that operate into Gib.  They come and go, and the recruiters themselves change over time.

There are also a variety of headhunters that typically work other sectors that come and go.

There are two companies that have been around for a long time that have done plenty of work for Gib based companies:  BettingJobs.com and Pentasia.  Both of these companies place world-wide but you can find Gibraltar jobs on their sites as well.

Other Job Sources for Gibraltar

It's traditional (but not cheap!) to post jobs in The Gibraltar Chronicle newspaper on Fridays.  Yes, this is an actual paper newspaper, just like the ones Grandpa used to read.  They don't cross-post jobs to their website.

I keep an eye on jobserve using category IT and keyword "Gibraltar" to create an RSS feed to see what my IT colleagues around Gib are up to.

The GRA (Gibraltar Regulatory Authority) thoughtfully provides a summary page of all "remote gambling" operators with Gib licenses.

Wildcards

I've not personally worked with the following, but they at least had a few listings for Gib or along the Costa.  YMMV.

gibraltarportal.com lists a few local jobs.

surinenglish.com delegates its recruitment to myservicesdirectory.com; it's very Spain-oriented, with not much of interest on the Gib and IT side.

I'm not familiar with Andalucia Technology Recruitment.  I've not seen anything for Gib on their site, but they do have a few IT roles along the Costa.

Bits and Pieces

There is an Excel sheet you can download from the Gib government site to calculate your potential income tax.  With the same salary, you'll typically be better off in Gib than most other European countries.

Other starting points

EGR has published a short list of nominees for their 2010 igaming awards, it's also a good source of companies to look at, although certainly not limited to just Gibraltar.

03 January 2010

Using wget and google to download and translate websites

There is a website for the neighborhood I live in that is all in Spanish (cuartonparque.com).  So that's useful if your Spanish is good, which mine isn't.  Google's translate function is great, but I wanted an archive of the site both in Spanish and English in case the site disappeared or was substantially altered.

wget is a great command line *nix utility to recursively download a website providing the links are statically constructed.  I use wget on OS X (install xcode and macports to enable installation of wget if you don't have it).

For cuartonparque.com, the wget command is straight-forward and well documented.  The site uses simple static links and only has a few levels of linking. To download the site, I used:

wget -rpkv -e "robots=off" 'http://cuartonparque.com' 2>&1 | tee cuartonparque.com.wget.log

This command creates a cuartonparque.com directory with a browsable website.
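For reference, the flags used in the command above break down as follows (see the wget manual for details):

```shell
# wget -rpkv -e "robots=off" 'http://cuartonparque.com'
#   -r               recurse into links found on each downloaded page
#   -p               fetch page requisites (images, CSS) needed to render pages offline
#   -k               convert links in downloaded files to point at the local copies
#   -v               verbose output
#   -e "robots=off"  run the "robots=off" command, ignoring robots.txt during the crawl
```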

To download a translate.google.com version of the site was trickier.  Although various googled pages helped a bit, I couldn't find an example that actually worked.  After some hacking about, I uncovered the required tricks to make this work:
  • Google appears to only process requests from browsers it's familiar with (use -U Mozilla)
  • Google uses frames and changes its domain name a bit as it translates (find out the final URL of interest by digging around in the page source)
  • Safari really likes a .html extension on files it opens (use --html-extension)
My pain is your gain.  Here is the wget command that downloads the translated version of the website:

wget -rpkv -e "robots=off" -U Mozilla --html-extension 'http://translate.googleusercontent.com/translate_c?hl=en&sl=es&tl=en&u=http://cuartonparque.com/&rurl=translate.google.com&twu=1&usg=ALkJrhjabXZlzJpBCZeWpsmLaKss09lCuQ' 2>&1 | tee -a cuartonparque.com.En.wget.log

wget creates a translate.googleusercontent.com directory with a browsable website, localized from Spanish to English with a horrific URL for the index.html page:

file:///Users/xyz/Downloads/Web%20Sites/cuartonparque.com.En.Google.Trans/translate.googleusercontent.com/translate_c%3Fhl=en&sl=es&tl=en&u=http:%252F%252Fcuartonparque.com%252F&rurl=translate.google.com&twu=1&usg=ALkJrhjabXZlzJpBCZeWpsmLaKss09lCuQ.html

A quick browse around on the downloaded version suggests everything came through, nicely translated to English with wholly-formed pages.  Enjoy!