How to Scale Stuff, Part 1: Don’t Use Stuff You Don’t Need

This first story occurs, believe it or not, in the mid-1970s.

At my High School, we were fortunate enough to have a Data General Nova minicomputer. (Unlike other cosmically named computers, the Novas were actually very cool. They had this thing called “hardware indirect addressing”, which meant that you could flag any address as “indirect”, and the hardware would then treat the word at that address as yet another address to follow. This made calls and parameter passing blazingly fast. The only downside was that if you made an indirect address refer to itself, the whole machine would lock up :-) )

Anyway, my project/assignment was to write a test grading program. This program would take in marksense cards (which were like punch cards, except you filled in little dots with a pencil instead of punching holes with a keypunch machine), compare the answers to a key, and print all kinds of reports as a result. Simple, huh?

In those days, I/O (input/output) was really slow. Well, that’s what they told us. As a matter of fact, we didn’t really have any performance tools, so I just took it as a given that I/O was really slow. To get around this perceived problem, I decided that my program had to use multi-tasking. I’d use a sub-task to read in the marksense cards while the main task did the grading. Cool, huh?

Let’s summarize:

  • This was one of oh, the first dozen programs I’d ever written. In my life.
  • I had decided to include a complex, not-well-understood-technology into the project based on a hunch. Most of us haven’t had enough life experience by High School to make good eating decisions, much less fuel hunches about technology.
  • It was in Fortran, which isn’t well suited to multi-tasking today, much less 40 years ago.
  • There was no internet in those days. There was hardly anything online, much less for High Schools. That meant the only source of information was printed manuals.

It will come as no surprise to anyone that the result was a disaster which never worked right in the end. Looking back, it probably wouldn’t have worked anyway, even if I had known what I was doing.

The rule I broke here was using stuff I didn’t need. Put another way, I fell prey to the “bright shiny object”, which in this case was multitasking (God, was I a geek).

Other people call this “premature optimization”, but I think it’s more basic than that. Over and over I see people using technology, or, worse yet, writing their own, when a simpler approach would work as well, or better.

Another example occurred a very few years later, at Macy’s. We were one of the first to deploy IBM’s Electronic Cash Registers (yes, IBM made cash registers), which meant putting a computer in every store, wired to allll the cash registers in the store. Believe it or not, you had to manually configure the software for each store by telling it about each individual register. Being the lowest one on the totem pole (I was still in my teens, mind you), I got that job.

Had I been smart, I would have just banged out the configuration and been done with it. There were only like 10 stores at the time, and, since they had to wire the store for the registers (no ethernet, much less wireless, in those days), we’d only bring a store online once every few months or so. But I wasn’t smart, and decided to write these complex assembler language macros to do the job. This is roughly the equivalent of writing a program in the C++ preprocessor (#define, #ifdef, etc) to generate your C program.

This took me like a week of late nights to get right, when I could have configured every damn store in the division, plus every Woolworth’s and Emporium too, in a day. And, I was the only person that understood what I’d done to boot. Looking back, I was lucky I didn’t get fired, or at least yelled at.

Don’t get me wrong. I also spent thousands of useful hours messing with new technology (in fact, I got my first real job because I’d spent a year in College doing nothing but playing with Bright Shiny Objects) (VTAM and VSAM for you historical geeks), as learning exercises. I only object to this when you’re getting paid to build something, or, worse yet, building something someone else will have to maintain later on.

Let’s contrast this with a present day example. A friend and I are thinking about a new project, and the NoSQL movement caught our eye (also see this excellent round-up of NoSQL solutions). We’ve both been using databases practically since they were invented (I promise to bore you with that later), so the benefits and pitfalls of relational databases are well known to us.

Our first reaction to this was “Wow, cool! Automatic Scaling! Automatic Sharding! Automatic SPOF elimination!”.

Our second reaction was “Uh, er, how do you write reports? Is every Business Development request a new program?” (there’s a business opportunity there for someone).

Ted Dziuba has written a great article on this, “I can’t wait for NoSQL to die”, and many others have weighed in as well.

Anyway, I digress. The question is, does playing with a NoSQL solution like Cassandra qualify as “playing with a bright shiny object”? Well, I think, yes, and no, and I have a compromise.

Yes, because, frankly, what we’re working on is not Google, or Digg, or eBay, yet, and expending time and effort to wrench the architecture into a Cassandra-suitable schema probably isn’t a good use of time at this point.

No, it’s not a waste of time, because if we are successful, then we won’t have to panic later when we do grow, and can focus on other things.

On the other hand, if we spend time now fooling around with Cassandra instead of what we know (SQL), then we’ll be wasting valuable time we should be spending on features.

The compromise is to encapsulate data access so that most of the code base doesn’t care whether we use MySQL, Oracle, Cassandra, or flat files for that matter. Having done a few encapsulations like this before, I can tell you it’s not that hard if you think carefully ahead of time about what you’re trying to do.
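To make that concrete, here is a minimal sketch of what I mean, in Python. Everything in it is made up for illustration: the names UserStore, SqlUserStore, DictUserStore and greet are hypothetical, the "users" table is an assumed schema, and SQLite stands in for MySQL or Oracle simply because it ships with Python. It is one way to do it, not the way.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, Optional


class UserStore(ABC):
    """Hypothetical interface: the only storage API the rest of the code sees."""

    @abstractmethod
    def save(self, user_id: str, name: str) -> None: ...

    @abstractmethod
    def find(self, user_id: str) -> Optional[str]: ...


class SqlUserStore(UserStore):
    """Relational backend (SQLite here, standing in for MySQL or Oracle)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        # Assumed schema, created on first use for the sake of the example.
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS users (id TEXT PRIMARY KEY, name TEXT)"
        )

    def save(self, user_id: str, name: str) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)",
            (user_id, name),
        )
        self._conn.commit()

    def find(self, user_id: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None


class DictUserStore(UserStore):
    """In-memory stand-in for a key-value store (or for flat files)."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}

    def save(self, user_id: str, name: str) -> None:
        self._rows[user_id] = name

    def find(self, user_id: str) -> Optional[str]:
        return self._rows.get(user_id)


def greet(store: UserStore, user_id: str) -> str:
    """Application code: depends on the interface, not on any backend."""
    name = store.find(user_id)
    return f"Hello, {name}!" if name else "Hello, stranger!"


if __name__ == "__main__":
    # The same application code runs against either backend.
    for store in (SqlUserStore(sqlite3.connect(":memory:")), DictUserStore()):
        store.save("u1", "Ada")
        print(type(store).__name__, greet(store, "u1"))
```

The point is that greet never mentions SQL. If a Cassandra-backed store ever becomes worth the trouble, it would be one more class behind the same two methods, and the rest of the code base shouldn’t need to notice.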

So, in summary:

  • Beware of Bright Shiny Objects.
  • Play with Bright Shiny Objects only if your schedule, project, job, and your own common sense are OK with it.
  • Beware of the Bright Shiny Object being harder to use and taking up more time than the just-as-good Dull but workable objects.
  • Use encapsulation where possible to enable the use of Bright Shiny, Dull, or even As-Yet-Undiscovered Objects.
  • Don’t hesitate to play with Bright Shiny Objects as a pure learning exercise.

Thanks for reading!
