Archive for June, 2009

Wolfram Alpha

June 19th, 2009 Chris 1 comment

Over the past year, there seems to have been a rush in the market to label any type of Information Retrieval system a “Google Killer.” First it was Cuil (which wasn’t so cool), a big flop. Then this weirdo came out with Wolfram Alpha, the “Computational Knowledge Engine” – not as much press as Cuil, so not as much flopping. And now we have Bing (aka “Live Search” 2.0 (aka “MSN Search” 2.0)), which had a fantastic debut, slipped back down, and has since been gaining more traction.

“Google Killer” was definitely an appropriate would-be label for Cuil or Bing. But for Wolfram Alpha, not so much. The “weirdo” who envisioned it is actually a man by the name of Stephen Wolfram, who contributed to the study of cellular automata and Universal Turing Machines. [Actually, Wolfram did a lot more than just that - but having studied cellular automata in some depth, it's his work with which I'm most familiar.]

Wolfram Alpha, the Computational Knowledge Engine, is apparently just that. While slinking around online, I came across a link from a woman claiming that every living human being could fit safely on the landmass of the Australian continent, with more than a quarter acre to himself or herself. Provided was a link – to a Wolfram Alpha query. I was blown away when I followed through (out of curiosity only). To screenshot the resulting page would be to do Wolfram Alpha a disservice; you’ve got to follow the link and play around with it on your own.
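The claim is easy to sanity-check by hand, too. Here is a rough sketch of the division; the area and population figures are my own round approximations for 2009 (about 7.7 million square kilometres and about 6.8 billion people), not numbers taken from the Wolfram Alpha result.

```python
# Back-of-the-envelope check of the "quarter acre each" claim.
# Both inputs are rough assumptions, not values from Wolfram Alpha.
AUSTRALIA_AREA_KM2 = 7.7e6       # approximate area of Australia
WORLD_POPULATION = 6.8e9         # approximate world population in 2009
SQUARE_METRES_PER_KM2 = 1_000_000
SQUARE_METRES_PER_ACRE = 4046.8564224


def acres_per_person(area_km2, population):
    """Return how many acres each person gets if the area is split evenly."""
    area_m2 = area_km2 * SQUARE_METRES_PER_KM2
    return area_m2 / population / SQUARE_METRES_PER_ACRE


share = acres_per_person(AUSTRALIA_AREA_KM2, WORLD_POPULATION)
print(f"{share:.3f} acres per person")
```

With those inputs the share comes out a little above a quarter acre, which agrees with the claim.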

Suffice it to say, I will be using Wolfram Alpha again. Soon. A lot.

It knows geography, it knows population statistics, and it can divide. Some would say “big deal.” I think there’s a bit more to it than that. Whatever your feelings, at least it knows the most important answer of all.

[Image: wolfram-alpha]

New Look

June 18th, 2009 Chris 1 comment

I thought that I might update the blog’s styles a bit to go with the new name, and finally found a good match.

Failure

June 17th, 2009 Chris 1 comment

While working today, I set about to expire various passwords to various hosts, accounts, web services, et cetera.  One of the websites we use here is called SiteGround.

They fail.  In my personal or professional use from here on out, I will not be visiting that site again if I can at all avoid it unless they really change their tune in a big way – and here’s why:

[Screenshot: “password too long” error message]

There is absolutely no reason I (as a web developer) should ever limit my users’ ability to come up with long, complex, secure passphrases for my website.  Additionally, I see no reason that, as an administrator, I should be asked by a third party to trim my passwords.  I could understand if there were hard limits involved, mind you:

  • Maximum input length to *MySQL’s PASSWORD(), MD5(), or SHA() functions
  • Limits on the sizes of GET and POST variables

These are two very good reasons for capping the length of any user-submitted data.  But, if you’re familiar with these sorts of things, you know that it’s not likely anyone will be miffed by those caps anyway.  Using a generic (sufficiently random) password generator with all 52 letters (upper case and lower), ten digits (zero through nine inclusive), hyphens and underscores, even a ten-character passphrase is one of more than a quintillion possible combinations (64 to the 10th power, or roughly 1.15 quintillion).  Yes, that’s “quintillion” – that’s a million trillions.  The only thing I can think of that even comes close as an analogy for the ridiculous odds against anyone brute-forcing a 10 character base 64 passphrase is the following quote from the new **Star Trek movie (which, by the way, I highly recommend):

“The notion of transwarp beaming [brute forcing that password] is like hitting a bullet with a smaller bullet while wearing a blindfold, whilst riding a horse.”
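To put a number on those odds, here is a small sketch of the arithmetic. The 64-character alphabet is the one described above; the guessing rate of one billion attempts per second is purely a hypothetical figure I picked for illustration.

```python
import string

# 52 letters + 10 digits + hyphen + underscore = 64 symbols.
ALPHABET = string.ascii_letters + string.digits + "-_"

SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def keyspace(length, alphabet=ALPHABET):
    """Number of distinct passphrases of the given length."""
    return len(alphabet) ** length


def years_to_exhaust(combinations, guesses_per_second):
    """Years needed to try every combination at a constant guessing rate."""
    return combinations / guesses_per_second / SECONDS_PER_YEAR


combos = keyspace(10)
print(f"{combos:,} possible passphrases")

# Hypothetical attacker speed, chosen only for illustration.
print(f"{years_to_exhaust(combos, 1e9):.1f} years to try them all")
```

Every extra character multiplies the total by 64, which is why a cap on length hurts so much more than it might seem to.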

I’m no cryptographer, for sure.  I’m just a lowly web developer – but even I know that, at around 10 or 20 characters (at base 64), you’re far more likely to be the victim of a social hack than you are of someone compromising the password randomly or programmatically (excepting vulnerabilities in the hashing algorithm itself).  But that’s not the point.  The point is that there is no decent justification for capping user password sizes other than in scenarios similar to those listed above.  SiteGround’s code monkeys wasted time imposing a maximum string length and generating the above error message.
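One more reason a cap buys so little: if a site stores a hash rather than the passphrase itself, the stored value is the same size no matter how long the input is. The sketch below assumes Python's standard hashlib and a PBKDF2-SHA256 scheme; that choice is mine for illustration, and I have no idea what SiteGround actually uses.

```python
import hashlib
import os


def hash_passphrase(passphrase, salt, iterations=100_000):
    """Derive a fixed-size digest from a passphrase of any length."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations
    )


salt = os.urandom(16)  # random per-user salt
short_digest = hash_passphrase("abc123", salt)
long_digest = hash_passphrase("a much longer passphrase " * 20, salt)

# Both digests have the same length, so the database column
# does not need to grow with the passphrase.
print(len(short_digest), len(long_digest))
```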

Have you ever come across similarly pointless restrictions?  Have you yourself been made to enforce things like this?  I’d love to hear about it.

*Insert your favorite DBMS (and its suite of hashing/encrypting functions) here.

**I feel compelled to disclose that I am not, nor have I ever been, a Star Trek fan.  Until last Friday.

Update: A certain [undisclosed] financial institution is also limiting their passwords – to 12 characters.

Update: Apparently MySpace limits their passwords to 10 alphanumeric characters (found this out through work).

Discussion Closed Temporarily

June 15th, 2009 Chris No comments

I’m in the process of moving this blog to a new web host this morning, so until the DNS records propagate fully, no new content will be allowed (as per the steps outlined in this post).

Should be back up in a day or two.

Update: We’re back up, folks.

eval( “round 2” )

June 12th, 2009 Chris 1 comment

In my previous post entitled Don’t be lazy. Don’t use eval(), I outlined a few reasons why eval() is a bad (if really, really cool) function to use.  So bad for programmers (both practically and rhetorically), in fact, that my thesis has become: forget its very existence.  From that article:

we are spoiled.  All of us.  We are lulled into a false sense of security by believing that throwing more (and better) hardware at a problem is a sufficient excuse for writing poor code, shoddy algorithms, and overall paying less attention to detail.  Don’t get me wrong, Jeff’s argument is carefully thought out and well presented.  But I take it with a grain of salt.  In fact, lots of salt.  Don’t use fast hardware as an excuse to be lazy.  Where does the habit of eschewing proper paradigm and using your hardware as a crutch stop?

There are several points in that paragraph I think are worth noting.

  1. Yes, programmer time is expensive, and hardware is cheap.  There is a lot (a lot) of store to be set by simply getting job X done quickly.  And, in certain circumstances, more hardware is the way to go.  Some problems can’t be solved by reducing algorithmic complexity and micro-optimizing code at the assembly level – some problems, by nature, require scale.  But that doesn’t mean that we can abandon complexity theory, cast memory consumption to the wind, and start blowing processor cycles on pointless memory lookups for every program we generate.
  2. eval(), in any language, is a security risk.  Perhaps that risk is minimized when running in a sandbox, but there is still risk.  And if someone is determined enough to break something, they’re going to do it anyway.  The old adage “locks only keep honest men honest” applies here.  Why make it that much easier?  PHP and JavaScript can’t sandbox, though, so you’d be living in a fool’s paradise to use eval() in those places.  In Python, sure, you’ve got your “safe zone”.  But, as several readers have pointed out, there are still ways to exploit the system.  A precaution against this, of course, is to scrub the input before running eval().  Ensure that all identifiers strung together in your to-be-eval()’d code are valid tokens in their own right.  But at that point, you’re spending so much time hand-holding the code going through eval() that you might as well have just programmed it yourself using iteration, function pointers, callbacks, and the like.  The argument at the heart of the eval() security debate lies along the same lines as “blacklisting” versus “whitelisting”.  Cleansing eval() input is blacklisting.  Choosing an alternative method of doing the same thing without eval() is whitelisting.  This post traveled through a number of firewalls to reach your eyes.  Firewalls are whitelists.  Symantec is a blacklist.  Which fail more frequently, firewalls or antivirus programs?
  3. eval() can evoke massive laziness.  There have been about three times I can recall considering its use (yes, against my own thesis).  Each time, I sat back and thought about an alternative solution.  Each time, I was able to envision a module designed without eval() – and those alternative designs were always more architecturally sound than the code produced when eval() was thrown into the mix.  “But it still would have been quicker to use eval(),” you might say.  And you’d be right.  In the short term.  Overall, maintenance would be more difficult, readability and reusability would suffer, bugs would become more difficult to track (even with unit testing), and you’d have done neither yourself nor anyone else any favors by taking the cheap route.
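The whitelisting idea from point 2 can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the calculator scenario and the names ALLOWED_OPERATIONS and calculate are hypothetical, not code from the original post or its comments.

```python
import operator

# Whitelist: the only operations that can ever run, whatever the user types.
ALLOWED_OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def calculate(left, symbol, right):
    """Apply a whitelisted operation; reject anything not in the table."""
    try:
        operation = ALLOWED_OPERATIONS[symbol]
    except KeyError:
        raise ValueError(f"unsupported operation: {symbol!r}") from None
    return operation(left, right)


print(calculate(6, "*", 7))
```

The eval() version would be a one-liner, but then every string the user can type becomes something you have to scrub. With the lookup table, hostile input has nowhere to go: it is either a key in the dictionary or an error.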

All that having been said, one reader noted that without eval(), Firebug (a web development plugin for Firefox without which I’d be lost) would not be possible.  I disagree with that statement.  I use Firebug daily, and I’ve used the console to run code on the fly a grand total of two times – and one of them was just for the sake of trying it.  Firebug presents such advanced reporting (in DOM traversal, scripting errors, and traffic analysis) that I simply don’t use the console in that fashion – and I don’t see how anyone else would need to rely on it.  I understand that it’s widely used – my point is that it’s in no way necessary to practically debug web apps, or to rhetorically advance the state of the art of programming, metaprogramming, et al.  However, it’s a feature widely used by many fine folk I’d consider top-notch developers, so there you have it.

In class, Dr. Parson presented us with a direct example of how eval() is used in some of his upper level classes.  He said that eval() is used to generate functions which get attached to classes (and thus become members) at runtime, saving the developer many pointless repetitive keystrokes writing banal code.  I agree – eval() can have a purpose here.  But not in the production system.  No – a more responsible use of eval() would be in a developer script to generate these methods in a blank file, and then copy/paste the code into the class definition.  You get the same effect: developers hitting fewer keys (hear, hear!) but you avoid the pitfalls of having eval() run every time your application fires up.
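Here is a minimal sketch of what such a developer script might look like. It is my own illustration, not Dr. Parson's code: the function name render_accessors and the Account example are hypothetical. The script only prints source text, and the developer pastes the result into the real class definition.

```python
def render_accessors(class_name, field_names):
    """Return Python source for a class with one getter per field."""
    lines = [f"class {class_name}:"]
    if not field_names:
        lines.append("    pass")
    for name in field_names:
        lines.append(f"    def get_{name}(self):")
        lines.append(f"        return self._{name}")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # Run once at development time, then paste the output into the project.
    print(render_accessors("Account", ["owner", "balance"]))
```

The repetitive typing is still avoided, but the generated methods are ordinary code that shows up in diffs, code review, and stack traces, and nothing is evaluated when the application starts.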

Those who claim that Python (and PHP, JavaScript, et al.) are themselves giant eval()s are absolutely correct in their assertion; however, they draw illegitimate conclusions from the fact.  That premise isn’t sufficient ground to argue that eval() is just dandy.  This type of faux logic is tantamount to replacing the driver’s seat of your sedan with a Go-Kart – you could make a hundred arguments in favor of doing so, but at the end of the day, it just doesn’t make sense.

For those of you who caught my minor rhetorical shift there, you’ll note that I’m essentially qualifying my thesis with the following: in the final product.  Developers create throw-away one-time “scriptlets” all the time to get quick and dirty jobs done.  I go through several per day, just in my quest to have the computer do what it’s better than me at (which is iteratively following rules).  Go ahead – instantiate a Knife object, invent some code involving butter, toss it to eval(), slam that in a loop, and run it until your process reaches its virtual memory limit* – but if that type of thing appears in your final product (the code that matters to everyone else), then I submit that you could be doing a better job.  Perhaps you aren’t doing a bad job, but you could be doing better.  Maybe I’m a perfectionist, maybe I’m just a little loopy, but whenever I get the feeling that I could be doing something better, it bugs me until I go back and address it.

*This reference comes from an absolutely hilarious quip Dr. Parson made at the end of his rebuttal of the original post.

Failing Forward

June 11th, 2009 Chris 2 comments

Due to issues with my current registrar (1&1) it looks like I may not be moving the site this week after all.  That detail remains to be clarified.  However, in preparing to move the site, I realized just how inappropriate and… downright stuffy the title “Programming with Poise” was.  I pondered for only a few minutes before discovering what I think to be a perfect fit.  Lately, I’ve been thinking a lot about the concept/mindset of “failing forward” – and it’s a catchy phrase, too.

On failing forward: We all screw up.  Sometimes it’s big, sometimes it’s little.  But it does happen.  What makes the difference in {your job, your relationship, your newest project, your diet, whatever} is how you handle that failure.  Do you run and hide?  Ignore it?  Point the finger?  Shrug it off?  Strong-arm your way past it?  Or do you acknowledge the failure, embrace the consequences, learn from the experience, and save the little pearl of wisdom gained for the “next time”?

Relocation

June 6th, 2009 Chris 2 comments

Within the next week or so, I plan on migrating this site to a different host.  Here’s my plan, so as never to experience any downtime – I hope the procedure proves helpful for someone, and if you’ve got any thoughts on how to more smoothly achieve the same result, I’d love to hear from you.

  1. Update the DNS TTL to a small value (I’ll choose 1 hour).
  2. Configure the new web host, and copy the website itself to its new home.
  3. Disable commenting on both sites.
  4. Copy the old database content to the new host.
  5. Update the DNS records to point to the new host.
  6. After about 24 hours, take the old site down.
  7. Enable commenting on the new site.
  8. Change the DNS TTL back to something more reasonable (say, 1 week).

In this fashion, I’m able to transfer the site to an entirely new web host, and the only interruption in service will be the disabling of comments – but all of my content remains available.  If I’m not generating any new content (a given) and neither are my readers (given that I’ve frozen discussion), then I can have two copies of the site running simultaneously with no negative ramifications.  When I’m certain that there are no outstanding stale DNS records, I remove the old site, and allow discussion again.  Voilà.
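How long is long enough to be certain? A rough sketch of the timing follows, under two assumptions of mine: that the TTL before step 1 was one week (the "reasonable" value from step 8), and that resolvers actually honor TTLs. A cache that fetched the record just before the TTL was lowered keeps it for the full old TTL, and a cache that fetched it just before the switch keeps it for the new TTL, so the old host is needed until the later of those two deadlines.

```python
from datetime import datetime, timedelta


def safe_to_retire_old_host(ttl_lowered_at, old_ttl, switched_at, new_ttl):
    """Earliest time at which no well-behaved cache still holds the old record."""
    # Caches filled just before the TTL change keep the record for old_ttl.
    old_ttl_deadline = ttl_lowered_at + old_ttl
    # Caches filled just before the switch keep the old record for new_ttl.
    new_ttl_deadline = switched_at + new_ttl
    return max(old_ttl_deadline, new_ttl_deadline)


# Hypothetical dates and TTLs, for illustration only.
lowered = datetime(2009, 6, 6, 9, 0)
switched = datetime(2009, 6, 15, 9, 0)
print(safe_to_retire_old_host(lowered, timedelta(weeks=1),
                              switched, timedelta(hours=1)))
```

If the records are switched before the old TTL has run out, the first deadline dominates and the one-hour TTL from step 1 has not yet bought anything, so the 24 hours in step 6 may be too short in that case.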

What does your nail look like?

June 4th, 2009 Chris No comments

In a previous post, I outlined a habit I have of using, over-using, and downright abusing “new” tools that I learn for a while – until their novelty wears off and I start to learn the line between when it is, and when it is not, appropriate to use them.  Here’s the gist:

I call this the New Solution Syndrome.  The person has been given a new tool.  A new way of doing things.  A solution in search of a problem.  People (when subject to the New Solution Syndrome) will then go, and try to find problems to solve with their new tool, even if it isn’t necessarily the quickest, cheapest, or more significantly, the best tool to use for the job.

Posted to that article was a particularly concise comment:

Think of it this way: you used the hammer so much in the right and wrong situations that you developed a very accurate picture of what a nail is supposed to look like.

Reflecting on this, I had one of those coding “poof” moments.  One of those thoughts that you – and every mother’s son what calls himself a coder – have thought thousands of times before.  Sometimes, though, I think we have these thoughts without ever… truly… grokking the real concepts at hand.  Say you have some sort of tool (i.e. a hammer) to choose for each of the following:

  • Before you go to create a repository for that new project, figure out what the nail looks like.
  • Before you go to author a brand-new file, figure out what the nail looks like.
  • Before you write another function/method, figure out what the nail looks like.
  • Before your mind starts wandering about how best to re-write that bottlenecked bit of code, figure out what the nail looks like.
  • Before anyone can cloud your vision and poison your thoughts with their own opinions about architecture, figure out what the nail looks like.
  • Don’t even think about changing that server config file before you figure out what the nail looks like.

“Duh,” “obvious,” and “what does this guy take me for?” are all responses I would expect you to be thinking at the moment.  But take thirty seconds, and think about some recent software development faux pas.  I don’t mean general coding blunders – a seg fault here, an array index out of bounds there, a few NullPointerExceptions and a nice infinitely recursing function – those are the minutiae of our work.  I want you to think of system design flaws, overly complex interfaces, major scaling problems, or a case of software simply not meeting its spec.

Why do these things happen?  [And they do happen, in the real world.]

Because someone, somewhere down the line, sat down and grabbed a hammer before they really knew what their nail looked like.  Maybe they sat down and spilt too much rhetoric at a board meeting, maybe they stood and drew inaccurate pictures on a whiteboard, maybe they sent that usability report off in too much of a hurry, or maybe they chose a language based on mere whim and personal preference.  It doesn’t matter.  That person didn’t take the time to fully comprehend the task at hand, and now there are issues.

If that person was you, then it’s your fault for failing to look before you leapt.  Accept that, move on, and pay attention next time.

If that person was a superior, then it’s your fault for letting someone else think for you.  Accept that, move on, and pay attention next time.

If that person was an inferior, then it’s your fault for leading them astray.  Accept that, move on, and pay attention next time.

At the risk of digressing into a societal flame war here, I need to say that people are far too concerned about shifting blame, and about apportioning credit, and not nearly concerned enough about getting the job done right.  Coding well is as much about taking responsibility and ownership, and about making informed decisions, as it is about writing code.  One of the easiest areas to start this habit is in choosing which tool(s) to use for a given job.  Of course – you can’t know what kind of hammer you need until you figure out what the nail looks like.