Archive for April, 2009

The New Solution Syndrome

April 15th, 2009 Chris 2 comments

Ever randomly hand someone a hammer?  What’s the first thing they do?

They look around, and try to figure out what they’re supposed to do with it.  The thought process is, “Okay, I now have this tool.  How should I be using it?”

I call this the New Solution Syndrome.  The person has been given a new tool.  A new way of doing things.  A solution in search of a problem.  People (when subject to the New Solution Syndrome) will then go and try to find problems to solve with their new tool, even if it isn’t necessarily the quickest, the cheapest, or, more significantly, the best tool for the job.

I have chronic New Solution Syndrome.  I remember, way back when, first learning the joys of regular expressions.  For about a week straight, my answer to every question was Perl.  You could have asked me how to change a flat tire, and I would have prescribed a negative lookbehind assertion with a subpattern capture to prove, very simply, that you should never have driven over that bottle.

And for a while there, my response to any programmatic problem was just to “regex the hell out of it.”

That left most of the projects that I started during that time period in nothing more than a pile of digital rubble in the form of backslashes, periods, and question marks.

But now I consider myself somewhat of an authority on Perl Compatible Regular Expressions.  Coworkers at my former full-time employer would routinely come to my desk with questions on PHP’s preg_* series of functions.  I’ve built a C++ wrapper for the PCRE C API.  I’ve even corrected a professor on the use of assertions in Python’s re module.  And, most importantly, my first reaction to regular expressions is “don’t use them if you don’t need them.”

Quite the 180.  From pattern matching every string that passed under my nose, to preferring NOT to use regular expressions.

During my New Solution Syndrome with regular expressions, I learned a lot.  In fact, I learned a whole lot.  Because of my fervor to use the new tool in every possible scenario, not only did I quickly need to master many dimensions of the tool’s functionality, but I also faced many a hardship when trying to apply the tool in novel (read: usually foolish) ways.  This approach quickly enlightened me to the practical applications of the tool, as well as the impractical ones.  And knowing when to use a tool is just as important as knowing when not to.

I had another outbreak of the New Solution Syndrome when I learned how to program with threads.  And another when I learned how to program with sockets.  Another with Ajax.  With callbacks.  With pointers.  Shell scripts.  Version control.  Polymorphism.  Inheritance. Bitwise operations. On and on and on.  I’m a chronic victim of the New Solution Syndrome.  And I couldn’t be happier.

Sure, when I get in a fit, it distracts me for a while.  And after the fit is over, I probably have to go back and rewrite a bit of code.  But by the time the cycle completes, I know more about the tool than the average bear, and I have a better understanding of its benefits, its drawbacks, and its reasonable use.

So go ahead, go crazy with that novel tool of yours until the excitement wears off.  Smash an egg or two with a hammer.  It’ll only land you with greater experience in the end.

Categories: Uncategorized Tags:

The Sleeper Social Network

April 9th, 2009 Chris No comments
  • Xanga – yes, it’s still around.
  • Facebook is getting too cocky for its own good.
  • Twitter seems to have addressed its scaling issues (TechCrunch had a fun time over the summer, posting frequently regarding outages).
  • MySpace, the Internet Cesspit, hit such a popularity peak that it’s not going anywhere, unfortunately.
  • Orkut is huge [in Brazil and India].
  • Classmates.com is, despite their horrible advertising, performing well enough (according to ComScore).
  • Digg just keeps [slowly] gaining momentum.
  • Wallop is Microsoft’s half-baked and obligatory “Hey!  Look at me, too!” site.
  • LinkedIn is a great site (although I think they could use a redesign).
  • Yahoo gave up on Mash.

I could keep going for quite a while.  It’s no small secret that the web is turning into a social platform.  We’ve even got companies that try to aggregate social sites.  And ones that try to abstract the very idea of a social site.  This stuff is really settling in.  But what is one name you don’t see on that list?  One big name on the Internet scene.  One that starts with a G.

googlebot

Yes, Google bought Orkut.  But they’ve done nothing with it, nor does it appear as if they will.  They’re more interested in the real players of tomorrow’s social data streams.

I’m not sure that anyone has stopped to assess Google as a social networking platform.  Let’s go through the procedure.

  1. You have your Gmail account because, hey, it’s the single best interface to email you’ve ever used.
  2. Gmail gives you an instant messaging address.  Then it embeds the chat client into your browser.  Got friends who are on Gmail, too?  You can chat with them while responding to your emails and contributing to your conversations.
  3. A friend sent you an email and offered to meet you for dinner.  They said, “Let’s meet at the Tavern at 6:30 this Wednesday.”  And Gmail, off to the right-hand side of the message when you open it, posts a neat little link to add this event to your Google Calendar.  Because, if you’ve got a Gmail account, you’ve got a Google Calendar account.  Convenient.
  4. A coworker sends you an email with a Word Document attachment.  Gmail appropriately adds a link where you can open the attachment as a Google Doc.  Edit it right online.  That way, if you have to work in the other office tomorrow, you can access the current file version from wherever you have an Internet connection.  Better yet, allow your associate access, and you can both edit it, on the cloud, in real time.
  5. A family member emails you some pictures, and you notice that it’s convenient to upload them to Picasa Web, since you’ve already got a Google Account.  It’s a pretty intense application, so you stick with it.  You can share these photos with anyone you wish.  Set up slide shows.  Tag people.  Comment.
  6. You visit certain sites with RSS every day, so, you decide to try out Google Reader.  See how it works.  It works well, so you stick with it.  Then you notice that when people from your Gmail contacts share items on their Reader account, you can view those shared items, and vice versa.  You can comment on shared items, by the way.
  7. Google now owns YouTube, so you can post, share, comment on, and rate videos without creating another account.  Nifty.
  8. Do you have a blog?  Lots of people do.  It’s called Blogger.  Which Google now owns.  A good site, and they made it better.  Of course blogging allows for link swapping and comment-posting – that’s the point.
  9. If you’re familiar with Google Checkout, a growing number of sites are supporting it now.  I trust Google more than I trust your-favorite-trinket-site.com.  So, you pay Google, Google pays your hobby site, and only one party ever sees your credit card numbers.  You just have to log in with your Google Account.

I use all of these applications.  They work.  And they work well.  In fact, they routinely exceed my expectations.  That’s why I still use them.  They’re quick, clean, concise, and easy.  And, a lot of times, fun too.  And hey – if I have family, friends, or coworkers, then I can share, collaborate on, and comment on, content from any of the above services with them.  And, as the months continue, these applications are all talking to one another more and more.

Sound familiar yet?

A “real” social networking site builds its own brand of a contact list.  And then, around that concept, it takes “real-life” ideas like email, instant messaging, status updates, photo and video sharing, blogging, microblogging, etc., and sort of smashes those concepts down into the framework it has built.  Facebook messages are a concise form of email.  A MySpace page is kind of like a blog.  Twitter is nothing but a bunch of short status updates.

These companies take the idea of connecting people, and then add in boiled down versions of other standard and proven Internet activities.

Google is not a social networking site.  Google is a social networking platform.  But why don’t people ordinarily think of it that way?  Because they’re coming from the other direction.  Their (well deserved) success in web search allowed them to launch a number of different projects, specifically the above list – their core application suite.  And then they said “Well, one Google Account gives you access to all of them, so why don’t we let the people, and the applications, communicate?”

And now, in this stage, you can create a public profile, and manage many different services from one single page.  What gives them the advantage?  They’re not restricted to the brand of one single site.  Their platform allows them to develop full scale and enterprise ready web applications (or buy them – think YouTube, Orkut, maybe Bebo or Twitter, we’ll see), and then simply link them to your account.  Google’s ownership of YouTube will do to the average video site what Wal-Mart is doing to that local Mom and Pop store that’s been around since you were born and has just gone out of business.

Genius.  And bravo to Google.  It makes them a dangerous player in the social networking market, whether or not ComScore considers them in the running for such.  It also puts the onus on Google to get things [like security] right.  Recent bugs found in Google Docs haven’t made privacy advocates happy (of course to be fair, privacy advocates will never be happy).

Google: Kudos for the sleeper attack.  But don’t lose your focus or drop the ball.  Privacy concerns and anti-trust bluffs are two current, very public, attack vectors to your strategy.  If the market finds a third, it’ll hurt.  Resoundingly.

PHP's empty()

April 8th, 2009 Chris No comments

Two quick notes on the use of the empty() function in PHP.

1) It’s actually a language construct, not a function.

2) As per #1, you can’t do things like

if ( empty( array_intersect( $array1, $array2, $array3 ) ) ) {
    ...
}

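If you try, PHP will break (with no error message, if error display is turned off), and you’ll waste approximately two hours trying to figure out why the rest of your function is wrong, when it’s not.  The workaround is to assign the function’s return value to a variable first, and then pass that variable to empty().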

Class Wrapper #1: Thread (and Mutex)

April 6th, 2009 Chris No comments

In keeping with my promise, here is the first bit of my C/C++ library I felt like sharing.  This object oriented wrapper abstracts the POSIX Thread C API.  The library is mighty useful, exceptionally powerful, and devastatingly troublesome to debug if you’re not careful about it.  This post is not intended as a full reference or tutorial on the pthreads library, nor does my wrapper attempt to encompass all functionality possible with the library.  I’ve merely packaged the basic threading functionality into a couple of classes, and slapped my name on it.

If you have absolutely zero clue what threading is, I recommend you follow the above link.  Maybe with some basic Google or Wikipedia searches to supplement it, and bootstrap your knowledge.  From 30,000 feet, threading effectively allows your program to run in two (or more) different places at once.

That having been said, the rest of the post assumes some working knowledge of threading.  The concept I’ve designed with my Thread class is twofold.  First, it eases passing arguments into, and retrieving return values from, threads.  Secondly, I’ve built the class such that each Thread can be pseudo self-aware.  By that I mean, you can inspect a Thread object to see whether or not there is an underlying pthread running for it, and test whether that underlying code has executed to completion and would be ready to join, without actually join()’ing the thread (which could cause the parent thread to block, if the child thread wasn’t done yet).

Thread awareness is easily achieved, at the cost of one layer of indirection.  The programmer must specify a function pointer (which is where the newly created thread will begin execution) and an argument (as a void*).  The void* that gets passed into the starting thread function, though, is NOT the value you specified, but rather a pointer back to the Thread object.  With a pointer to the Thread object, the argument you actually want can be accessed via the method Thread::argument() as a void*.  This abstraction is useful, because with a pointer to the Thread object, you can relay status information to the parent thread, send and receive messages and other data, and so on.  In particular, just before the “return” of the child thread, you can make a call similar to

thread->complete( true );

which sets a flag that is, of course, visible to the parent thread.  This allows you to check the status of child threads, like so,

if ( processing_thread->complete() ) {
...
}

without any potentially blocking operations.  I’ve found this technique to be highly useful in applications where it is desirable to maintain a continuous pool of threads, and constant blocking on pthread_join() can really be a drag.
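
To make the indirection concrete, here is a minimal sketch of the idea.  It is not the thread.h listed in this post: the class layout, the methods start(), running() and join(), and the double_it entry function are all assumptions made for illustration.  Only argument() and complete() mirror the calls mentioned above.  It also assumes a C++11 compiler (for std::atomic).

```cpp
#include <pthread.h>
#include <atomic>
#include <cassert>

// Hypothetical sketch of a self-aware thread wrapper (not the real thread.h).
class Thread {
public:
    typedef void* (*entry_fn)(void*);

    Thread(entry_fn entry, void* arg)
        : entry_(entry), arg_(arg), running_(false), complete_(false) {}

    Thread(const Thread&) = delete;
    Thread& operator=(const Thread&) = delete;

    // Start the underlying pthread.  The entry function receives a pointer
    // to this Thread object, NOT the user's argument.
    bool start() {
        assert(!running_);  // this sketch allows one pthread per object
        complete_.store(false);
        running_ = (pthread_create(&handle_, nullptr, entry_, this) == 0);
        return running_;
    }

    void* argument() const { return arg_; }   // the value the user passed in
    bool running() const { return running_; } // started and not yet joined

    void complete(bool done) { complete_.store(done); } // set by the child
    bool complete() const { return complete_.load(); }  // polled by the parent

    // Blocks until the child returns, then hands back its return value.
    void* join() {
        void* result = nullptr;
        if (running_) {
            pthread_join(handle_, &result);
            running_ = false;
        }
        return result;
    }

private:
    entry_fn entry_;
    void* arg_;
    pthread_t handle_;
    bool running_;
    std::atomic<bool> complete_;
};

// Example entry point: doubles the int that the parent passed in.
void* double_it(void* self) {
    Thread* thread = static_cast<Thread*>(self);
    int* value = static_cast<int*>(thread->argument());
    *value *= 2;
    thread->complete(true);  // last step before returning
    return value;
}
```

A parent would construct Thread worker(double_it, &some_int), call worker.start(), poll worker.complete() between other chores, and call worker.join() only once the flag is set.  The flag is atomic in this sketch so that the parent's polling does not race with the child's write.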

Header file: thread.h

Implementation file: thread.cpp

Test driver / example program: thread_main.cpp

Note: Be sure to add the “-lpthread” linker directive when compiling.  At least, that’s what it takes on my particular system (Fedora 10 on an i686).

Update: Added a Read/Write Mutex in addition to the standard Mutex object.

Releasing Code

April 6th, 2009 Chris No comments

I intend to release [most of] my ever-growing C++ library.  Some of the code abstracts C APIs, and some just contains and packages common tasks.  All of it (for the most part) is object oriented.  I first mentioned this library a while ago, but I figure it’s best if I actually begin releasing the code.  It forces me to critically examine, document, and debug the classes (making better software – which, after all, is the whole point), and just maybe someone out there will find the stuff useful.  I’ll put the first one up tomorrow, if all goes well.

Some things to note:

  1. I am not perfect, nor is my code.
  2. If (read: “when”) you find bugs, have suggestions, modifications, improvements, etc., let me know.
  3. The code will all be released under the GNU LGPL (v3).
  4. If you find any of the code useful, give me a shout!  I’d love to hear how and where my code is being used.

The Empty Pool

April 5th, 2009 Chris No comments

The follow-up to my post on the C sockets API has been a long time in coming.  It’s been a bit of a crazy semester, but I assure you we’re closing in on a time when I will consider the OO wrapper ready for public release.

But I’ve been sidetracked.  Where?

I did some research (the exact in-depth details and findings perhaps are fodder for a future entry) and concluded that, as a responsible citizen of the Internet, from now on, whenever I write a networked application, I am going to ensure that it is IPv6 compliant.  Really – that’s why I’m rewriting my Socket class.  Are you man (or woman) enough to match that?

For some time now, we’ve known that the IPv4 address space is running low.  It’s estimated that by mid-2011, the IANA will deplete its unallocated pool of IPv4 addresses.  Following this, well into 2012, the first RIR is projected to deplete its store of addresses.  Some people claim that these estimates are entirely too aggressive, and that we’ve got plenty of years of unallocated IPv4 addresses left.  Perhaps (and hopefully) that is the case.  Whenever it happens (even if it’s ten years from now), though, it WILL happen.  We’d be wasting our time trying to eke out every last IPv4 address instead of adopting the next wave.

So the unallocated IPv4 address pool is running dry.  If nothing is done, no new clients will be able to connect to the ‘Net.  Why aren’t you running around like a chicken with your head cut off?  One of two reasons, I estimate.  Either A) you know about this impending deadline and have taken steps to ensure that your organization is either prepared or preparing to be fully IPv6 compliant, or B) you had no idea IPv4 was running out, knew and didn’t care, or knew but didn’t have sufficient motivation to upset yourself migrating.

If you’re in group A, I applaud and respect you and your proactive tenacity.  Please advise those of us who are less enlightened and experienced.  Guide us.  Mentor us.

If you’re in group B, fear not.  I’m still in the middle of the transition myself.  Journey along with me.  Write your war stories here.  We’ll learn something, it’ll be fun, and you’ll be ready when the IPv4 crunch hits.  Bonus points if you’re an administrator, and you update your DNS/web/mail/FTP/etc. services for IPv6 compliance!

Wherever you sit, I think you can agree that it’s high time we, as a community, embrace this opportunity to advance ourselves.  It’s not a chicken-and-egg problem.  We HAVE the technology – we just have to jump on the bandwagon.
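
For a sense of how small the first step can be, here is a minimal sketch of protocol-agnostic address lookup with the standard getaddrinfo() call.  It is not taken from my Socket class; the function name candidate_addresses and the "IPv4 "/"IPv6 " labels are made up for illustration.  The point is that asking for AF_UNSPEC, instead of hard-coding AF_INET, lets the same code handle both address families.

```cpp
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <cassert>
#include <string>
#include <vector>

// Resolve host/port for TCP and describe every candidate address.
// (Hypothetical helper; the name and output format are illustrative.)
std::vector<std::string> candidate_addresses(const std::string& host,
                                             const std::string& port) {
    assert(!host.empty());

    struct addrinfo hints = {};
    hints.ai_family = AF_UNSPEC;      // accept IPv4 and IPv6 alike
    hints.ai_socktype = SOCK_STREAM;  // TCP only, one entry per address

    std::vector<std::string> out;
    struct addrinfo* results = nullptr;
    if (getaddrinfo(host.c_str(), port.c_str(), &hints, &results) != 0) {
        return out;  // lookup failed: no candidates
    }

    for (struct addrinfo* ai = results; ai != nullptr; ai = ai->ai_next) {
        const char* family = nullptr;
        if (ai->ai_family == AF_INET) family = "IPv4 ";
        else if (ai->ai_family == AF_INET6) family = "IPv6 ";
        else continue;  // ignore any other address family

        // Convert the binary address back to its numeric text form.
        char text[NI_MAXHOST];
        if (getnameinfo(ai->ai_addr, ai->ai_addrlen, text, sizeof(text),
                        nullptr, 0, NI_NUMERICHOST) != 0) {
            continue;
        }
        out.push_back(std::string(family) + text);
    }
    freeaddrinfo(results);
    return out;
}
```

A real client would walk the same list and try socket() and connect() on each entry (using ai_family, ai_socktype and ai_protocol from the result) until one succeeds, rather than only printing the addresses.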

It’s All About Balance

April 1st, 2009 Chris No comments

A few things have happened to me within the past 24 hours or so which have caused me to reflect on the art of balance.

Yes. That’s right. Art.

Balance is a skill.  It’s a branch of science.  It’s a calculation.  It’s an assertion.  These are all very deterministic terms.  Very analytical.  Then why call it an art?  Because whatever your natural affinity for balance, it takes a certain amount of practice and experience to achieve it gracefully.

I’m not talking about physical balance, in the sense of Tai Chi or ice skating.  I’m talking about balance in software development.

Last night I reached a very minor milestone in a personal project of mine (a high-speed data storage system) and I came to a point where I needed to stop and think to myself: How do I proceed from here?  My first and primary goal with this system is speed.  Raw horsepower.  Later, once I feel comfortable with it, I’ll worry about things like high-availability and all the garbage that comes with it, but my first objective is laser focused on insane single-(server-)node performance.  It’s coded in C/C++.  Do I use direct system calls and handle each and every little operation where necessary?  Do I manipulate APIs in their intended format?  Or do I use nice OO class wrappers, and eat a few cycles of overhead due to object creation/destruction and method calls that do nothing more than call their lower-level counterparts?

My answer: Use the wrappers.

This afternoon I was chatting with a new coworker of mine (project information and shameless plugs coming soon!) about a generalized manifestation of the same predicament.  To abstract, or not to abstract?  When you don’t, you save yourself some processing cycles during execution, but maintenance can be a nightmare.  When you do abstract, the job is easier (overall) and saves you time coding.  I also like abstraction because it allows you to focus on what you want to do, rather than the nuts and bolts of doing it.  The whole “forest and the trees” analogy in disguise.

My answer: Abstract.
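
To show how thin such a wrapper can be, here is a minimal sketch of a class that owns a POSIX file descriptor.  It is not code from my library; the class name Descriptor and its interface are assumptions for illustration.  Each method does nothing more than forward to its lower-level counterpart, and the destructor takes over the one job that is easy to forget by hand.

```cpp
#include <unistd.h>
#include <cassert>
#include <cstddef>

// Hypothetical thin wrapper that owns exactly one open file descriptor.
class Descriptor {
public:
    explicit Descriptor(int fd) : fd_(fd) { assert(fd >= 0); }
    ~Descriptor() { ::close(fd_); }  // released automatically, exactly once

    Descriptor(const Descriptor&) = delete;             // no accidental copies,
    Descriptor& operator=(const Descriptor&) = delete;  // so no double close

    // Each method simply forwards to the system call of the same name.
    ssize_t read(void* buffer, std::size_t count) {
        return ::read(fd_, buffer, count);
    }
    ssize_t write(const void* buffer, std::size_t count) {
        return ::write(fd_, buffer, count);
    }

private:
    int fd_;
};
```

The per-call cost is one extra function call that a compiler will typically inline, which is the kind of overhead I’m willing to eat in exchange for never leaking a descriptor on an early return.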

And just earlier this evening I was reading a post by Jeff Atwood where he argues that programmers need not also be hard-core mathletes to do their jobs effectively.  While I agree that not every developer on the face of the planet has to have a mastery of vector calculus or be able to multiply three five-digit decimal numbers in hex, I don’t particularly like the dismissal of the subject in general as applicable to our field.  I digress.

Jeff’s Answer: Learn what you need to, when (and if) you need to.

Why code abstractly when all it does is take up CPU time?  Why don’t you need to spend 4 years studying mathematics to program a simple socket server?  The answer: balance.

There are two kinds of people (wayyyy broad overgeneralization coming up here – prepare yourself) on the Internet.  Those who are looking for finished goods and services to consume, and those who are looking for good ideas and proofs of concept to produce new goods and services.  If you’re in the latter group, the code itself won’t matter a great deal (yet).  If you’re in the former, then your time is probably worth a lot more than whatever you’d gain by using that graphics library in the raw.  That’s one instance where you can probably get by throwing hardware at the issue (again, I agree with the fundamental premise of his post, but I think Jeff takes it too far).

There is a delicate balance to be struck between the intensity of recoding the POSIX threads library for your application, and using the one right in front of you.  Between pure MVC design, and just tweaking this or that for it to work.  Between using regular expressions, and standard string functions.  Between ensuring your loops are well-formed and variables are all passed by constant reference where possible, and not.  Between measuring which is faster for copying strings in C, strcpy() or memcpy(), or not caring and moving on.

You might notice that those arguments get more and more “reasonable” as the list goes on.  There’s always a balance point.  Where Google has the infrastructure, time, money, programmer-hours, technical fortitude, and probably reason, to write their own thread library, you (probably) don’t.  But do you have (and should you spend) resources on deciding whether or not to use PCRE rather than grunting through the task with standard string methods?  It depends.  You’ve got to weigh your time (not just initial development, but also maintenance and upgrade time) against the efficiency gain or hit (depending on how important this is in your application).  It’s all about balance because you want your code to be efficient, but Jeff’s right, your time is more valuable than your hardware.
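
As a small illustration of the regular-expressions-versus-string-functions tradeoff, here is a sketch that answers the same question both ways.  The task (does a string start with http:// or https://?) and all three function names are made up for this example, and it uses C++11’s std::regex rather than PCRE to stay self-contained.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Regex version: short to write, but drags in a pattern engine.
bool has_http_scheme_regex(const std::string& s) {
    static const std::regex pattern("^https?://");
    return std::regex_search(s, pattern);
}

// String-function version: a little more typing, no pattern engine.
bool has_http_scheme_plain(const std::string& s) {
    const std::string http = "http://";
    const std::string https = "https://";
    return s.compare(0, http.size(), http) == 0 ||
           s.compare(0, https.size(), https) == 0;
}

// Uses the cheap version, and checks it against the regex in debug builds.
bool has_http_scheme(const std::string& s) {
    const bool plain = has_http_scheme_plain(s);
    assert(plain == has_http_scheme_regex(s));
    return plain;
}
```

For a check this simple, the plain version is arguably the better balance.  Once the pattern grows optional parts, alternations, and captures, the scale tips the other way.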
