Archive for August, 2008

Useful Applications: Part II

August 8th, 2008 Chris No comments

If you’re reading this post, and you’re familiar with version control software, and you’re not using it for one of your current projects, shame on you.

That having been said, for those unfamiliar with it, let me explain. Version control software is a client-server application which tracks changes in files over time. That’s it. Pretty dull, huh? It has its uses. When you want to start a new project, you create what’s known as a “repository” for it under the version control system. The repository keeps track of all the files you’re dealing with. If you delete a file from your project, you tell the repository that you deleted it. If you add a file to your project, you tell it that you added a file. And when you change files, you tell it which ones you changed. And that’s really all there is to it.

If this seems unnecessary, that’s understandable. It did to me at first. But there are numerous advantages to be gleaned from using version control software such as CVS or Subversion (which is what I use). Ever have one of those dismal “oops” moments? The kind when you’re crawling around on the command line (am I the only one who still uses the CLI?) and you accidentally delete a file? Or you accidentally overwrite the wrong file? Well, them’s the breaks, I’m afraid. Unix isn’t shy. You delete a file, it’s gone. Unless you’re sitting on a nice desktop like GNOME or KDE, which has built-in recycle-bin functionality, that file is history.

Or is it?

If you’ve been using version control religiously – which I submit as the only way to use it – then all you have to do is open up your version control client application (I recommend TortoiseSVN if you’re on Windows, and RapidSVN if you’re running Linux) and revert the dirty deed from the repository. It’s that simple. There’s always the wonderful case when you find a bit of code, and you don’t know when or why it was added, and you remove it, only to find out days later it broke another section of the application. Version control will let you bring up ANY previous version of a file that you’ve committed to the repo (that’s short for repository), at the push of a button. And with a nice file comparison tool, you can easily see what you changed, and where.

A good version control client will also allow you to merge files together. Say you’re working on lines 1-40 of a file, and I’m working on lines 41-100. Well, modern version control systems were built with teams in mind. You commit your changes, and I’ll commit my changes. If our code changes don’t affect one another, then the two will be accepted by the repo without complaint. If they do conflict, however, you’ll be able to see what’s different where, and easily rectify the disagreement.

If you aren’t using version control for every project, then you’re doing yourself a disservice.

As a note, I like Subversion because of its ease of use. Also, Tigris.org (which hosts Subversion) hosts both the Windows and Linux clients I mentioned, TortoiseSVN and RapidSVN, respectively. So it all lives under the same roof, tended by the same great folks. And those file comparison tools I was blathering about? TortoiseSVN has one built right in. For Linux I can recommend Meld. There’s a small learning curve, just in getting used to how these systems work, and learning to discipline yourself into using them properly, but once you get it (I hate to use cliches) it really is one thing you’ll wonder just how you got by without.

For those who don’t have a server of their own, or don’t feel like tooling around with daemon services, there are free online services such as the wonderful Assembla which provide a Subversion server to you, wherever you are. All you need is a network connection.

Useful Applications: Part I

August 8th, 2008 Chris No comments

I’m going to start a small series on applications (or generic types of applications) that every developer should (read: really, really, really should) be using. Now of course I’m going to be able to give examples and recommendations for only the brands that I use. My goal is to introduce new types of software to those who have never used them, and not to wage a religious war bickering about which implementation of the technology is “best.”

Today, we look at automatic documentation software. What’s that… a program that writes your comments for you? Hardly. But what documentation generation software WILL do is take the comments you write in your code and coerce them into a more user-friendly format, usually HTML. And, if you pick the right software, it can do a whole lot more for you.

A lot of developers tend to think of documentation simply as the comments throughout their code.

/* This is a C-style
 * block
 * comment
*/
# This is a shell script comment
// This is a PHP-style comment

And of course there are more. This is nice for the casual developer, personal projects, and the like. But for really large projects, for hard-core open source stuff, or for professional software, there’s something missing here. What if you’re on a team, and you want your team to know how to use an API you wrote, without worrying them with the specifics of its implementation? What if you just want a professional looking interface to your latest and greatest API? Perhaps you’d just like something to throw online to prove to others “Hey, I am actually working on this” without blindly tossing source code onto the interwebs. This is where documentation generation comes into play.

I’ve got experience with Doxygen. Great open source project. For Doxygen (and it’s the same with others, such as phpDocumentor and Javadoc), all you need to do is format your comments the proper way. For a block-style comment, use the form:

/** This is a block comment. */

or

/** This is a
* block comment.
*/

Inline comments (C++-style) go like this…

/// This is a line comment, and applies to the code below it.
<This is some code>

And this

<This is some code> ///< This comment applies to the code preceding it.

That’s it. That’s all you have to do, and Doxygen will create documentation for you in a slew of forms, including HTML, XML, LaTeX, RTF, PostScript, PDF, and even Unix man page format. Here’s an example of Doxygen’s output. Note that all the developers of that code had to do was follow the commenting convention set by Doxygen. Doxygen was responsible for chewing through all of the code, writing the HTML, styles, diagrams, links, lists, tabs… everything!

Doxygen supports “C++, C, Java, Objective-C, Python, IDL (Corba and Microsoft flavors), Fortran, VHDL, PHP, C#, and to some extent D” according to its website. Now like I said at the beginning, which documentation generator you use is your choice, but I’d HIGHLY recommend trying one/some out then actually using it/them. Doxygen is also capable (alongside Graphviz’s dot tool) of creating images that depict call graphs, dependency trees, and the like.
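To put the comment forms above together, here is a minimal sketch of what a Doxygen-ready bit of C++ might look like. The Point struct and manhattanDistance function are made up purely for illustration; the @brief, @param, and @return commands are standard Doxygen commands, though your project may settle on a different style.

```cpp
#include <cstdlib>

/// A 2D point. The triple-slash form documents the declaration below it.
struct Point {
    int x;  ///< Horizontal coordinate (the "<" ties the comment to the preceding member).
    int y;  ///< Vertical coordinate.
};

/**
 * @brief Computes the Manhattan distance between two points.
 *
 * @param a First point.
 * @param b Second point.
 * @return The sum of the absolute differences of the coordinates.
 */
int manhattanDistance(const Point& a, const Point& b)
{
    return std::abs(a.x - b.x) + std::abs(a.y - b.y);
}
```

Run Doxygen over a file like this and the struct, its members, and the function should each get their own documented entry, without the reader ever needing to open the source.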

Oh, a rappa!

August 8th, 2008 Chris No comments

That title should probably read “OO API Wrapper.” Sorry, wrong regional dialect. My mistake.

I’m pretty sure that anyone who has (successfully) used an API at some point will tell you that they are wonderful. What I’m also guessing, from my admittedly limited experience, however, is that a lot of them will also tell you, if they’re honest, that APIs can be clumsy. Why is that? From the C/C++ perspective, the answer is dreadfully clear to me. Many of the most basic and useful APIs are written in C, which means a large portion of them were written using a procedural paradigm.

You’re required to create a resource handle object (be it a pointer to some struct or class as in the MySQL and cURL C APIs, or an opaque ID that often turns out to be a plain old integer under the hood, as is the case with pthread_t for POSIX threads), and then call a barrage of functions (passing the handle as a parameter each time) to manipulate it. Straightforward enough for any reasonable person, I’d estimate. Arbitrarily using cURL as the example, we typically see something like the following…

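// _error (a char[CURL_ERROR_SIZE] buffer), timeout, wait, and agent are defined elsewhere.
CURL* handle = curl_easy_init(); // the handle must be initialized before any setopt call
curl_easy_setopt( handle, CURLOPT_URL, url.c_str() ); // url: assumed to be a std::string, like agent
curl_easy_setopt( handle, CURLOPT_ERRORBUFFER, _error );
curl_easy_setopt( handle, CURLOPT_FAILONERROR, 1L ); // on/off options expect a long, not a bool
curl_easy_setopt( handle, CURLOPT_FOLLOWLOCATION, 1L );
curl_easy_setopt( handle, CURLOPT_HEADER, 0L );
curl_easy_setopt( handle, CURLOPT_NOSIGNAL, 1L );
curl_easy_setopt( handle, CURLOPT_AUTOREFERER, 1L );
curl_easy_setopt( handle, CURLOPT_SSL_VERIFYPEER, 0L ); // turns off certificate checking
curl_easy_setopt( handle, CURLOPT_NOPROGRESS, 1L );
curl_easy_setopt( handle, CURLOPT_CONNECTTIMEOUT, timeout );
curl_easy_setopt( handle, CURLOPT_MAXREDIRS, 5L );
curl_easy_setopt( handle, CURLOPT_TIMEOUT, wait );
curl_easy_setopt( handle, CURLOPT_USERAGENT, agent.c_str() );
curl_easy_setopt( handle, CURLOPT_VERBOSE, 0L );
curl_easy_perform( handle );
curl_easy_cleanup( handle ); // release the handle when finished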

Not too bad, a few lines for the ability to browse the web. Decent enough, says me. Especially when the alternative (without the API that is) would be to use the system() function to pass a string like the following to the OS.
curl -f -v --connect-timeout 30 --data-ascii --url http://www.somesite.org/index.html

Portability? What’s that? Worse yet, you could use libcurl nude. Yikes. And by the way, that string isn’t NEARLY complete, if we’re dedicated to all of the customizations represented in the sample code. I don’t even want to think about that. So… looking at it this way, an API couldn’t get better.

But seriously. What kind of self-respecting programmer doesn’t try, at least once in a while, to fix what isn’t broken? Come on… admit it. You’ve done it. You did it once a long time ago and learned your lesson. And then a situation much like the above arises, and you say to yourself “Robert,” (since that may in fact be how you address yourself) “Robert, this works… but I can do better.” It’s natural.

Some of these APIs, as much functionality as they offer and as wonderful as they are, just don’t cut it sometimes. The libcurl C API, for instance, in practice has you implement a callback or two (at the very least a write callback, if you want the page in memory instead of dumped to stdout), and these are responsible for some very low level memory allocations and pointer dereferences. Stuff that would make one of this new generation of upcoming greenhorn developers (who know naught but their fancy, highly abstracted, memory managed interpreters) run home and cry in the fetal position. Jeez, I didn’t even want to do it, and I’m a fan of the low level stuff. So who wants to do that every time you write an application that needs to rip a web page? Who wants to do it for a SINGLE app that needs to rip just one page?

Not me.

Probably not you.

And, if you’re anything like me, you despise re-writing code. One of the biggest problems with this type of code is the fact that it flies in the face of all that fancy object oriented paradigm you’re expected to exalt. This is my idea. For about a year now I’ve been doing this, and it’s worked wonderfully. Before you use an API, think about it. Then, take the hour to learn the API. Test it a few times. Be a little crazy. Get comfortable with it. And then take a second hour… maybe two… and write a well formed, object oriented wrapper class for the API. Commit this class to your favorite version control system (I prefer Subversion). Then take the time to debug it. Let’s face it. Sometimes, you just have to sidetrack from your project, and I think this is a perfectly legitimate cause.

Let’s examine the situation. You’ve now got a properly written class wrapping an API at your disposal. Granted, some APIs are quite large, so this may be no trivial undertaking. But, get the bare bones working first, and then add on as you need/want to – as with any development project. Instead of all that code *up there* (that doesn’t include stuff like those nasty callbacks, which, using my method, you’d only have to write once) all you need is something like this:

Curl curl;
curl.setUrl( "http://some.place.dot.com/index/page.html" );
curl.exec();

Oh, and the HTML you grabbed can be accessed via curl.getResult(). Isn’t that a lot better? Granted, you’re now probably half a day “off course.” But guess what? You were going to have to learn and get used to the API anyway. You were going to have to hack your way around its quirks anyway. You were going to have to write 90% of that code anyway in your actual application. And, the real reward, the next time you have to use cURL in an application, it’s as easy as copying the wrapper class files into your project directory and using them.

Personally, I’ve got wrappers written for cURL, MySQL, PCRE (didn’t do much for this one, just re-packaged libpcrecpp from the kind folks over at Google), Pthread, and a C++ implementation of the MD5 algorithm I found online. I’ve also got a wrapper on some common time functions in a class called “Timer.” And they’re all committed to a single version control project. When I need to add functionality, I go back, update this project with the proper code, and then, once all is copacetic, copy the updated files to whatever project needed the wrapper.
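To give a feel for how small such a wrapper can start out, here is a rough sketch of what a Timer class over the C time functions might look like. The post doesn't show the real class, so the method names (start, stop, elapsedSeconds, isRunning) and the choice of std::clock() are assumptions for illustration only. Note that std::clock() measures processor time, not wall-clock time.

```cpp
#include <ctime>

// Hypothetical sketch of a small wrapper over the C time functions.
// Not the Timer class from the post, which isn't shown.
class Timer {
public:
    Timer() : start_(std::clock()), stop_(start_), running_(false) {}

    // Begin (or restart) timing from the current processor time.
    void start() { start_ = std::clock(); running_ = true; }

    // Freeze the timer at the current processor time.
    void stop() { stop_ = std::clock(); running_ = false; }

    // Processor seconds between start() and stop(),
    // or between start() and now if the timer is still running.
    double elapsedSeconds() const {
        const std::clock_t end = running_ ? std::clock() : stop_;
        return static_cast<double>(end - start_) / CLOCKS_PER_SEC;
    }

    bool isRunning() const { return running_; }

private:
    std::clock_t start_;
    std::clock_t stop_;
    bool running_;
};
```

Calling code then reads as timer.start(), do the work, timer.stop(), timer.elapsedSeconds(), and never touches clock_t or CLOCKS_PER_SEC directly. Get the bare bones like this working first, then bolt on whatever else you need later.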

Now, of course, the wrapper can’t do EVERYTHING for you. Some APIs, like cURL and Pthread, require callbacks and whatnot. Of course, you deal with these case-by-case as appropriate for your situation. But I’ve found that these wrappers can save a LOT of time in both the writing and debugging phases. Also, they pay for themselves the first time you use them, if for no other reason than the fact that they keep your project code looking neat and object oriented, as it should be. Develop them generically, and you can use the same wrappers for multiple projects (of course adapting them as needed, when needed).
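For the curious, here is roughly what one of those callbacks can look like once it's tucked away inside a wrapper. This is a sketch, not the code from my own Curl class: the function name appendToString is made up, and it assumes the wrapper collects the page into a std::string. The parameter list follows the signature libcurl documents for CURLOPT_WRITEFUNCTION.

```cpp
#include <cstddef>
#include <string>

// Write callback in the shape libcurl expects for CURLOPT_WRITEFUNCTION.
// libcurl calls it once per chunk of received data; userdata is whatever
// pointer was registered with CURLOPT_WRITEDATA (assumed here: a std::string*).
std::size_t appendToString(char* data, std::size_t size, std::size_t nmemb, void* userdata)
{
    const std::size_t bytes = size * nmemb;
    std::string* out = static_cast<std::string*>(userdata);
    try {
        out->append(data, bytes);
    } catch (...) {
        return 0;  // a short count tells libcurl to abort the transfer
    }
    return bytes;  // report that the whole chunk was consumed
}

// Registration inside the wrapper would look something like:
//   curl_easy_setopt(handle, CURLOPT_WRITEFUNCTION, appendToString);
//   curl_easy_setopt(handle, CURLOPT_WRITEDATA, &result);
```

Write it once, bury it in the wrapper, and the calling code never has to see a void pointer again.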

Estimates? We don't need no stinkin' estimates!

August 8th, 2008 Chris No comments

ETAs on projects are among the most necessary evils of software development. Why are they necessary? For a slew of reasons. Think about it: you don’t hand your car to a mechanic without asking him when he’ll be done fixing it. You wouldn’t pay a plumber to renovate your bathroom without finding out exactly how many days or weeks worth of mornings you’d have to walk next door to take your showers. In the same light, we cannot, as software developers, expect our overlords to agree to sending us nice pretty paychecks every Friday without any [reasonable] expectation of when we will deliver them their product.

So why are estimates evil? Well, we don’t have to discuss this one in great depth. Come on, they stink! Right? Think about it, you’ve got a large project ahead of you. One that involves multiple languages and protocols, different APIs, synchronization across processes and maybe even machines, mutual exclusion, exception handling, and/or whatever the case may be. And you’re supposed to [accurately, and with a straight face] tell the guy standing in front of you (who has NO idea what this project entails) when you’ll be done. Yeah… estimates can be rough. Especially if it’s a project that’s relatively novel.

In the past, I struggled trying to get past the barriers to be able to honestly give relatively accurate predictions on software delivery times. But in a stroke of creativity (I don’t get these often, so it’s always nice when it does happen) I stumbled across the perfect analogy.

Picture yourself in Los Angeles. In a vehicle. Now, in twenty seconds or less, I want you to try to estimate how long it would take you, from there, to drive cross country to New York City. Got it? Of course you don’t. It’s difficult enough to even know where to start, unless you’ve actually done this kind of thing before. You can go on Google Maps, and it’ll estimate for you (1 day and 17 hours if you’re feeling a bit too lazy to follow the link). But will it really take this long? Can Google account for traffic jams? Accidents? Bad weather? Police? Detours? Potty breaks? Food breaks? Mechanical problems? Refueling? Sleeping?

Of course not. As much as I’m a Google fan, it’s not that good [yet].

I’ve found this analogy to be strikingly apropos. It’s the perfect way to explain to your manager exactly why it is so difficult to give software delivery estimates. But it doesn’t help anyone who still actually needs to learn how to estimate projects with any degree of accuracy. You can read the Five-Minute-Task Time Estimate Worksheet. This is a good lesson for anyone new to the game, and hilarious for anyone who isn’t.

So, with all I’ve written thus far, you may by now be expecting some revolutionary, well presented, possibly alliterative, three step guide to producing accurate estimates on software projects. Well, here it is – get off your duff and start working. There is nothing you can be taught in this regard, in my experience. The intuition one may seem to have regarding project estimates has only been earned through much development. Many iterations of seeing projects (however large or small, personal or corporate) go from conceptualization to implementation.

Sorry to leave you high and dry, but experience is, after all, the best teacher.