Tuesday, May 21, 2013

Say Goodbye to PHP shell_exec() and PHP exec()

Dealing with the shell in PHP can be a pain. If you are executing anything more complicated than a no argument command with a one line string result, things start getting complicated.

To get started, if you want to check the return status of your commands, you need to pass a reference parameter to exec()! Ok, so it's not returning the command exit status, well then you might expect that it returns the command's output instead? Nope; the return value of exec()is actually the last line from the command's output. If you want to get the full output of the command, then you're looking at yet another reference parameter. And wait, which order do those references parameters go?

After you've got your parameters sorted out, what about escaping arguments for security and accuracy? For that, you're left manually using PHP's other global functions.

Alright, so now you've got a safe, accurate command. How do you know it works? You might want to test it, right? Well, if you want to test any of your code that uses exec(), you're looking at a rough road ahead because PHPUnit has no facility to let you make assertions against parameters passed by reference. Furthermore, if you have more than one call to exec, then you really have no way to stub results out of the box because you can't differentiate between calls to exec. I previously blogged about one approach I've taken to solving this -- http://asheepapart.blogspot.com/2012/10/testing-shell-methods-with-phpunit.htmlMockShell works alright, but it doesn't help us with escaping the command for security or accuracy.

Enter the Bart Command class. Command is a simple class that acts as a facade over a single shell command. It takes a variable number of arguments to its constructor, representing arguments to the shelled command. The first constructor arg is reserved for the actual command itself. Each shell argument is escaped and then substituted for placeholders in the shell command.

A command can be passed around and executed on demand. The results are returned as a full string or an array of each line. If the command fails, a custom exception of type is CommandException thrown.

We're using Command extensively in our internal code bases with a lot of success. Currently, you can see it used in a few places in the Git class.

I also created the gist below,


Testing is easy. There isn't any pass-by-reference magic, you can just apply all your standard testing know-how. You can use Diesel to inject your stub into your system under test. You can also mock the results of the shorthand Shell->command() method.

Monday, May 13, 2013

Bash Completion with SSH or SCP

I spend a lot of time ssh-ing around to the same machines. Using ~/.ssh/config helps me create nicknames for those hosts, but what's even better is bash completion. I created the following script to register the "ssh" and "scp" commands with bash completion so that I can just hit <TAB> to let bash complete the possible machine names.

I spent a lot of time reading up on how the bash completion system works and tried to make this script as clear as possible so that I can adopt the same strategy for future completion work.



Hope this helps!

Tuesday, May 7, 2013

CodeIgniter & PHP 5.4 -- Creating default object from empty value

I, sadly, use Code Igniter to power several of my legacy web applications. We recently upgraded our development environment PHP 5.4 and have not yet upgraded our production systems. We work with these apps infrequently in development and this hasn't been a problem. However, today I needed to test moving traffic from HTTP to HTTPS on one of the apps and kept receiving the several instances of the following message,

Severity: Warning
Message: Creating default object from empty value

Each of these warnings was linked to lines in my base controller (REST_Controller) that were performing assignments to properties of class fields. E.g. $this->rest->db = $this->load->database(...). I was unable to replicate the error on another machine running 5.3 and determined that my problem must be arising from PHP 5.4.

Code Igniter works in weird ways and my first guess was that some magic code or eval was failing due to the elevated strictness of PHP 5.4. So I embarked on a mission to discover where the class fields were being assigned inside Code Igniter. I scoured deep into CI_Config and CI_Loader and Code_Igniter.php, but couldn't find my answer; there were no magic methods or variable variable assignments. Feeling rather dejected, frustrated, and confused, I took another look at the error message: "creating default object from empty value" and an idea formed in my mind: what if PHP was telling me that it was in fact creating the class field right there on the spot?

The easiest way to test that theory was to create the object myself and test if the message still appeared. I updated the code to create the field just before each of the failing lines. E.g.

$this->rest = new stdClass();
$this->rest->db = $this->load->database(...);

And, voila -- no more warnings! So it turns out that the problem had little to do with Code Igniter itself, but just that historic PHP is way too friendly to lazy coders.

Now, with that out of the way, on to testing the HTTPS migration...

Thursday, April 25, 2013

Best Practices of Code Reviews

Long ago I read a great white paper on the Best Kept Secrets of Code Review published by SmartBear software. It's an excellent study of code reviews and offers some great advice. Since then, I've continued learning and owe many thanks to some great engineers and thought leaders. What follows is a compilation of my current thoughts on the subject.

Before I get started, I'll give my definition of code review:

  • Code Review -- a critical analysis by an individual or group of a functional unit of code, be it a diff, a completed work, or a combination of both.

Next, the five types of code review, as defined by SmartBear -- plus one from me.
  1. Over the shoulder
  2. Email
  3. Tools, something like Gerrit
  4. Pair programming
  5. Formal inspections
  6. Self review
I feel that the first two on their own are insufficient for any development organization with a focus on quality. The informality of over the shoulder or email does not induce the proper level of seriousness. Emails can be lost and are not available to the team at large.

A (good) code review tool is irreplaceable. It provides history, contextual comments, patch sets, a web link to reference for the future, etc. It also provides an integration point for build systems and development workflows, a critical component to any team attempting continuous delivery. Pair programming is very valuable, for more than just code review, but is not the subject of my post.

Formal reviews can go either direction. I'll start with a bad one. At a previous company, we used to hold ad hoc, hour long code reviews. We had no tool and instead would invite the entire team into one cramped room in which we'd hook up a laptop to the TV and someone would show us part of their code. There was rarely any preparation on part of the presenter, nor any advance notice of what we'd be seeing. It was kind of like open mic night at the local coffee shop.

Looking back on this, I'm astonished that we didn't realize its ineffectuality. Roughly 70% of the engineers did not pay attention; reviews never started on time; an average max of only 15% of the attendees actually had exposure to the code under review; bugs were rarely found; meetings often devolved into tangential debates. In other words, it was a terrible waste of time.

Not all formal reviews have to be so dreadful. In fact, we conduct very successful formal reviews at my present company. We call them "Code Workshops." These entail advance notice of the code being shared, an assigned reviewer who comes prepared, and a (comfortable) room full of attentive and respectful engineers. The outcome is generally positive and we all leave a little wiser.

Finally, self reviews. A self review is the act of critically reviewing code that you're written. This can happen at any point: before you submit for review by others, before you refactor it, or just for fun on a lonely night. Self review should not be overlooked. It is a critical factor in effective peer code review and overall a fundamental factor to personally developing better code practices.


Given the preceding context, I've come to learn and adopt the following lists.

Learnings

  • The cost of finding a customer reported bug versus finding a bug in development is significantly disproportionate. The cost benefit of ensuring quality through effective code review and other means is immense.
  • Sign-off from senior engineers should not be required. This says "We don't trust you." You should inherently trust the people you hire to be smart enough to *request* code reviews when its needed. If you're faced with habitual stealth submitters, your problems with those employees may extend beyond just code reviews.
  • Synchronous pair code reviewing for complex changes reduces feedback cycles and total time to deliver. Moreover, beyond code review, it promotes rapid knowledge transfer.
  • An engineer's time is your most precious resource. Make sure it's used as effectively as possible and avoid doing anything manually that can be automated. In the context of reviewing code, automate as much as you possibly can with static analysis tools and build systems so that engineer's spend their time leaving unique comments (and not nit-picks about code standars). We use Facebook's Arcanist for our static analysis of PHP and Adam Rosien's Sniff for Scala.

Best Practices

  • Use code review as training for new hires. Keep track of good reviews in a team wiki.
  • Do NOT use code review as a way to teach people how certain code is used. Hold training sessions and record them and upload them internally (or to Box!).
  • Do not oversubscribe reviewers: although having more people look over the code will most likely result in more defects being discovered, the relationship quickly becomes logarithmic rather than linear.
  • Strictly enforce limits on the size of a review change set. Effectiveness in reviews begins decreasing after the first 15 minutes spent reviewing and drops drastically after 60 minutes. Respecting this practice will also help when bisecting code history during bug hunting.
  • Code reviews of unfamiliar code should be accompanied by a brief overview or an in person conversation.
  • Before requesting a code review, the original author should very critically review their own code. A reviewer is not a janitor. In fact, authors often find several problems in this phase.
  • Create a checklist of common issues. Checklists helped uncover 30% more defects in equal amounts of time in studies conducted by SmartBear (paper, page 126).
    •  A checklist should be supplemental to automation by a static analysis phase of your build system.
    • See sample checklist on page 112 of the SmartBear white paper.

Tips

  • Reviewers tend to spend majority of time on declarations and signatures.
    • Keep variable declaration closer to their use.
    • Use shorter functions.
  • Reduce feedback cycles by leaving concise, intelligent comments.


Final Thoughts


If I could encourage you to do one thing, it would be to find a good tool that works for your team and diligently learn how to use it to its maximum potential. Integrate it into your development process and use it in your build system. Treat it with the same level of importance as your code repository.

The bottom line is that no matter how you're doing it, code reviews help your team do better. These bullet points are the product of several years of software development and teamwork, but nonetheless simply remain suggestions and not prescriptions.


Further Reading

http://smartbear.com/resources/whitepapers/best-kept-secrets-of-peer-code-review
http://www.ibm.com/developerworks/rational/library/11-proven-practices-for-peer-review/



Friday, April 12, 2013

Gerrit Code Review - Unpack error Missing unknown (Continued)

Several months ago I blogged about an issue I encountered administering the Gerrit Code Review system, version 2.3. Sadly, we had a recurrence today. This time, we had to take a different approach to solve it.

The problem manifested in much the same way. All of our gerrit users began seeing the following error message when pushing to Gerrit,
fatal: Unpack error, check server log 
remote: Resolving deltas: 100% (4/4)
error: unpack failed: error Missing unknown 98aab6fe33971a17b1bfb5a5288070e14f166b79 

The difference from last time is that this Gerrit installation is not the "owner" of the repository. We use gitosis to manage all of our repositories and keep the repositories replicated across several servers via post-receive hooks. This setup works great for our use case, until it comes to Gerrit. We do not require that all code be code reviewed, consequently not all code gets merged through Gerrit. For this reason, Gerrit can become out of date from the real repository. In order to fit Gerrit into our system configuration, we host one of our replicated repositories on the same server that hosts Gerrit. That repository has a special post-receive hook that pushes with --mirror and --force to the checkout that Gerrit uses. This keeps Gerit up to date with merges happening outside its control.

At some point this afternoon a commit related to an unmerged patch set disappeared from the Gerrit repository. The result was the error message pasted above for anyone trying to push new patch sets into Gerrit.

In this situation, we had no backups of the Gerrit repository. Digging into the Gerrit database, I found a reference to the commit in the the patch_sets table. The change_id lead me to the commit's author. SSH-ing to his machine and inspecting his git checkout, I was able to track down the missing object. Using git cat-file -t 98aab6fe33971a17b1bfb5a5288070e14f166b79, I identified the object was a commit. And upon asking him about it, he said it was the very commit that he had in code review.

I deleted the record from patch_sets, DELETE FROM patch_sets WHERE revision = '98aab6fe33971a17b1bfb5a5288070e14f166b79' LIMIT 1;.

We had him push to Gerrit again. The push was successful and I observed the record was recreated in the patch_sets table. At this point all other pushes from other users began functioning as expected.

Our problems weren't over yet. He was unable to view his review through the Web UI, instead Gerrit showed "Internal Server Error". The following error was showing up in the error_log,


[2013-04-12 19:16:01,526] WARN  / : Error in changeDetail
java.lang.NullPointerException
        at com.google.gerrit.server.project.ChangeControl.isPatchVisible(ChangeControl.java:177)

We identified that the patch_set_id numbers were out of order in the patch_set table for his change_id. We updated them to be in order, but this time, the change detail page loaded with no content. So far, we've been unable to fix that problem. Instead we ended up fixing the problem by generated a new Change-Id and pushing that as a new review into Gerrit, which worked as expected.

So, two questions remain unanswered:
  1. How did we get into this situation?
  2. How can we fix the problem review?
UPDATE April 15th, 2013

We have tracked the problem down with the missing patch set commit object to a race condition between Gerrit writing the commit to the repository and our external gitosis post-receive hooks pushing an independent commit at the same time. My assumption is that internally JGit is not handling lock files correctly, or getting confused with file descriptors. Read more at https://www.google.com/search?q=jgit+gerrit+broken

We've fixed this by removing the  --mirror flag from our post-receive hook, and instead specifying just refs/heads/master. This will reduce the probabilities of race conditions on lock files.

After much fiddling, I was unable to restore UI access to the change review in question. I surrendered to Gerrit and abandoned it over the ssh interface. 



Friday, October 19, 2012

Testing shell methods with PHPUnit

Testing is fun. Testing global functions isn't, especially when those global functions use pass-by-reference parameters.

I write a lot of command line applications with PHP and I often must deal with exec, shell_exec, and passthru. This causes major headaches when I need to simulate a behavior for the purposes of testing.

In Bart, we have a class named Shell, which wraps many of the global shell and system functions of the PHP language. It's also a great place to collect any methods that don't come out of the Box with PHP. Combining the Shell class with Diesel  lets me mock or stub out pretty much all of my interactions with the shell. There's one catch though: PHPUnit doesn't work with mocking methods that have pass-by-reference parameters. Both exec and passthru return information back to the caller via this approach.

Enter the MockShell class. MockShell is a small stub class exposing an exec and passthru method. It also exposes two other methods to configure the commands expected by either of these methods for the duration of a unit test. Finally, it provides a verify() method to be called upon completion of a test run to verify that its expectations were met.

MockShell takes a mocked Shell class instance as a parameter. Any method calls sent to the MockShell that it doesn't understand are passed along the mocked MockShell. This utility of this is to allow you to provide only one stubbed object when you configure Diesel for the test. See below,

function testSymlinksCreated()
{
 $phpuShell = $this->getMock('Bart\Shell');
 $phpuShell->expects($this->exactly(1))
  ->method('mkdir')
  ->with('~/code/nagios/logs', 0777, false);

 // Create the MockShell and supply it the mocked Shell instance
 // ...so that any calls to methods _other_ than exec and passthru
 // ...may be passed on to that mock.
 $shell = new \Bart\Stub\MockShell($this, $phpuShell);
 $shell
  ->expectExec("cd ~/code/nagios && ln -s /etc/nagios config", array(), 0, null)
  ->expectExec("cd /www && ln -s ~/code/nagios nagios", array(), 0, null);

 $this->registerDiesel('Bart\Shell', $shell);

 // Configure a generator for the nagios app
 $g = new Generator('nagios');
 // Expect that this creates the "logs" directory
 // ...and creates the two symlinks above
 $g->generateDirs();

 // Verify that the two execs were called
 // PHPUnit will verify the call to mkdir()
 $shell->verify();
}

Check out all the Bart code at, http://github.com/box/bart.

Saturday, September 29, 2012

Dan Pink on Creativity and Motivation

A friend recently shared Dan Pink's talk on Creativity and Motivation with me. Pink discusses the progression of the 20th century 1st world workforce from a majority of simple task oriented workers to a majority of cognitive, right brained knowledge workers in our modern day. He then proposes that this new workforce is motivated not by the same monetary goals as yesterday, but rather by more fulfilling incentives, which he lists as autonomy, mastery, and purpose.

I wholeheartedly agree that the leading thought workers of today are ultimately attracted to these higher causes of self-fulfillment. Yet, I differ from his purportedly absolute stance that motivating by these three factors and removing monetary reward from the picture is a recipe for success.

Pink provides two extended examples,

  1. Atlassian Fedex day. Fedex day at Atlassian is essentially a 24 hour period during which engineers are encouraged to develop whatever code they want, specifically nothing on which they're currently working. Eventually, this has graduated into a 20% autonomy policy.
  2. Microsoft Encarta versus Wikipedia. Encarta was a multi-year project out of Microsoft, costing them large amounts of money and diligent coordination. Wikipedia promised no money to any contributor nor did it set any schedule for content.
In the case of Atlassian, his claim is that the engineers produce higher quality work faster during autonomous time because they are working for themselves. I agree that this has a lot of merit, but I disagree that this measurement alone can prove his point. In a typical development environment, engineers report to product managers. Product managers set the specifications and timelines. Typically, the communication between the engineer and the product manager is poor and this poor communication leads to bugs, missed deadlines, and subpar products. Much has been written on this subject (see Agile, XP, Scrum, Kanban). While this isn't the only difference between autonomous and required development, it does provide reasonable justification to question the percentage that autonomy plays in the success of Fedex days.

Next, Encarta versus Wikipedia. Again, I agree that there is much more incentive to contribute to wikipedia as an individual versus a requirement to work on Encarta as an employee. However, I attribute to that discrepancy very little influence in the eventual failure of Encarta. Three other factors played a very large role in the outcome of the Encarta / Wikipedia showdown,
  1. It happened right around when the internet was becoming ubiquitous.
  2. Wikipedia is a SaaS model: available anywhere; always up to date; social. Contrast that against Encarta which was available only on your machine, valid only at the time of install, and did not provide a straightforward way to share information with everyone.
  3. Wikipedia is free (See Chris Anderson, Freemium).
Ultimately I'm not opposed to his proposal; I'm just not convinced by his two examples.

My feeling is that material gain must still be taken into account. For example, consider a job that pays $200,000 versus a job paying only $40,000 that promised full autonomy in choosing projects and deadlines: very few engineers will choose the latter. At 200,000 against 160,000, you'd potentially have a strong case.

My conclusion is that autonomy, mastery, and purpose drive higher quality innovation, but that the level at which those drivers operate is variable. Furthermore, this debate needs more unassailable data points.