A guide to using PHP CodeSniffer

By Lorna Mitchell

PHP CodeSniffer (PHPCS) is a package for checking code against coding standards, available from PEAR. It can check code against defined rules covering anything from whitespace through doc comments to variable naming conventions and beyond. In this article we'll look at getting started with PHPCS, using it to check our files, and then move on to creating rules and defining standards.

Installing PHPCS

There are two main ways of installing PHPCS: directly, or via PEAR. Using the PEAR repositories is recommended and adapts to all the various platforms. It's also probably very familiar to all PHP developers!

The alternative is to use the method available for your system – for example my Ubuntu system had an Aptitude package called php-codesniffer which installed this functionality for me.

To use the PEAR method you simply need to update your PEAR repository and then type:

pear install PHP_CodeSniffer

Now we have the package installed, it's time to take a look at what it can do for us.

Using PHPCS

PHPCS is a command-line utility which can output varying levels of detail and evaluate one file, a whole directory, or a pattern match of target files. Its output is a list of flaws found, with an error message and line number supplied.

By default PHPCS comes pre-installed with a number of coding standard definitions. To see which definitions are available for checking, use the -i switch:

phpcs -i
The installed coding standards are MySource, PEAR, Squiz, PHPCS and Zend

This shows some default coding standards including the PHPCS standard, the Zend standard (used by Zend Framework and many other projects), and the widely-known PEAR standard.

It's possible to build on and adapt these existing standards to fit the coding standards used by a particular project, and this subject will be explored further later in the article. The various standards have different requirements, so we can evaluate a simple file against a couple of them to see some immediate differences.

Take the following code sample:

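<?php

class recipe
{

    protected $_id;

    public $name;

    public $prep_time;

    function getIngredients() {
	$ingredients = Ingredients::fetchAllById($this->_id);
        return $ingredients;
    }
}
?>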

Validating this class code against the Zend standard, we use the following syntax:

phpcs --standard=Zend recipe.class.php

FILE: /home/lorna/phpcs/recipe.class.php
--------------------------------------------------------------------------------
FOUND 3 ERROR(S) AND 0 WARNING(S) AFFECTING 3 LINE(S)
--------------------------------------------------------------------------------
 10 | ERROR | Variable "prep_time" is not in valid camel caps format
 13 | ERROR | Spaces must be used to indent lines; tabs are not allowed
 17 | ERROR | A closing tag is not permitted at the end of a PHP file
--------------------------------------------------------------------------------

However, the Zend standard doesn't require some elements which other standards do. For example, the PEAR standard expects capitalised class names and opening function braces on new lines, and this is evident if we validate the same recipe.class.php file against the PEAR standard instead.

We use the same command as before, but with the --standard switch set to PEAR (phpcs --standard=PEAR recipe.class.php):

FILE: /home/lorna/phpcs/recipe.class.php
--------------------------------------------------------------------------------
FOUND 8 ERROR(S) AND 0 WARNING(S) AFFECTING 5 LINE(S)
--------------------------------------------------------------------------------
  2 | ERROR | Missing file doc comment
  3 | ERROR | Class name must begin with a capital letter
  3 | ERROR | Missing class doc comment
  6 | ERROR | Protected member variable "_id" must not be prefixed with an
    |       | underscore
 12 | ERROR | Missing function doc comment
 12 | ERROR | Opening brace should be on a new line
 13 | ERROR | Line indented incorrectly; expected at least 8 spaces, found 1
 13 | ERROR | Spaces must be used to indent lines; tabs are not allowed
--------------------------------------------------------------------------------
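Because each finding is a row of line number, severity and message, the report is easy to post-process, for example to feed the results into another tool. The following is only a sketch, assuming PHP 7.1 or later and a report laid out like the ones above; parsePhpcsReport() is a hypothetical helper written for this article, not part of PHPCS, and the sample report is a trimmed excerpt of the PEAR output.

```php
<?php
// Hypothetical helper: turn the rows of a PHPCS text report into arrays.
function parsePhpcsReport(string $report): array
{
    $violations = [];
    foreach (explode("\n", $report) as $row) {
        if (preg_match('/^\s*(\d+)\s*\|\s*(ERROR|WARNING)\s*\|\s*(.+)$/', $row, $m)) {
            // A normal row: line number, severity, message.
            $violations[] = [
                'line'    => (int) $m[1],
                'type'    => $m[2],
                'message' => rtrim($m[3]),
            ];
        } elseif ($violations && preg_match('/^\s*\|\s*\|\s*(.+)$/', $row, $m)) {
            // A wrapped message: glue it onto the previous entry.
            $last = count($violations) - 1;
            $violations[$last]['message'] .= ' ' . rtrim($m[1]);
        }
        // Anything else (file name, summary, separator lines) is ignored.
    }
    return $violations;
}

// Trimmed excerpt of the PEAR report shown above.
$report = <<<'TXT'
--------------------------------------------------------------------------------
  6 | ERROR | Protected member variable "_id" must not be prefixed with an
    |       | underscore
 12 | ERROR | Opening brace should be on a new line
--------------------------------------------------------------------------------
TXT;

foreach (parsePhpcsReport($report) as $violation) {
    printf("%s on line %d: %s\n", $violation['type'], $violation['line'], $violation['message']);
}
```

The wrapped "underscore" row is joined back onto the message it belongs to, so each entry carries one complete message.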

We can go through our class and alter it to conform to these standards. The changes are mostly cosmetic, but especially on large codebases, consistency and familiarity are key to easier maintenance and readability. There are a few easy things we can fix in our code so that PHPCS reports fewer errors.

Here is the updated class file:

<?php
 
class Recipe
{
 
    protected $id;
 
    public $name;
 
    public $prep_time;
 
    function getIngredients()
    {
        $ingredients = Ingredients::fetchAllById($this->id);
        return $ingredients;
    }
}
?>

The tweaks to the class are small enough to be almost insignificant; a programmer glancing over one version or the other would probably have to look again to spot the differences. The clearest way to compare the files, I think, is the diff between the two versions:

3c3
< class recipe
---
> class Recipe
6c6
<     protected $_id;
---
>     protected $id;
12,13c12,14
<     function getIngredients() {
<   $ingredients = Ingredients::fetchAllById($this->_id);
---
>     function getIngredients()
>     {
>         $ingredients = Ingredients::fetchAllById($this->id);

Now if we re-check the file against the PEAR standard, only the missing phpDocumentor comments are listed as problems with the file:

FILE: /home/lorna/phpcs/recipe.class.php
--------------------------------------------------------------------------------
FOUND 3 ERROR(S) AND 0 WARNING(S) AFFECTING 3 LINE(S)
--------------------------------------------------------------------------------
  2 | ERROR | Missing file doc comment
  3 | ERROR | Missing class doc comment
 12 | ERROR | Missing function doc comment
--------------------------------------------------------------------------------

To get our code to validate correctly we can then add the comments expected by phpDocumentor. There are some great tutorials around for working with this tool, but the best place to start is the project homepage at http://www.phpdoc.org/.

Here's the class with the comments added:

<?php
 
/**
 * Recipe class file
 *
 * PHP Version 5.2
 *
 * @category Recipe
 * @package  Recipe
 * @author   Lorna Jane Mitchell <lorna@ibuildings.com>
 * @license  http://opensource.org/licenses/gpl-license.php GNU Public License
 * @link     http://example.com/recipes
 */
 
/**
 * Recipe class
 *
 * The class holding the root Recipe class definition
 *
 * @category Recipe
 * @package  Recipe
 * @author   Lorna Jane Mitchell <lorna@ibuildings.com>
 * @license  http://opensource.org/licenses/gpl-license.php GNU Public License
 * @link     http://example.com/recipes/recipe
 */
class Recipe
{
 
    protected $id;
 
    public $name;
 
    public $prep_time;
 
    /**
     * Get the ingredients
     *
     * This function calls a static fetching method against the Ingredient class
     * and returns everything matching this recipe ID
     *
     * @return array An array of Ingredient objects
     */
    function getIngredients()
    {
        $ingredients = Ingredient::fetchAllByRecipe($this->id);
        return $ingredients;
    }
}
?>

What's inside the machine

PHPCS works by tokenising the contents of a file and then validating the tokens against a given set of rules. The tokenising step breaks PHP down into a series of building blocks, and the rules can check all sorts of things against them. So if we were to tokenise the function we included in the class file at the start, we'd get something that looks like this:

Array
(
    [0] => Array
        (
            [0] => 367
            [1] => <?php
 
            [2] => 1
        )
 
    [1] => Array
        (
            [0] => 370
            [1] =>
 
            [2] => 2
        )
 
    [2] => Array
        (
            [0] => 333
            [1] => function
            [2] => 3
        )
 
    [3] => Array
        (
            [0] => 370
            [1] =>
            [2] => 3
        )
 
    [4] => Array
        (
            [0] => 307
            [1] => getIngredients
            [2] => 3
        )
 
    [5] => (
    [6] => )
    [7] => Array
        (
            [0] => 370
            [1] =>
            [2] => 3
        )
 
    [8] => {
    [9] => Array
        (
            [0] => 370
            [1] =>
 
            [2] => 3
        )
 
    [10] => Array
        (
            [0] => 309
            [1] => $ingredients
            [2] => 4
        )
... (truncated)

The output is truncated because of the sheer size of it, even when examined with print_r rather than var_dump. The tokenising is actually functionality that's available to us in PHP itself: the token_get_all() function, which accepts a string of PHP source. The function file_get_contents() was used to retrieve the contents of the file as the argument to token_get_all(), and what you see above is the print_r output of that.
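The numeric token identifiers in a dump like this are hard to read, and their values can differ between PHP versions. As a rough sketch, assuming PHP 7.1 or later, the built-in token_get_all() and token_name() functions can be combined to print one readable row per token; describeTokens() is a hypothetical helper name used only for this illustration.

```php
<?php
// Hypothetical helper: describe each token of a PHP source string in one row.
function describeTokens(string $source): array
{
    $rows = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Array tokens hold: token id, the matched text, the starting line.
            [$id, $text, $line] = $token;
            $rows[] = sprintf('%-14s line %d  %s', token_name($id), $line, json_encode($text));
        } else {
            // Single characters such as ( ) { } ; come back as plain strings.
            $rows[] = sprintf('%-14s         %s', 'CHAR', json_encode($token));
        }
    }
    return $rows;
}

$source = "<?php\nfunction getIngredients() {\n    return \$ingredients;\n}\n";

echo implode("\n", describeTokens($source)), "\n";
```

json_encode() is used here only to make whitespace visible, so that a token containing a newline is printed as "\n" rather than breaking the row.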

There's a great deal of information here (180 lines are generated in total, from a 2-line function), far too much for us to read comfortably, but we can process the output and have PHPCS check it against a set of rules.
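To give a feel for what checking tokens against a rule means, here is a minimal sketch that works directly on the token_get_all() output rather than through PHPCS. It assumes PHP 7.1 or later, and findSameLineBraces() is a hypothetical function invented for this example. It reports the line of every function whose opening brace sits on the same line as the end of its argument list, which is the situation PHPCS reported on line 12 of our first class.

```php
<?php
// Hypothetical check: list the lines where a function's opening brace is on
// the same line as the closing parenthesis of its declaration.
function findSameLineBraces(string $source): array
{
    $flagged        = [];
    $line           = 1;     // line on which the next token starts
    $inDeclaration  = false; // true between "function" and its "{" or ";"
    $closeParenLine = null;  // line of the latest ")" seen in the declaration

    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            [$id, $text, $startLine] = $token;
            if ($id === T_FUNCTION) {
                $inDeclaration  = true;
                $closeParenLine = null;
            }
            // Plain-string tokens carry no line number, so track it here.
            $line = $startLine + substr_count($text, "\n");
            continue;
        }

        if ($inDeclaration === false) {
            continue;
        }

        if ($token === ')') {
            $closeParenLine = $line;
        } elseif ($token === ';') {
            // Abstract or interface method: there is no body to check.
            $inDeclaration = false;
        } elseif ($token === '{') {
            if ($closeParenLine === $line) {
                $flagged[] = $line;
            }
            $inDeclaration = false;
        }
    }

    return $flagged;
}

$sameLine = "<?php\nclass recipe\n{\n    function getIngredients() {\n        return array();\n    }\n}\n";
$newLine  = "<?php\nclass Recipe\n{\n    function getIngredients()\n    {\n        return array();\n    }\n}\n";

print_r(findSameLineBraces($sameLine)); // one entry: line 4
print_r(findSameLineBraces($newLine));  // empty array
```

A real PHPCS rule does not need the manual line tracking, because PHPCS enriches each token with extra details such as its line and column before the rules see it.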

Making the rules

To understand how we can create our own standards definitions, let's take a look at the existing standards that ship with PHPCS. These differ in location between different platforms, depending where PEAR puts them.

For me, on an Ubuntu installation, they're in /usr/share/php/PHP/CodeSniffer/Standards. All the standards extend from the base CodingStandard.php in this directory.

This defines two simple methods: getIncludedSniffs() and getExcludedSniffs(). These allow us to use existing standard definitions and simply add and remove individual sniffs to make a standard that works for us.

These 'sniffs' are atomic rules covering anything from line length to variable naming to spotting 'code smells' such as unreachable code or badly formatted loops. Each coding standard has its own 'Sniffs' directory and anything included in here is automatically part of the standard. However, each standard can also draw on the sniffs in other standards, and there is a great set of 'starter' sniffs, used in most of the standards, which are included with PHPCS in the Generic directory.

To get a feel for sniffs and how they inherit, let's take a look at a real standard. The PEAR standard is in the PEAR directory and the class file is PEARCodingStandard.php (this extension is nothing if not methodical!).

The class looks like this:

<?php
/**
 * PEAR Coding Standard.
 *
 * PHP version 5
 *
 * @category  PHP
 * @package   PHP_CodeSniffer
 * @author    Greg Sherwood <gsherwood@squiz.net>
 * @author    Marc McIntyre <mmcintyre@squiz.net>
 * @copyright 2006 Squiz Pty Ltd (ABN 77 084 670 600)
 * @license   http://matrix.squiz.net/developer/tools/php_cs/licence BSD Licence
 * @version   CVS: $Id: PEARCodingStandard.php,v 1.6 2007/08/02 23:18:31 squiz Exp $
 * @link      http://pear.php.net/package/PHP_CodeSniffer
 */
 
if (class_exists('PHP_CodeSniffer_Standards_CodingStandard', true) === false) {
    throw new PHP_CodeSniffer_Exception('Class PHP_CodeSniffer_Standards_CodingStandard not found');
}
 
/**
 * PEAR Coding Standard.
 *
 * @category  PHP
 * @package   PHP_CodeSniffer
 * @author    Greg Sherwood <gsherwood@squiz.net>
 * @author    Marc McIntyre <mmcintyre@squiz.net>
 * @copyright 2006 Squiz Pty Ltd (ABN 77 084 670 600)
 * @license   http://matrix.squiz.net/developer/tools/php_cs/licence BSD Licence
 * @version   Release: 1.1.0
 * @link      http://pear.php.net/package/PHP_CodeSniffer
 */
class PHP_CodeSniffer_Standards_PEAR_PEARCodingStandard extends PHP_CodeSniffer_Standards_CodingStandard
{
 
 
    /**
     * Return a list of external sniffs to include with this standard.
     *
     * The PEAR standard uses some generic sniffs.
     *
     * @return array
     */
    public function getIncludedSniffs()
    {
        return array(
                'Generic/Sniffs/Formatting/MultipleStatementAlignmentSniff.php',
                'Generic/Sniffs/Functions/OpeningFunctionBraceBsdAllmanSniff.php',
                'Generic/Sniffs/NamingConventions/UpperCaseConstantNameSniff.php',
                'Generic/Sniffs/PHP/LowerCaseConstantSniff.php',
                'Generic/Sniffs/PHP/DisallowShortOpenTagSniff.php',
                'Generic/Sniffs/WhiteSpace/DisallowTabIndentSniff.php',
               );
 
    }//end getIncludedSniffs()
 
 
}//end class
?>

This class is showing that there are some sniffs included from the Generic directory as well as those specific to this standard. Looking at the Sniffs for PEAR we see that they have the following:

./Classes:
ClassDeclarationSniff.php
 
./Commenting:
ClassCommentSniff.php  FileCommentSniff.php  FunctionCommentSniff.php  InlineCommentSniff.php
 
./ControlStructures:
ControlSignatureSniff.php  InlineControlStructureSniff.php
 
./Files:
IncludingFileSniff.php  LineEndingsSniff.php  LineLengthSniff.php
 
./Functions:
FunctionCallArgumentSpacingSniff.php  FunctionCallSignatureSniff.php  ValidDefaultValueSniff.php
 
./NamingConventions:
ValidClassNameSniff.php  ValidFunctionNameSniff.php  ValidVariableNameSniff.php
 
./WhiteSpace:
ScopeClosingBraceSniff.php  ScopeIndentSniff.php

These are in addition to the included Generic sniffs we saw listed earlier. There's a lot of detail in these various standards and I strongly recommend you take a look at them yourself since covering them all in any depth would make rather a long article. We will take a look at the Functions sniffs used by PEAR though as these standards are well-known and easy to understand.
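Following the same pattern, a project-specific standard might look something like the sketch below. Everything named MyProject is hypothetical, and the sketch assumes that a new standard follows the naming convention seen in the PEAR class and that sniffs from the PEAR directory can be listed by path in the same way as the Generic ones. The stand-in base class at the top exists only so that the example runs without PHPCS installed.

```php
<?php
// Stand-in for the PHPCS base class so that this sketch runs on its own.
// With PHPCS loaded, the real class exists and this block is skipped.
if (class_exists('PHP_CodeSniffer_Standards_CodingStandard') === false) {
    class PHP_CodeSniffer_Standards_CodingStandard
    {
        public function getIncludedSniffs()
        {
            return array();
        }

        public function getExcludedSniffs()
        {
            return array();
        }
    }
}

/**
 * Hypothetical MyProject standard: same-line function braces, no tabs,
 * plus two of the PEAR sniffs from the directory listing above.
 */
class PHP_CodeSniffer_Standards_MyProject_MyProjectCodingStandard extends PHP_CodeSniffer_Standards_CodingStandard
{
    /**
     * Return a list of external sniffs to include with this standard.
     *
     * @return array
     */
    public function getIncludedSniffs()
    {
        return array(
                'Generic/Sniffs/Functions/OpeningFunctionBraceKernighanRitchieSniff.php',
                'Generic/Sniffs/WhiteSpace/DisallowTabIndentSniff.php',
                'PEAR/Sniffs/Commenting/FunctionCommentSniff.php',
                'PEAR/Sniffs/NamingConventions/ValidClassNameSniff.php',
               );
    }
}

$standard = new PHP_CodeSniffer_Standards_MyProject_MyProjectCodingStandard();
print_r($standard->getIncludedSniffs());
```

By analogy with the PEAR standard, this class would presumably live in a MyProject directory alongside the other standards, with any sniffs of its own in a Sniffs subdirectory.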

Function sniffs for the PEAR standard

The PEARCodingStandard class includes the sniff Generic/Sniffs/Functions/OpeningFunctionBraceBsdAllmanSniff.php.

A closer look in this Generic/Sniffs/Functions directory reveals there is also a sniff called OpeningFunctionBraceKernighanRitchieSniff.php. This is a little piece of computer science history: there are two schools of thought on where the opening brace should go when a function is declared.

Brian Kernighan and Dennis Ritchie (authors of The C Programming Language) advocated having it on the same line as the function declaration whereas the BSD style, championed by Eric Allman (creator of sendmail), has it on the following line. PEAR famously uses the on-a-new-line style which is why the OpeningFunctionBraceBsdAllmanSniff is used in the PEAR standard.

Now we've finished our history lesson, let's dive in and take a look at this sniff. Every sniff has a process() method which takes two arguments: the file object, which holds all the tokens of the file, and the position in the token stack of the token that triggered the call.

<?php
/**
 * Generic_Sniffs_Methods_OpeningMethodBraceBsdAllmanSniff.
 *
 * PHP version 5
 *
 * @category  PHP
 * @package   PHP_CodeSniffer
 * @author    Greg Sherwood <gsherwood@squiz.net>
 * @author    Marc McIntyre <mmcintyre@squiz.net>
 * @copyright 2006 Squiz Pty Ltd (ABN 77 084 670 600)
 * @license   http://matrix.squiz.net/developer/tools/php_cs/licence BSD Licence
 * @version   CVS: $Id: OpeningFunctionBraceBsdAllmanSniff.php,v 1.8 2008/05/05 03:59:12 squiz Exp $
 * @link      http://pear.php.net/package/PHP_CodeSniffer
 */
 
/**
 * Generic_Sniffs_Functions_OpeningFunctionBraceBsdAllmanSniff.
 *
 * Checks that the opening brace of a function is on the line after the
 * function declaration.
 *
 * @category  PHP
 * @package   PHP_CodeSniffer
 * @author    Greg Sherwood <gsherwood@squiz.net>
 * @author    Marc McIntyre <mmcintyre@squiz.net>
 * @copyright 2006 Squiz Pty Ltd (ABN 77 084 670 600)
 * @license   http://matrix.squiz.net/developer/tools/php_cs/licence BSD Licence
 * @version   Release: 1.1.0
 * @link      http://pear.php.net/package/PHP_CodeSniffer
 */
class Generic_Sniffs_Functions_OpeningFunctionBraceBsdAllmanSniff implements PHP_CodeSniffer_Sniff
{
 
 
    /**
     * Registers the tokens that this sniff wants to listen for.
     *
     * @return void
     */
    public function register()
    {
        return array(T_FUNCTION);
 
    }//end register()
 
 
    /**
     * Processes this test, when one of its tokens is encountered.
     *
     * @param PHP_CodeSniffer_File $phpcsFile The file being scanned.
     * @param int                  $stackPtr  The position of the current token in the
     *                                        stack passed in $tokens.
     *
     * @return void
     */
    public function process(PHP_CodeSniffer_File $phpcsFile, $stackPtr)
    {
        $tokens = $phpcsFile->getTokens();
 
        if (isset($tokens[$stackPtr]['scope_opener']) === false) {
            return;
        }
 
        $openingBrace = $tokens[$stackPtr]['scope_opener'];
 
        // The end of the function occurs at the end of the argument list. Its
        // like this because some people like to break long function declarations
        // over multiple lines.
        $functionLine = $tokens[$tokens[$stackPtr]['parenthesis_closer']]['line'];
        $braceLine    = $tokens[$openingBrace]['line'];
 
        $lineDifference = ($braceLine - $functionLine);
 
        if ($lineDifference === 0) {
            $error = 'Opening brace should be on a new line';
            $phpcsFile->addError($error, $openingBrace);
            return;
        }
 
        if ($lineDifference > 1) {
            $ender = 'line';
            if (($lineDifference - 1) !== 1) {
                $ender .= 's';
            }
 
            $error = 'Opening brace should be on the line after the declaration; found '.($lineDifference - 1).' blank '.$ender;
            $phpcsFile->addError($error, $openingBrace);
            return;
        }
 
        // We need to actually find the first piece of content on this line,
        // as if this is a method with tokens before it (public, static etc)
        // or an if with an else before it, then we need to start the scope
        // checking from there, rather than the current token.
        $lineStart = $stackPtr;
        while (($lineStart = $phpcsFile->findPrevious(array(T_WHITESPACE), ($lineStart - 1), null, false)) !== false) {
            if (strpos($tokens[$lineStart]['content'], $phpcsFile->eolChar) !== false) {
                break;
            }
        }
 
        // We found a new line, now go forward and find the first non-whitespace
        // token.
        $lineStart = $phpcsFile->findNext(array(T_WHITESPACE), $lineStart, null, true);
 
        // The opening brace is on the correct line, now it needs to be
        // checked to be correctly indented.
        $startColumn = $tokens[$lineStart]['column'];
        $braceIndent = $tokens[$openingBrace]['column'];
 
        if ($braceIndent !== $startColumn) {
            $error = 'Opening brace indented incorrectly; expected '.($startColumn - 1).' spaces, found '.($braceIndent - 1);
            $phpcsFile->addError($error, $openingBrace);
        }
 
    }//end process()
 
 
}//end class
 
?>

Read as much or as little of the above listing as interests you. Personally I think this is a nice example of checking the brace position, formatting the error messages nicely, and also checking the indent is correct since we've figured out where everything is anyway.

This sniff handles function declarations spread over multiple lines and also checks that the brace is on the very next line rather than simply checking it exists after some whitespace. It's certainly doing some thorough checking, and much more easily and quickly than we would be able to do it by hand, either on our own code or via peer review.
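A sniff of your own follows the same two-method shape: register() names the tokens of interest and process() inspects one of them. The sketch below is a deliberately tiny, hypothetical sniff that objects to underscores in variable names. So that it can be run without PHPCS, it ships with toy stand-ins for the PHP_CodeSniffer_Sniff interface and the PHP_CodeSniffer_File class; these mimic only getTokens() and addError(), and the run() method is an invention of this example rather than part of PHPCS.

```php
<?php
// Toy stand-ins so that the sketch runs without PHPCS. They are NOT the real
// PHPCS classes and are skipped entirely when PHPCS is loaded.
if (interface_exists('PHP_CodeSniffer_Sniff') === false) {
    interface PHP_CodeSniffer_Sniff
    {
        public function register();
        public function process(PHP_CodeSniffer_File $phpcsFile, $stackPtr);
    }

    class PHP_CodeSniffer_File
    {
        public $errors = array();
        private $tokens = array();

        public function __construct($source)
        {
            $line = 1;
            foreach (token_get_all($source) as $token) {
                $code    = is_array($token) ? $token[0] : null;
                $content = is_array($token) ? $token[1] : $token;
                $this->tokens[] = array('code' => $code, 'content' => $content, 'line' => $line);
                $line += substr_count($content, "\n");
            }
        }

        public function getTokens()
        {
            return $this->tokens;
        }

        public function addError($error, $stackPtr)
        {
            $this->errors[] = $this->tokens[$stackPtr]['line'].' | ERROR | '.$error;
        }

        // Invented for this example: call the sniff for each registered token.
        public function run(PHP_CodeSniffer_Sniff $sniff)
        {
            $wanted = $sniff->register();
            foreach ($this->tokens as $stackPtr => $token) {
                if (in_array($token['code'], $wanted, true) === true) {
                    $sniff->process($this, $stackPtr);
                }
            }

            return $this->errors;
        }
    }
}

/**
 * Hypothetical sniff: reports any variable whose name contains an underscore.
 * Note that this naive version would also flag superglobals such as $_GET.
 */
class MyProject_Sniffs_NamingConventions_NoUnderscoreVariableSniff implements PHP_CodeSniffer_Sniff
{
    public function register()
    {
        return array(T_VARIABLE);
    }

    public function process(PHP_CodeSniffer_File $phpcsFile, $stackPtr)
    {
        $tokens = $phpcsFile->getTokens();
        $name   = ltrim($tokens[$stackPtr]['content'], '$');

        if (strpos($name, '_') !== false) {
            $phpcsFile->addError('Variable "'.$name.'" contains an underscore', $stackPtr);
        }
    }
}

// These two lines rely on the toy stand-ins above, not on PHPCS itself.
$file = new PHP_CodeSniffer_File("<?php\n\$prep_time = 10;\n\$name = 'soup';\n");
print_r($file->run(new MyProject_Sniffs_NamingConventions_NoUnderscoreVariableSniff()));
```

Only the sniff class would be of any use inside a real standard; the stand-ins and the last two lines are there purely to show the register/process cycle in action.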

While we're on the subject of function structure, let's also examine the PEAR-specific sniffs for functions.

They are:

  • FunctionCallArgumentSpacingSniff.php
  • FunctionCallSignatureSniff.php
  • ValidDefaultValueSniff.php

Can you guess what each of the sniffs does? They're good examples of adding continuity to your code. I won't replicate their actual contents here, but I'd definitely recommend taking a look at them, especially if you're considering writing standards of your own.

They each enforce some element of good practice when declaring functions, warning you when arguments without default values are placed after those with defaults, for example. They also look at spacing around the brackets and the arguments in function declarations (PEAR doesn't allow spaces around brackets but requires them between arguments, after the comma).
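As an illustration of those rules, the hypothetical function below is laid out the way the description above suggests: required arguments come before optional ones, there is a single space after each comma, and there are no spaces just inside the brackets. The non-conforming variants are shown only as comments, and the exact messages PHPCS would produce for them are not reproduced here.

```php
<?php
// Conforms: required argument first, one space after the comma,
// no spaces just inside the brackets.
function scaleRecipe($recipeId, $servings = 4)
{
    return array('recipe' => $recipeId, 'servings' => $servings);
}

$scaled = scaleRecipe(12, 6);
print_r($scaled);

// Layouts that the rules described above would object to:
//   function scaleRecipe($servings = 4, $recipeId)   required argument after an optional one
//   scaleRecipe( 12,6 );                             spaces inside the brackets, none after the comma
```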

Continuity like this makes code easy to read because it's laid out in a way your brain expects.

Final thoughts on coding standards

Coding standards can be seen as a timewaster, a hurdle, another piece of paperwork invented by business to keep a hard-working developer from delivering all they can.

Certainly, coding in an unfamiliar standard without the support of a tool such as PHPCS is difficult, if it's even possible. Because I work with many different codebases, belonging to clients, to open source projects, and my own, I see many different standards and have to code in several of them, sometimes all within one day!

I can't hold the detail in my head but a tool like PHPCS means I am always generating code which is useful for my collaborators on a project, whether they are clients, friends, or even myself at a later date, and I can do it without really having to think too hard about it once I've taken the time to get this tool in place.

PHPCS is most useful when it is painlessly built into our existing development lifecycle and tools. There are integrations for this tool against most editors and it can easily be added as an SVN hook or as a step in your continuous integration process. Having the barrier to usage of this tool as low as possible makes it more likely that, as developers, we'll make use of its features throughout our projects.

This module is available through PEAR and I would like to thank the authors of all the extensions there who work so hard to make their code available and maintain it for others to use, with special thanks to Greg Sherwood, who is the maintainer of the PHP_CodeSniffer package in PEAR.