Understanding the PHP-FPM status page

PHP-FPM has a very useful feature that allows you to set up a status page to view the status of a PHP-FPM pool, configurable using the pm.status_path option. Most people who have worked with PHP-FPM have probably heard of this setting, and may even have used it (some tools, such as Munin, require it for metrics).
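Enabling it is a one-line change in your pool configuration. Here’s a minimal sketch (the config file path is an assumption that varies by distribution, and your web server also needs to route the status path through to FPM):

; e.g. in /etc/php-fpm.d/www.conf (path varies by setup)
pm.status_path = /server-status

Here is a sample of the basic output: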

$ curl http://localhost/server-status
pool: default
process manager: dynamic
start time: 11/Dec/2014:17:51:33 -0500
start since: 61383
accepted conn: 4682
listen queue: 0
max listen queue: 0
listen queue len: 0
idle processes: 11
active processes: 1
total processes: 12
max active processes: 2
max children reached: 0
slow requests: 3

There is also the very useful full view, which gives you per-process information for every currently running FPM process in the pool:

$ curl http://localhost/server-status?full
pool: default
process manager: dynamic
start time: 11/Dec/2014:17:51:33 -0500
start since: 61418
accepted conn: 4687
listen queue: 0
max listen queue: 0
listen queue len: 0
idle processes: 11
active processes: 1
total processes: 12
max active processes: 2
max children reached: 0
slow requests: 3

************************
pid: 29187
state: Idle
start time: 12/Dec/2014:07:00:14 -0500
start since: 14097
requests: 112
request duration: 85
request method: GET
request URI: /watcher.php
content length: 0
user: -
script: /var/www/vhosts/www.example.com/watcher.php
last request cpu: 0.00
last request memory: 262144

This view can be somewhat difficult to understand if you’re not super familiar with PHP (and some parameters like Request Duration can be hard to understand even if you are).

Recently, I was trying to find out what the unit for the request duration is, and couldn’t find anything on Google. I dove into the C source code, and found a very handy doc comment explaining each parameter, which I’ll include here.

Pool info:

  • pool – the name of the pool
  • process manager – static, dynamic or ondemand
  • start time – the date and time FPM has started
  • start since – number of seconds since FPM has started
  • accepted conn – the number of requests accepted by the pool
  • listen queue – the number of requests in the queue of pending connections
  • max listen queue – the maximum number of requests in the queue of pending connections since FPM has started
  • listen queue len – the size of the socket queue of pending connections
  • idle processes – the number of idle processes
  • active processes – the number of active processes
  • total processes – the number of idle + active processes
  • max active processes – the maximum number of active processes since FPM has started
  • max children reached – the number of times the process limit has been reached when the PM tried to start more children (works only for pm ‘dynamic’ and ‘ondemand’)
  • slow requests – the number of requests that exceeded your request_slowlog_timeout value

Per process info:

  • pid – the PID of the process
  • state – the state of the process (Idle, Running, …)
  • start time – the date and time the process has started
  • start since – the number of seconds since the process has started
  • requests – the number of requests the process has served
  • request duration – the duration in microseconds (millionths of a second) of the current request (the unit is my own addition; the sample value of 85 above thus means 85 µs)
  • request method – the request method (GET, POST, …) (of the current request)
  • request URI – the request URI with the query string (of the current request)
  • content length – the content length of the request (only with POST) (of the current request)
  • user – the user (PHP_AUTH_USER) (or ‘-‘ if not set) (for the current request)
  • script – the main script called (or ‘-‘ if not set) (for the current request)
  • last request cpu – the %cpu of the last request consumed (it’s always 0 if the process is not in Idle state because CPU calculation is done when the request processing has terminated)
  • last request memory – the max amount of memory the last request consumed (it’s always 0 if the process is not in Idle state because memory calculation is done when the request processing has terminated)

Rant: Constants can’t be mutable

I’ve been working with Ruby lately, and a few people have told me that Ruby has constants, but that they are mutable. Recently, I was reading a new book on Ruby and came across a passage making exactly that claim.

A lot of other places mirror this (e.g. rubylearning.com). You cannot have constants if they are mutable. Mutable constants are just variables. Take a look at the definition of a constant from Wikipedia:

In computer programming, a constant is an identifier with an associated value which cannot be altered by the program during normal execution – the value is constant. This is contrasted with a variable, which is an identifier with a value that can be changed during normal execution – the value is variable. 

This is pretty basic stuff to most programmers (and people with common sense). Ruby constants are actually just variables that follow the naming convention for constants.

It’s a stretch, but you could say Ruby has weak constants (I’d prefer to just say that no, Ruby does not have constants).

If you use bash colors, check for a tty first!

Most of the command line programs I write use ANSI escape codes to output colored text (along with bold/underlined text). I find this makes the programs a lot easier to understand at a glance. For example, my test runners will output success in green, skipped in yellow, and failure in red. I can very quickly scroll through the output and spot failures.

However, if you’re going to do this, or if you already do this, you may run into problems with redirecting your program to a log file or piping it to another program. Your colors show up as ugly escape codes and make it harder to parse the log files. Pretty frequently, you only want colors if you’re outputting to an interactive terminal.

When you use colors in your command line program, you should support two features: your program should have an option to disable colors, such as --no-color, and it should automatically detect whether STDOUT is a TTY and, if not, disable colors.

Here’s a PHP example:

// Disable colors
if (in_array('--no-color', $argv) || !posix_isatty(STDOUT)) {
  $runner->color = false;
}

And here’s a Python example:

if not sys.stdout.isatty():
  color_mode = False

Since stdout won’t be a TTY if you are using redirection or piping, it’s a good thing to check before using color codes.

How Linux pipes work under the hood

Piping is one of the core concepts of Linux & Unix based operating systems. Pipes allow you to chain together commands in a very elegant way, passing output from one program to the input of another to get a desired end result.

Here’s a simple example of piping:

ls -la | sort | less

The above command gets a listing of the current directory using ls, sorts it alphabetically using the sort utility, and then paginates it for easier reading using less.

I’m not going to go into depth about pipes as I’ll assume you already know what they are (at least in the context of bash pipes). Instead, I’m going to show how pipes are implemented under the hood.

How Pipes Are Implemented

Before I explain how bash does pipes, I’ll explain how the kernel implements pipes (at a high level).

  • Linux has a VFS (virtual file system) module called pipefs that gets mounted in kernel space during boot
  • pipefs is mounted alongside the root file system (/), not in it (the root of pipefs is pipe:)
  • pipefs cannot be directly examined by the user unlike most file systems
  • The entry point to pipefs is the pipe(2) syscall
  • The pipe(2) syscall is used by shells and other programs to implement piping; it just creates a new file in pipefs and returns two file descriptors (one for the read end, opened using O_RDONLY, and one for the write end, opened using O_WRONLY); a minimal example follows this list
  • pipefs is an in-memory file system
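Here’s a minimal sketch of pipe(2) in action, independent of any shell: it creates a pipe, writes into the write end, and reads the same bytes back out of the read end.

#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fds[2]; // fds[0] = read end, fds[1] = write end

    if (pipe(fds) == -1) {
        perror("pipe");
        return 1;
    }

    // Write into one end of the pipe...
    const char *msg = "hello, pipefs\n";
    write(fds[1], msg, strlen(msg));

    // ...and read it back out of the other
    char buf[64];
    ssize_t n = read(fds[0], buf, sizeof(buf));
    if (n > 0)
        fwrite(buf, 1, (size_t)n, stdout);

    close(fds[0]);
    close(fds[1]);
    return 0;
}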

Pipe I/O, buffering, and capacity

A pipe has a limited capacity in Linux. When the pipe is full, a write(2) will block (or fail if the O_NONBLOCK flag is set). Different implementations of pipes have different limits, so applications shouldn’t rely on a pipe having a particular size; they should be designed to consume data as soon as it is available, so the writing process doesn’t block. That said, knowing the pipe size is still useful. Since Linux 2.6.35, the default pipe capacity is 65,536 bytes (it used to be the system page size, e.g. 4,096 bytes on i386).
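If you want to see the capacity on your own system, Linux exposes it through fcntl(2). Here’s a sketch; note that F_GETPIPE_SZ and F_SETPIPE_SZ are Linux-specific and only available since 2.6.35.

#define _GNU_SOURCE // required for F_GETPIPE_SZ / F_SETPIPE_SZ
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    if (pipe(fds) == -1) {
        perror("pipe");
        return 1;
    }

    // Query the current capacity of the pipe
    printf("default capacity: %d bytes\n", fcntl(fds[0], F_GETPIPE_SZ));

    // Ask the kernel for a bigger buffer; it may round the value up
    if (fcntl(fds[0], F_SETPIPE_SZ, 1048576) == -1)
        perror("F_SETPIPE_SZ");

    printf("new capacity: %d bytes\n", fcntl(fds[0], F_GETPIPE_SZ));

    close(fds[0]);
    close(fds[1]);
    return 0;
}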

When a process attempts to read from an empty pipe, read(2) will block until data is available in the pipe. If all file descriptors pointing to the write end of the pipe have been closed, reading from the pipe will return EOF (read(2) will return 0).

If a process attempts to write to a full pipe, write(2) will block until enough data has been read from the pipe to allow the write call to succeed. If all file descriptors pointing to the read end of the pipe have been closed, writing to the pipe will raise the SIGPIPE signal. If this signal is ignored, write(2) fails with the error EPIPE.
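Here’s a small sketch of the EOF behavior: the child writes some data and exits (implicitly closing its descriptors), the parent closes its own copy of the write end, and the parent’s read(2) loop then terminates by returning 0.

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    if (pipe(fds) == -1) {
        perror("pipe");
        return 1;
    }

    if (fork() == 0) {
        // Child: write some data, then exit (which closes its descriptors)
        close(fds[0]);
        write(fds[1], "some data\n", 10);
        return 0;
    }

    // Parent: close our write end, so the child's exit leaves no writers
    close(fds[1]);

    char buf[64];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof(buf))) > 0)
        fwrite(buf, 1, (size_t)n, stdout);

    // n == 0 here: every write-end descriptor is closed, so we hit EOF
    close(fds[0]);
    wait(NULL);
    return 0;
}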

All of this is important to understanding pipe performance. If process A writes data at roughly the same speed as process B reads it, pipes work very well and are highly performant. An imbalance here can cause performance problems. See the next section for more information.

How Shells Do Piping

Before continuing, you should be aware of how Linux creates new processes.

Shells implement piping in a manner very similar to how they implement redirection. Basically, the parent process calls pipe(2) once for each pair of processes that get piped together. In the example above, bash needs to call pipe(2) twice: once to pipe ls to sort, and once to pipe sort to less. Then bash forks itself once for each command (3 times in our example). Each child will run one command, but before it does, it will overwrite its stdin, its stdout, or both. In our example, it works like this:

  • bash will create two pipes, one to pipe ls to sort, and one to pipe sort to less
  • bash will fork itself 3 times, once per command (leaving the parent plus 3 children)
  • child 1 (ls) will set its stdout file descriptor to the write end of pipe A
  • child 2 (sort) will set its stdin file descriptor to the read end of pipe A (to read input from ls)
  • child 2 (sort) will set its stdout file descriptor to the write end of pipe B
  • child 3 (less) will set its stdin file descriptor to the read end of pipe B (to read input from sort)
  • each child will then exec its command

The kernel will automatically schedule the processes so they run roughly in parallel. If child 1 writes too much to pipe A before child 2 has read it, child 1 will block until child 2 has had time to read from the pipe. This normally allows for very high levels of efficiency, as one process doesn’t have to wait for the other to complete before it can start processing data. Another reason for this behavior is that pipes have a limited capacity (65,536 bytes by default on modern Linux, as noted above).

Pipe Example Code

Here is a C example of how a program like bash might implement piping. My example is pretty simple, and accepts two arguments: a directory and a string to search for. It will run ls -la to get the contents of the directory, and pipe them to grep to search for the string.

#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/wait.h>

#define READ_END 0
#define WRITE_END 1

int main(int argc, char *argv[])
{
    int pid_ls, pid_grep;
    int pipefd[2];

    // Syntax: test . filename
    if (argc < 3) {
        fprintf(stderr, "Please specify the directory to search and the filename to search for\n");
        return -1;
    }

    fprintf(stdout, "parent: Grepping %s for %s\n", argv[1], argv[2]);

    // Create an unnamed pipe
    if (pipe(pipefd) == -1) {
        fprintf(stderr, "parent: Failed to create pipe\n");
        return -1;
    }

    // Fork a process to run grep
    pid_grep = fork();

    if (pid_grep == -1) {
        fprintf(stderr, "parent: Could not fork process to run grep\n");
        return -1;
    } else if (pid_grep == 0) {
        fprintf(stdout, "child: grep child will now run\n");

        // Set fd[0] (stdin) to the read end of the pipe
        if (dup2(pipefd[READ_END], STDIN_FILENO) == -1) {
            fprintf(stderr, "child: grep dup2 failed\n");
            return -1;
        }

        // Close the pipe now that we've duplicated it
        close(pipefd[READ_END]);
        close(pipefd[WRITE_END]);

        // Setup the arguments/environment to call
        char *new_argv[] = { "/bin/grep", argv[2], 0 };
        char *envp[] = { "HOME=/", "PATH=/bin:/usr/bin", "USER=brandon", 0 };

        // Call execve(2) which will replace the executable image of this
        // process
        execve(new_argv[0], &new_argv[0], envp);

        // Execution will never continue in this process unless execve returns
        // because of an error
        fprintf(stderr, "child: Oops, grep failed!\n");
        return -1;
    }

    // Fork a process to run ls
    pid_ls = fork();

    if (pid_ls == -1) {
        fprintf(stderr, "parent: Could not fork process to run ls\n");
        return -1;
    } else if (pid_ls == 0) {
        fprintf(stdout, "child: ls child will now run\n");
        fprintf(stdout, "---------------------\n");

        // Set fd[1] (stdout) to the write end of the pipe
        if (dup2(pipefd[WRITE_END], STDOUT_FILENO) == -1) {
            fprintf(stderr, "ls dup2 failed\n");
            return -1;
        }

        // Close the pipe now that we've duplicated it
        close(pipefd[READ_END]);
        close(pipefd[WRITE_END]);

        // Setup the arguments/environment to call
        char *new_argv[] = { "/bin/ls", "-la", argv[1], 0 };
        char *envp[] = { "HOME=/", "PATH=/bin:/usr/bin", "USER=brandon", 0 };

        // Call execve(2) which will replace the executable image of this
        // process
        execve(new_argv[0], &new_argv[0], envp);

        // Execution will never continue in this process unless execve returns
        // because of an error
        fprintf(stderr, "child: Oops, ls failed!\n");
        return -1;
    }

    // Parent doesn't need the pipes
    close(pipefd[READ_END]);
    close(pipefd[WRITE_END]);

    fprintf(stdout, "parent: Parent will now wait for children to finish execution\n");

    // Wait for all children to finish
    while (wait(NULL) > 0);

    fprintf(stdout, "---------------------\n");
    fprintf(stdout, "parent: Children has finished execution, parent is done\n");

    return 0;
}

I’ve commented it thoroughly, so hopefully it makes sense.

Named vs Unnamed Pipes

In the above examples, we’ve been using unnamed/anonymous pipes. These pipes are temporary, and are discarded once your program finishes or all of their file descriptors are closed. They are the most common type of pipe.

Named pipes, also known as FIFOs (for first in, first out), get created as a named file on your hard disk. They allow multiple unrelated programs to open and use them. You can quite easily have multiple writers and a single reader, for a very simplistic client-server type design. For example, Nagios does this: the master process reads a named pipe, and every child process writes commands to the named pipe.

Named pipes are created using the mkfifo command or the mkfifo(3) library call. Example:

mkfifo ~/test_pipe

Other than their creation, they work pretty much the same as unnamed pipes. Once you create one, you can open it using open(2). You must open the read end using O_RDONLY or the write end using O_WRONLY; most operating systems implement pipes unidirectionally, so you can’t open them in read/write mode.
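Here’s a minimal writer-side sketch in C (the /tmp/test_pipe path is my own choice for the example):

#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    // Create the FIFO on disk; an error here may just mean it already exists
    if (mkfifo("/tmp/test_pipe", 0666) == -1)
        perror("mkfifo");

    // Opening the write end blocks until a reader opens the FIFO
    int fd = open("/tmp/test_pipe", O_WRONLY);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    write(fd, "hello from the writer\n", 22);
    close(fd);
    return 0;
}

Reading it can be as simple as running cat /tmp/test_pipe in another terminal; the writer’s open(2) won’t return until that reader shows up.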

FIFOs are often used as a unidirectional IPC technique, for a system with multiple processes. A multithreaded application may also use named or unnamed pipes, as well as other IPC techniques such as shared memory segments.

FIFOs are created as a single inode, with its i_pipe field set as a reference to the actual pipe. While the name exists on your filesystem, FIFOs don’t cause I/O to the underlying device: once the inode is read, they behave like unnamed pipes and operate in memory.

Analysis of a WordPress plugin exploit

This morning, I was reading ArsTechnica like I do every morning, and saw an article about how yet another popular WordPress plugin was found to have a remote execution vulnerability. The comments on the article were predictably bad and misinformed, so I decided to look into the security fix and see what caused the original issue (and how the exploit worked).

The plugin is Custom Contacts Form, which has over 670,000 downloads.

This bug is awful, catastrophic to sites that enable registration by untrusted users.

First, this bug has been in the plugin for at least 3 years (I didn’t feel like figuring out exactly when it cropped up). 3 years is a long time.

This bug allows any visitor on your blog (they don’t even need to be logged in) to download an export file of your contact form. That alone could be very catastrophic depending on your site.

More importantly, this bug allows any authenticated user on your blog (of any privilege level) to execute arbitrary SQL commands. Let that sink in for a moment.

So, how did this bug come to be?

It looks like gross incompetence combined with a possible misunderstanding of the is_admin function. See, is_admin doesn’t check whether the current user is an admin; it checks whether the current request is for the admin area (/wp-admin), which, as most WordPress users know, any user can access (it’s where the profile settings are). Even subscriber-level users can access the admin area.

So let’s take a look at the code that caused the issue:

if (!is_admin()) { /* is front */
    // ...
} else { /* is admin */
    // ...
    add_action('init', array(&$custom_contact_admin, 'adminInit'), 1);
    // ...
}

Shown above is a shortened code snippet from the main plugin file that adds a hook to execute adminInit if the user is in the WordPress admin. Now let’s look at that hook:

function adminInit() {
    $this->downloadExportFile();
    $this->downloadCSVExportFile();
    $this->runImport();
}

The above function executes a few other functions. This is already worrying based on the function names alone. I’d expect adminInit to check whether the current user has some specific capability or role first, but it doesn’t. Maybe those functions still check it themselves?

function runImport() {
    if (isset($_POST['ccf_clear_import']) || isset($_POST['ccf_merge_import'])) {
        //chmod('modules/export/', 0777);
        ccf_utils::load_module('export/custom-contact-forms-export.php');
        $transit = new CustomContactFormsExport(parent::getAdminOptionsName());
        $settings['import_general_settings'] = ($_POST['ccf_import_overwrite_settings'] == 1) ? true : false;
        $settings['import_forms'] = ($_POST['ccf_import_forms'] == 1) ? true : false;
        $settings['import_fields'] = ($_POST['ccf_import_fields'] == 1) ? true : false;
        $settings['import_field_options'] = ($_POST['ccf_import_field_options'] == 1) ? true : false;
        $settings['import_styles'] = ($_POST['ccf_import_styles'] == 1) ? true : false;
        $settings['import_saved_submissions'] = ($_POST['ccf_import_saved_submissions'] == 1) ? true : false;
        $settings['mode'] = ($_POST['ccf_clear_import']) ? 'clear_import' : 'merge_import';
        $transit->importFromFile($_FILES['import_file'], $settings);
        ccf_utils::redirect('options-general.php?page=custom-contact-forms');
    }
}

Oh….. Guess not. Just as a note, the two download functions also don’t check permissions, which is how an attacker can dump your contact entries.

Now in this runImport function, the important call is to $transit->importFromFile. It takes an uploaded file, and does something with it. Let’s take a look:

function importFromFile($file, $settings = array('mode' => 'clear_import', 'import_general_settings' => false, 'import_forms' => true,'import_fields' => true, 'import_field_options' => true, 'import_styles' => true, 'import_saved_submissions' => false)) {
    $path = CCF_BASE_PATH. 'import/';
    $file_name = basename(time() . $file['name']);
    $file_extension = pathinfo($file['name'], PATHINFO_EXTENSION);
    if ( stripos( $file_extension, 'sql' ) ) {
        unlink( $file['tmp_name'] );
        wp_die( 'You can only import .sql files.' );
    }
    // ...
}

I’ve left out the bulk of the function, as you can probably see what it’s going to do: it takes a SQL file and runs it. Since this function isn’t behind an authentication/capability/role check, that means anyone can upload any SQL file and run it….

So how would this have been avoided? A simple capability check is normally sufficient:

if ( current_user_can( 'manage_options' ) ) {
    // Is a real admin
}

You could also check to see if the user has the “administrator” role. So is this what the plugin author did to resolve the issue? Nope, he just removed the code….

So as you can see, this isn’t a security issue with WordPress, it’s just bad programming.