A Mechanism to Help Write Web Application Firewalls for Nginx

Developing a Web Application Firewall module for Nginx is not an easy task. The lack of input body filters makes it harder. Nginx is an outstanding web server, but it is not perfect. Actually, nothing is perfect.

So we added an input body filter mechanism to our own Nginx distribution, which is named Tengine. With this mechanism, processing the request body is no longer that complicated. (In standard Nginx, the request body may be buffered to a disk file, and you have to deal with up to two buffers.)

Here I have an example to demonstrate how to write an input body filter. It is a simple module to fight hash collision DoS attacks.

/*
 * Copyright (C) Joshua Zhu, http://www.zhuzhaoyuan.com
 */
 
 
#include <ngx_config.h>
#include <ngx_core.h>
#include <ngx_http.h>
 
 
typedef struct {
    ngx_flag_t                            enable;
    ngx_uint_t                            max_post_params;
} ngx_http_anti_hashdos_loc_conf_t;
 
 
typedef struct {
    ngx_uint_t                            post_params;
} ngx_http_anti_hashdos_ctx_t;
 
 
static ngx_int_t ngx_http_anti_hashdos_input_body_filter(ngx_http_request_t *r,
    ngx_buf_t *buf);
 
static void *ngx_http_anti_hashdos_create_loc_conf(ngx_conf_t *cf);
static char *ngx_http_anti_hashdos_merge_loc_conf(ngx_conf_t *cf, void *parent,
    void *child);
static ngx_int_t ngx_http_anti_hashdos_init(ngx_conf_t *cf);
 
 
static ngx_command_t ngx_http_anti_hashdos_filter_commands[] = {
 
    { ngx_string("anti_hashdos"),
      NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_FLAG,
      ngx_conf_set_flag_slot,
      NGX_HTTP_LOC_CONF_OFFSET,
      offsetof(ngx_http_anti_hashdos_loc_conf_t, enable),
      NULL },
 
    { ngx_string("anti_hashdos_max_post_params"),
      NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_TAKE1,
      ngx_conf_set_num_slot,
      NGX_HTTP_LOC_CONF_OFFSET,
      offsetof(ngx_http_anti_hashdos_loc_conf_t, max_post_params),
      NULL },
 
      ngx_null_command
};
 
 
static ngx_http_module_t ngx_http_anti_hashdos_filter_module_ctx = {
    NULL,                                 /* preconfiguration */
    ngx_http_anti_hashdos_init,           /* postconfiguration */
 
    NULL,                                 /* create main configuration */
    NULL,                                 /* init main configuration */
 
    NULL,                                 /* create server configuration */
    NULL,                                 /* merge server configuration */
 
    ngx_http_anti_hashdos_create_loc_conf,/* create location configuration */
    ngx_http_anti_hashdos_merge_loc_conf  /* merge location configuration */
};
 
 
ngx_module_t ngx_http_anti_hashdos_filter_module = {
    NGX_MODULE_V1,
    &ngx_http_anti_hashdos_filter_module_ctx, /* module context */
    ngx_http_anti_hashdos_filter_commands,/* module directives */
    NGX_HTTP_MODULE,                      /* module type */
    NULL,                                 /* init master */
    NULL,                                 /* init module */
    NULL,                                 /* init process */
    NULL,                                 /* init thread */
    NULL,                                 /* exit thread */
    NULL,                                 /* exit process */
    NULL,                                 /* exit master */
    NGX_MODULE_V1_PADDING
};
 
 
static ngx_http_input_body_filter_pt  ngx_http_next_input_body_filter;
 
 
static ngx_int_t
ngx_http_anti_hashdos_input_body_filter(ngx_http_request_t *r,
    ngx_buf_t *buf)
{
    u_char                           *p;
    ngx_http_anti_hashdos_ctx_t      *ctx;
    ngx_http_anti_hashdos_loc_conf_t *ahlf;
 
    ahlf = ngx_http_get_module_loc_conf(r, ngx_http_anti_hashdos_filter_module);
 
    if (!ahlf->enable) {
        return ngx_http_next_input_body_filter(r, buf);
    }
 
    ctx = ngx_http_get_module_ctx(r, ngx_http_anti_hashdos_filter_module);
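    if (ctx == NULL) {
        ctx = ngx_pcalloc(r->pool, sizeof(ngx_http_anti_hashdos_ctx_t));
        if (ctx == NULL) {
            return NGX_HTTP_INTERNAL_SERVER_ERROR;
        }
 
        /* Save the context so the count persists across buffers */
        ngx_http_set_ctx(r, ctx, ngx_http_anti_hashdos_filter_module);
 
        ctx->post_params = 1;
    }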
 
    for (p = buf->pos; p < buf->last; p++) {
 
        if (*p == '&') {
            ctx->post_params++;
        }
    }
 
    if (ctx->post_params > ahlf->max_post_params) {
        ngx_log_error(NGX_LOG_ERR, r->connection->log, 0,
                      "anti hashdos: \"%V\" blocked, too many post params: %ui",
                      &r->connection->addr_text,
                      ctx->post_params);
        return NGX_HTTP_BAD_REQUEST;
    }
 
    return ngx_http_next_input_body_filter(r, buf);
}
 
 
static void *
ngx_http_anti_hashdos_create_loc_conf(ngx_conf_t *cf)
{
    ngx_http_anti_hashdos_loc_conf_t *conf;
 
    conf = ngx_pcalloc(cf->pool, sizeof(ngx_http_anti_hashdos_loc_conf_t));
    if (conf == NULL) {
        return NULL;
    }
 
    conf->enable = NGX_CONF_UNSET;
    conf->max_post_params = NGX_CONF_UNSET_UINT;
 
    return conf;
}
 
 
static char *
ngx_http_anti_hashdos_merge_loc_conf(ngx_conf_t *cf, void *parent, void *child)
{
    ngx_http_anti_hashdos_loc_conf_t *prev = parent;
    ngx_http_anti_hashdos_loc_conf_t *conf = child;
 
    ngx_conf_merge_value(conf->enable, prev->enable, 0);
    ngx_conf_merge_uint_value(conf->max_post_params, prev->max_post_params,
                              120);
 
    return NGX_CONF_OK;
}
 
 
static ngx_int_t
ngx_http_anti_hashdos_init(ngx_conf_t *cf)
{
    ngx_http_next_input_body_filter = ngx_http_top_input_body_filter;
    ngx_http_top_input_body_filter = ngx_http_anti_hashdos_input_body_filter;
 
    return NGX_OK;
}

The code looks quite straightforward, right? It is similar to an output body filter and takes just a few steps:

1) Implement your own input body filter function. e.g.

static ngx_int_t
ngx_http_anti_hashdos_input_body_filter(ngx_http_request_t *r, ngx_buf_t *buf)
{
    /* Do the input body filtering here, then pass the buf on */
    return ngx_http_next_input_body_filter(r, buf);
}

2) In your input body filter function, return an HTTP error code if something is wrong. Otherwise, call ngx_http_next_input_body_filter(r, buf) directly to pass the buf to the next input body filters.

3) Install your input body filter in the postconfiguration hook function. Push your input body filter to the head of the input body filter chain. e.g.

static ngx_int_t
ngx_http_anti_hashdos_init(ngx_conf_t *cf)
{
    ngx_http_next_input_body_filter = ngx_http_top_input_body_filter;
    ngx_http_top_input_body_filter = ngx_http_anti_hashdos_input_body_filter;
 
    return NGX_OK;
}
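
If the chaining in step 3 feels abstract, here is a minimal standalone sketch of the same "save the old head, become the new head" pattern that you can compile outside Nginx. Everything in it (buf_t, input_body_filter_pt, top_input_body_filter, the two toy filters and the FILTER_* codes) is a hypothetical stand-in made up for illustration, not the real Nginx or Tengine definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ngx_buf_t; only the two pointers we need. */
typedef struct {
    const char *pos;   /* start of the data */
    const char *last;  /* one past the end of the data */
} buf_t;

typedef int (*input_body_filter_pt)(buf_t *buf);

enum { FILTER_OK = 0, FILTER_BAD_REQUEST = 400 };

/* Tail of the chain: accepts everything. */
int
terminal_filter(buf_t *buf)
{
    (void) buf;
    return FILTER_OK;
}

/* Head of the chain; every filter pushes itself in front of it. */
input_body_filter_pt top_input_body_filter = terminal_filter;

/* Each filter remembers whoever was the head before it was installed. */
input_body_filter_pt next_after_length_filter;
input_body_filter_pt next_after_nul_filter;

/* Rejects buffers longer than 16 bytes, otherwise passes them on. */
int
length_filter(buf_t *buf)
{
    if (buf->last - buf->pos > 16) {
        return FILTER_BAD_REQUEST;
    }

    return next_after_length_filter(buf);
}

/* Rejects buffers containing a NUL byte, otherwise passes them on. */
int
nul_filter(buf_t *buf)
{
    if (memchr(buf->pos, '\0', (size_t) (buf->last - buf->pos)) != NULL) {
        return FILTER_BAD_REQUEST;
    }

    return next_after_nul_filter(buf);
}

/* Same two assignments as in the init hook, once per filter. */
void
install_filters(void)
{
    next_after_length_filter = top_input_body_filter;
    top_input_body_filter = length_filter;

    next_after_nul_filter = top_input_body_filter;
    top_input_body_filter = nul_filter;
}

/* Convenience wrapper: run a byte range through the whole chain. */
int
run_chain(const char *data, size_t len)
{
    buf_t buf = { data, data + len };

    return top_input_body_filter(&buf);
}
```

After install_filters() runs, the filter installed last (nul_filter) sits at the head and is called first. A filter that returns an error code stops the chain; one that is happy simply hands the buf to its saved next pointer.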

NOTE: This is just a demonstration to show how to write input body filters. If you want to fight hash collision DoS attacks completely, you have to write more code and handle the various POST content types.
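
Since the request body can arrive in more than one buffer, the parameter count has to accumulate across calls to the filter. Here is a small standalone sketch of that idea, assuming an application/x-www-form-urlencoded body only. The names param_counter_t, param_counter_init and param_counter_feed are made up for this illustration; the struct just plays the role of the per-request context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-request state, playing the role of the module context. */
typedef struct {
    size_t params;      /* parameters seen so far */
    size_t max_params;  /* configured limit */
    int    started;     /* set once the first non-empty chunk arrives */
} param_counter_t;

void
param_counter_init(param_counter_t *pc, size_t max_params)
{
    pc->params = 0;
    pc->max_params = max_params;
    pc->started = 0;
}

/*
 * Feed one chunk of the body. Returns 0 while the running total is
 * within the limit and -1 once the limit has been exceeded.
 */
int
param_counter_feed(param_counter_t *pc, const char *data, size_t len)
{
    size_t i;

    if (len > 0 && !pc->started) {
        pc->started = 1;
        pc->params = 1;   /* a non-empty body holds at least one parameter */
    }

    /* Every '&' separates two parameters. */
    for (i = 0; i < len; i++) {
        if (data[i] == '&') {
            pc->params++;
        }
    }

    return pc->params > pc->max_params ? -1 : 0;
}
```

Feeding the chunks "a=1&b=" and "2&c=3" one after another gives a running total of 3, the same as feeding the whole body at once. If the counter were re-created for every chunk, an attacker could slip past the limit just by having the body split up.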

Download the code here:
http://www.zhuzhaoyuan.com/download/tengine/anti_hashdos.tar.gz


Tengine, a customized Nginx, goes to open source

We’re glad to announce that Tengine, our home-baked Nginx at Taobao, is now an open source project.

Taobao is the largest e-commerce website in Asia and is ranked #12 on Alexa’s top global sites list. Our website serves billions of pageviews per day. For a website as busy as ours, Nginx is obviously the best choice. Thanks to Nginx’s high performance, small footprint and flexibility, we have done more with less.

We first learned the Nginx internals by using it as a traditional web server and developing dozens of modules. Then, from June of this year, we started hacking the Nginx core to expand its capabilities. Some of the features we have developed may also benefit other Nginx users and websites, so why not open source them? We do not want to be just open source software users, but also open source contributors. That’s why the Tengine open source project came about.

Tengine is based on the latest stable version of Nginx (Nginx-1.0.10). Here are a few features and bug fixes in Tengine that you may be interested in:

  • Logging enhancement. It supports syslog (local and remote) and pipe logging. You can also do log sampling, i.e. not all requests have to be written.
  • Protects the server when the system load or memory use gets too high.
  • Combines multiple CSS or JavaScript requests into one request to reduce the download time.
  • Sets the worker process number and CPU affinities automatically. Setting Nginx’s worker_cpu_affinity is no longer a pain.
  • Enhanced limit_req module with whitelist support and more than one limit_req directive in one location.
  • More operations-engineer-friendly server information, so the host can be located easily when an error happens.
  • More command-line options. You can list all the modules compiled in and the directives supported, and even the content of the configuration file itself.
  • Sets expiration for files according to specific content types.
  • Error pages can be set back to ‘default’.

Basically, Tengine can be considered a better Nginx, or a superset of it. You can download the tarball here:
http://tengine.taobao.org/download/tengine-1.2.0.tar.gz

We want to say thank you to the Nginx team, especially to Igor. Thank you very much for your great work! We would love to donate the patches against the Nginx-1.1 branch later if you think the patches are okay.

Frankly, I’m not sure whether the features in Tengine right now can impress you guys or not. It’s the first step we are taking towards open source, after all. We have built a team working on Tengine and have quite a long to-do list. I promise you more enhancements are coming.


I’m Back

I’m so terrible at keeping my blog up to date. This blog has not been updated for 2 years. Sorry, guys.

Yet today I have good news for you, if you are an Nginx fan. We are going to open source our home-baked Nginx! This Nginx fork is named Tengine. It runs on thousands of production servers at our website, taobao.com. (For those who know nothing about Taobao: Taobao is the largest e-commerce website in Asia, and is ranked #15 on Alexa’s top global sites list.)

There are quite a few features that you may be interested in. I’ll release a more detailed announcement soon.


Nginx Internals (Slides & Video)

Last Saturday I gave the talk “Nginx Internals” in Guangzhou. Here are the presentation slides and the video of the talk.

Nginx Internals Video part 1 (in Chinese):

Nginx Internals Video part 2 (in Chinese):
UPDATE: This part is lost, sadly, and I have no backup :(

Nginx Internals Video part 3 (in Chinese):


Nginx Internals Talk in Guangzhou, China

[Image: nginx mind map]

I’m going to give a free talk on nginx’s internals next month (September 19), in Guangzhou, China.

I’ve been reading the source code of nginx for a few days. Digging into this charming code is really a pleasant experience, though at first glance it appeared a little bit difficult to understand. Nginx is becoming more and more popular, but unfortunately there is not enough documentation on its architecture and implementation. Now that I have spent a considerable amount of time reading the source code and have gained some knowledge, why not share it with those who want to know things under the hood?

So, if you are interested in this talk and you can be in Guangzhou that day, feel free to join in. Please comment on this post or drop me an email to let me know which parts you are interested in (see the mind map above, draft version though).

There might be a thousand Hamlets in a thousand people’s eyes. Note that I’m not Igor, and the only way I can try to understand the nuts and bolts is by reverse engineering the code, hence I can’t guarantee that my talk will be free of mistakes or misunderstandings. And frankly, it is not a trivial topic after all, not only because of the size of nginx’s code base, but also because of its elaborate design.

The talk will be in Chinese, while the slides will be in English. Specifics of time and location are coming soon. Stay tuned.

Update:
Time: 14:30-17:30, September 19, 2009
Location: Netease Building Tower E, Guangzhou Information Port #16 Keyun RD. Tianhe District, Guangzhou
Registration: http://blog.laiyonghao.com/2009/09/programming-tech-party/370

