Analytics Architecture

Overview

Analytics data is gathered on each request made to API Umbrella and logged to a database. The basic flow of how analytics data gets logged is:

[nginx] => [rsyslog] => [storage database]

To explain each step:

  • nginx logs individual request data in JSON format to a local rsyslog server over a TCP socket (using lua-resty-logger-socket); a rough sketch of this step follows this list.
  • rsyslog sits in the middle for two primary purposes:
    • It buffers the data locally so that if the analytics server is down or requests are coming in too quickly for the database to handle, the data can be queued.
    • It can transform the data and send it to multiple different endpoints.
  • The storage database stores the raw analytics data for further querying or processing.
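
To make the first hop concrete, here is a rough Python sketch of what the nginx logging step amounts to. In reality this happens inside nginx via lua-resty-logger-socket, and the field names, values, and port shown here are illustrative only:

    import json
    import socket

    # One JSON document per request; the field names are hypothetical
    # stand-ins for the real analytics fields.
    record = {
        "request_at": "2016-01-01T00:00:00Z",
        "request_method": "GET",
        "request_url_host": "example.com",
        "request_url_path": "/api/widgets.json",
        "response_status": 200,
    }

    # Deliver the serialized record to the local rsyslog TCP input
    # (the port is an assumption).
    sock = socket.create_connection(("127.0.0.1", 514))
    sock.sendall((json.dumps(record) + "\n").encode("utf-8"))
    sock.close()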

API Umbrella supports different analytics databases:

Elasticsearch

Suitable for small to medium amounts of historical analytics data. (TODO: Provide more definitive guidance on what small/medium/large amounts are)

Ingest

Data is logged directly to Elasticsearch from rsyslog:

[nginx] ====> [rsyslog] ====> [Elasticsearch]
        JSON            JSON
  • rsyslog buffers and sends data to Elasticsearch using the Elasticsearch Bulk API (sketched below).
  • rsyslog's omelasticsearch output module is used.
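
As a hedged illustration of what the omelasticsearch step produces, here is a Python sketch of an Elasticsearch Bulk API request. The index name, type name, document fields, and host are assumptions, and the real module batches many more messages at a time:

    import json

    import requests

    # Hypothetical analytics documents to be indexed.
    docs = [
        {"request_url_host": "example.com", "response_status": 200},
        {"request_url_host": "example.com", "response_status": 404},
    ]

    # The Bulk API body alternates an action line and a document line
    # (newline-delimited JSON). The "_type" applies to the Elasticsearch
    # versions of this era.
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": "api-umbrella-logs", "_type": "log"}}))
        lines.append(json.dumps(doc))
    body = "\n".join(lines) + "\n"

    requests.post("http://localhost:9200/_bulk", data=body)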

Querying

The analytic APIs in the web application directly query Elasticsearch:

[api-umbrella-web-app] => [Elasticsearch]
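
A hedged sketch of the kind of query this involves: the aggregation below counts hits per day for one API host. The index name, field names, and aggregation shape are illustrative, not the web app's actual queries:

    import requests

    query = {
        "query": {"term": {"request_url_host": "example.com"}},
        "aggs": {
            "hits_over_time": {
                "date_histogram": {"field": "request_at", "interval": "day"}
            }
        },
        "size": 0,
    }

    resp = requests.post(
        "http://localhost:9200/api-umbrella-logs/_search", json=query
    )
    for bucket in resp.json()["aggregations"]["hits_over_time"]["buckets"]:
        print(bucket["key_as_string"], bucket["doc_count"])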

PostgreSQL

TODO: The PostgreSQL adapter doesn't currently exist, but the idea is to leverage the same SQL framework built for Kylin.

Suitable for small amounts of historical analytics data, or small to medium amounts of data with a columnar storage extension. (TODO: Provide more definitive guidance on what small/medium/large amounts are)

Ingest

Data is logged directly to PostgreSQL from rsyslog:

[nginx] ====> [rsyslog] ====> [PostgreSQL]
        JSON            SQL
  • rsyslog buffers and sends data to PostgreSQL as individual inserts (sketched after this list).
  • rsyslog's ompgsql output module is used.
  • If rsyslog supports batched transactions in the future, we should switch to that: rsyslog#895
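
Here is a minimal Python sketch of what ompgsql's current behavior amounts to: one INSERT (and commit) per log message, with no batching. The table and column names are assumptions:

    import psycopg2

    conn = psycopg2.connect("dbname=api_umbrella")
    cur = conn.cursor()

    messages = [
        ("2016-01-01T00:00:00Z", "example.com", 200),
        ("2016-01-01T00:00:01Z", "example.com", 404),
    ]

    for request_at, host, status in messages:
        # Each message is its own statement and transaction, which is
        # why batched transaction support (rsyslog#895) would help.
        cur.execute(
            "INSERT INTO logs (request_at, request_url_host, response_status)"
            " VALUES (%s, %s, %s)",
            (request_at, host, status),
        )
        conn.commit()

    cur.close()
    conn.close()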

Querying

The analytic APIs in the web application directly query PostgreSQL:

[api-umbrella-web-app] ===> [PostgreSQL]
                       SQL
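
A hedged example of the kind of SQL involved, counting daily hits for one API host (table and column names are illustrative):

    import psycopg2

    conn = psycopg2.connect("dbname=api_umbrella")
    cur = conn.cursor()
    cur.execute(
        """
        SELECT date_trunc('day', request_at) AS day, COUNT(*) AS hits
        FROM logs
        WHERE request_url_host = %s
        GROUP BY day
        ORDER BY day
        """,
        ("example.com",),
    )
    for day, hits in cur.fetchall():
        print(day, hits)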

PostgreSQL: Columnar Storage

For better analytics performance with larger volumes of analytics data, you can continue to use the PostgreSQL adapter, but with a compatible column-based storage variant.

When a columnar variant is used, the SQL table design and ingest process remain the same; only the underlying table storage changes for better analytic query performance.

Kylin

Suitable for large amounts of historical analytics data. (TODO: Provide more definitive guidance on what small/medium/large amounts are)

This is the most complicated setup, but it allows for vastly improved querying performance when dealing with large amounts of historical data. This is achieved by using Kylin to pre-compute common aggregate totals, so less hardware is needed to quickly answer analytics queries over large amounts of data. Under this approach, analytics data may not be immediately available for querying, since additional processing is required.
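
As a rough conceptual illustration (the dimension names come from the cube list in the Querying section below; the totals are made up), Kylin effectively maintains a lookup table of pre-computed totals keyed by dimension combinations, so an aggregate query becomes a small lookup instead of a scan over every raw log row:

    # Pre-computed totals keyed by (timestamp_tz_date, request_url_host).
    cube = {
        ("2016-01-01", "example.com"): {"count": 152000},
        ("2016-01-02", "example.com"): {"count": 149500},
    }

    # A query like:
    #   SELECT timestamp_tz_date, COUNT(*)
    #   FROM logs
    #   WHERE request_url_host = 'example.com'
    #   GROUP BY timestamp_tz_date
    # can be answered directly from the pre-computed rows:
    for (day, host), measures in cube.items():
        if host == "example.com":
            print(day, measures["count"])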

Ingest

During ingest, there are several concurrent processes that play a role:

[nginx] ====> [rsyslog] ====> [Kafka] ====> [Flume] ====> [HDFS - JSON (temp)]
        JSON            JSON          JSON          JSON
[HDFS - JSON (temp)] => [API Umbrella Live Processor] => [Hive - ORC]
[Hive - ORC] => [API Umbrella Kylin Refresher] => [Kylin]
  • rsyslog buffers and sends JSON messages to Kafka using the omkafka output module.
    • Kafka is used as an intermediate step because it provides a reliable way to deliver messages, in order, to Flume, but primarily because Kafka is what Kylin's future streaming feature will require (so it seemed worth getting it in place now).
  • Flume takes messages off the Kafka queue and appends them to a gzipped JSON file stored inside Hadoop (HDFS).
    • The JSON files are flushed to HDFS every 15 seconds, and new files are created for each minute.
    • The per-minute JSON files are partitioned by the request timestamp and not the timestamp of when Flume is processing the message. This means Flume could be writing to a file from previous minutes if it's catching up with a backlog of data.
    • Kafka's strong in-order handling of messages should ensure that the per-minute JSON files are written in order, and skipping between minutes should be unlikely (although it is possible if an nginx server's clock is severely skewed, or if an nginx server goes offline but still has queued messages that get sent when it rejoins later).
    • Flume plays a very similar role to rsyslog, but we use it because it has the best integration with the Hadoop ecosystem for writing to HDFS (I ran into multiple issues with rsyslog's native omhdfs and omhttpfs modules).
  • The API Umbrella Live Processor task watches for per-minute JSON files that haven't been touched in more than 1 minute, and then copies their data into the ORC-backed Hive table for permanent storage and querying.
    • The live data should usually make its way to the permanent ORC storage within 2-3 minutes.
    • The ORC data is partitioned by day.
    • The data is converted from JSON to ORC using a Hive SQL command (see the sketch after this list). Each minute of data is appended as a new ORC file within the day's partition, which Hive simply treats as a single daily partition within the overall logs table.
    • Since the data is only ever appended, the same minute must not be processed twice (that would duplicate records), which is why we wait until a JSON file has seen no writing activity for a minute before converting it to ORC.
    • The ORC file format gives much better compression and querying performance than storing everything in JSON.
    • If a new ORC file is created for a new day, the partition will be added to the Hive table.
    • At the end of each day, the task overwrites the daily ORC data with a new, compacted file built from the original JSON data. Writing the full day at once provides better querying performance than the many per-minute ORC files, and basing the daily file on the original JSON data alleviates any rare edge cases where the per-minute appender missed data.
    • Old per-minute JSON data is automatically removed once it's no longer needed.
  • The API Umbrella Kylin Refresher task is responsible for triggering Kylin builds to update the pre-aggregated data.
    • At the end of each day, after writing the compacted ORC file for the full day, we then trigger a Kylin build for the most recent day's data.
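
To make the JSON-to-ORC step concrete, here is a minimal Python sketch of the kind of Hive SQL command the Live Processor issues, assuming a JSON-backed staging table (logs_json) and an ORC-backed table (logs_orc) partitioned by day. All table, column, and partition names are assumptions, and the real task does more bookkeeping:

    from pyhive import hive

    conn = hive.connect(host="localhost", port=10000)
    cur = conn.cursor()

    # Append one minute of data into the day's partition; each INSERT
    # INTO adds a new ORC file within the partition.
    cur.execute(
        """
        INSERT INTO TABLE logs_orc
        PARTITION (timestamp_tz_date = '2016-01-01')
        SELECT request_at, request_url_host, response_status
        FROM logs_json
        WHERE request_at >= '2016-01-01 00:00:00'
          AND request_at < '2016-01-01 00:01:00'
        """
    )

    # The end-of-day compaction would use INSERT OVERWRITE TABLE instead,
    # replacing the many per-minute files with a single compacted file.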

This setup is unfortunately complicated, with several moving pieces. However, several things could potentially simplify it quite a bit in the future:

  • Kylin Streaming: This would eliminate our need to constantly refresh Kylin throughout the day, and reduce the time it takes for live data to become available in Kylin's pre-aggregated results. This feature is available as a prototype in Kylin 1.5, but we're still on 1.2, so we'll be waiting for it to stabilize and for more documentation to come out. Basically, it should act as just another consumer of the Kafka queue, and it would then handle all the details of getting the data into Kylin.
  • Flume Hive Sink: Even with Kylin streaming support, we will likely still need our own way to get the live data into the ORC-backed Hive table. Flume's Hive Sink offers a way to push data from Flume directly into an ORC table. It's currently marked as a preview feature, and I ran into memory growth and instability issues in my attempts to use it, but if it proves stable in the future, it could be a much easier path to populating the ORC tables directly and would eliminate the need for the temporary JSON files (along with the edge cases those bring).
  • Kylin Hour Partitioning: A possible shorter-term improvement while waiting for Kylin streaming is the ability to refresh Kylin by hour partitions, which would be more efficient than the full-day refreshes we currently use. This is implemented in Kylin 1.5.0, but we first need to upgrade to 1.5 (we're holding back at 1.2 due to some other issues), and KYLIN-1513 would ideally be fixed before we rely on it.

Querying

The analytic APIs in the web application query Kylin or PrestoDB using SQL statements:

                            /==> [Kylin] ====> [HBase Aggregates]
                           /
[api-umbrella-web-app] ===>
                       SQL \
                            \==> [PrestoDB] => [Hive ORC Tables]
  • Queries are attempted against Kylin first, since Kylin will provide the fastest answers from its pre-computed aggregates.
    • Kylin will be unable to answer the query if the query involves dimensions that have not been pre-computed.
    • We've attempted to design the Kylin cubes with the dimensions that are involved in the most common queries. These are currently:
      • timestamp_tz_year
      • timestamp_tz_month
      • timestamp_tz_week
      • timestamp_tz_date
      • timestamp_tz_hour
      • request_url_host
      • request_url_path_level1
      • request_url_path_level2
      • request_url_path_level3
      • request_url_path_level4
      • request_url_path_level5
      • request_url_path_level6
      • user_id
      • request_ip
      • response_status
      • denied_reason
      • request_method
      • request_ip_country
      • request_ip_region
      • request_ip_city
    • We don't add all the columns/dimensions to the Kylin cubes, since each additional dimension exponentially increases the amount of data Kylin has to pre-compute (which can significantly increase processing time and storage).
    • Data must be processed into Kylin for it to be part of Kylin's results, so the results will typically lag 30-60 minutes behind live data.
  • If Kylin fails for any reason (the query involves a column we haven't pre-computed, or Kylin is down), then we perform the same query against PrestoDB, which queries the underlying ORC tables stored in Hive (the same raw data Kylin builds its cubes from). A sketch of this fallback flow appears after this list.
    • PrestoDB is used to provide an ANSI SQL layer on top of Hive. This should provide better compatibility with the SQL queries we're sending to Kylin, since both Kylin and PrestoDB aim for ANSI SQL compatibility (unlike Hive's HiveQL, which is only SQL-like).
    • PrestoDB also offers better performance for our SQL queries (significantly better in some cases) than querying Hive directly, and it has been fairly well optimized for querying ORC tables.
    • Queries answered by PrestoDB will be slower than queries answered by Kylin. Query times vary primarily depending on how much data is being queried, but response times may range from 5 seconds to multiple minutes.
    • Data must be processed into the ORC-backed Hive table for it to be part of PrestoDB's results, so results will typically lag 2-3 minutes behind live data (and therefore differ from Kylin results).
    • TODO: Currently there's a 60 second timeout on PrestoDB queries to prevent long-running queries from piling up and hogging resources. However, if we find that people need to run longer-running queries, we can adjust this. We'll also need to adjust the default 60 second proxy timeouts.
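
A rough Python sketch of this Kylin-first, PrestoDB-fallback flow, using Kylin's REST query API and a Presto client from PyHive. The hostnames, ports, project, table, and column names are assumptions (and authentication is omitted); the real web application implements this in its own stack:

    import requests
    from pyhive import presto

    SQL = """
    SELECT timestamp_tz_date, COUNT(*) AS hits
    FROM logs
    WHERE request_url_host = 'example.com'
    GROUP BY timestamp_tz_date
    """

    def query_kylin(sql):
        # Kylin answers from its pre-computed HBase aggregates.
        resp = requests.post(
            "http://localhost:7070/kylin/api/query",
            json={"sql": sql, "project": "api_umbrella"},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()["results"]

    def query_presto(sql):
        # PrestoDB scans the ORC-backed Hive tables directly.
        cur = presto.connect(host="localhost", port=8080).cursor()
        cur.execute(sql)
        return cur.fetchall()

    try:
        rows = query_kylin(SQL)
    except Exception:
        # Kylin is down, or the query uses a dimension that isn't in
        # the cube, so fall back to PrestoDB.
        rows = query_presto(SQL)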