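I have written a lot about 504 errors; today let's look at 502, which means Bad Gateway. Searching the Nginx error log by session ID turned up a large number of entries with frequent 502s. That seemed odd, and I wondered whether nginx was retrying on its own.
Let's pull up the configuration once more: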
location / {
    proxy_set_header Host api.test.com;
    proxy_ssl_server_name on;
    proxy_read_timeout 10;
    proxy_connect_timeout 5;
    proxy_buffering off;
    proxy_redirect off;
    proxy_pass https://api.test.com;
    proxy_intercept_errors on;
    error_page 504 = @retry;
}
location @retry {
    proxy_ssl_server_name on;
    proxy_read_timeout 30;
    proxy_connect_timeout 5;
    proxy_buffering off;
    proxy_redirect off;
    proxy_pass https://api.test.com;
    proxy_next_upstream off;
}
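proxy_pass proxies directly to a third-party service, with no explicit upstream block defined. To rule out retries, I added the proxy_next_upstream directive.
The documentation defines this directive as follows: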
Specifies in which cases a request should be passed to the next server
In other words, it decides in which cases the proxy hands the request over to the next server. The default is:
proxy_next_upstream error timeout
That is, a retry is triggered on a network error or a timeout. To avoid retries, I deliberately added this to the configuration above:
proxy_next_upstream off;
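This turns retries off. But after applying the configuration and restarting, the problem was still there. Only later did I realize why: the caller (Go) uses keep-alive connections, so multiple requests share the same session ID. In theory, then, nginx was not retrying on 502 at all. How could I verify this conclusion further? I confirmed it through the logs and the source code.
First, the logs: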
2023/11/07 14:41:53 [error] 28863#28863: *80920 no live upstreams while connecting to upstream, client: 121.11.2.217, server: _, request: "POST /v1/ HTTP/1.1", upstream: "https://api.test.com/v1/", host: "47.251.21.133"
2023/11/07 14:41:53 [debug] 28863#28863: *80920 http next upstream, 40000000
2023/11/07 14:41:53 [debug] 28863#28863: *80920 finalize http upstream request: 502
2023/11/07 14:41:53 [debug] 28863#28863: *80920 finalize http proxy request
2023/11/07 14:41:53 [debug] 28863#28863: *80920 http finalize request: 502, "/v1/?" a:1, c:3
2023/11/07 14:41:53 [debug] 28863#28863: *80920 http special response: 502, "/v1/?"
2023/11/07 14:41:53 [debug] 28863#28863: *80920 HTTP/1.1 502 Bad Gateway^M
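To check whether repeated 502 entries belong to one connection or to several, it helps to pull the connection number out of each log line. The sketch below assumes the session ID discussed here is the number after `*` in the log prefix (level, pid#tid, then `*N`), as in the lines above. The `log_prefix` struct and `parse_log_prefix` function are hypothetical names made up for this illustration.

```c
#include <stdio.h>

/* Fields taken from the fixed prefix of an nginx error-log line. */
typedef struct {
    char          level[16];   /* "error", "debug", ... */
    int           pid;
    int           tid;
    unsigned long conn;        /* the number after '*' */
} log_prefix;

/* Returns 1 when the line carries the expected prefix, 0 otherwise. */
int parse_log_prefix(const char *line, log_prefix *out)
{
    /* Skip date and time, then read "[level] pid#tid: *conn". */
    int n = sscanf(line, "%*s %*s [%15[^]]] %d#%d: *%lu",
                   out->level, &out->pid, &out->tid, &out->conn);

    return n == 4;
}
```

Feeding every line of the error log through `parse_log_prefix` and grouping by `conn` would show whether many 502s share one long-lived connection, which is what a keep-alive client produces.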
Next, the source code, in ngx_http_upstream.c:
if (rc == NGX_BUSY) {
    ngx_log_error(NGX_LOG_ERR, r->connection->log, 0, "no live upstreams");
    ngx_http_upstream_next(r, u, NGX_HTTP_UPSTREAM_FT_NOLIVE);
    return;
}
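The "no live upstreams" message is logged, then ngx_http_upstream_next is called with NGX_HTTP_UPSTREAM_FT_NOLIVE. That constant is the 40000000 seen in the log, which is printed in hexadecimal (0x40000000).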
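The failure type is a bit flag, and the debug line prints it with nginx's `%xi` format, which is why it shows up as bare hex digits. A minimal sketch of that formatting follows. The `FT_*` values are assumed to mirror the `NGX_HTTP_UPSTREAM_FT_*` constants in ngx_http_upstream.h and should be checked against your nginx version; `format_ft_type` is a hypothetical helper.

```c
#include <stdio.h>

/* Assumed to mirror NGX_HTTP_UPSTREAM_FT_* in ngx_http_upstream.h. */
#define FT_ERROR    0x00000002u
#define FT_TIMEOUT  0x00000004u
#define FT_NOLIVE   0x40000000u

/* Formats a failure type as lowercase hex without a "0x" prefix. */
void format_ft_type(unsigned int ft_type, char *buf, size_t len)
{
    snprintf(buf, len, "%x", ft_type);
}
```

Formatting `FT_NOLIVE` this way yields the same digits as the "http next upstream, 40000000" debug line.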
static void
ngx_http_upstream_next(ngx_http_request_t *r, ngx_http_upstream_t *u,
    ngx_uint_t ft_type)
{
    ngx_msec_t  timeout;
    ngx_uint_t  status, state;

    ngx_log_debug1(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
                   "http next upstream, %xi", ft_type);
    /* ... */
    switch (ft_type) {
    /* ... other cases omitted ... */
    default:
        status = NGX_HTTP_BAD_GATEWAY;
    }
    /* ... */
    ngx_http_upstream_finalize_request(r, u, status);
}
ngx_http_upstream_next logs "http next upstream" with ft_type set to NGX_HTTP_UPSTREAM_FT_NOLIVE. That value falls through to the default branch of the switch, so status becomes NGX_HTTP_BAD_GATEWAY, and ngx_http_upstream_finalize_request is then called.
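The mapping from failure type to response status can be modeled as a small standalone function. This is a simplified sketch, not nginx's code: the excerpt above shows only the default branch, so the other cases and all `FT_*` values are assumptions based on my reading of the nginx source and may differ between versions. `status_for_failure` is a hypothetical name.

```c
/* Hypothetical stand-ins for the NGX_HTTP_UPSTREAM_FT_* flags (assumed values). */
enum {
    FT_TIMEOUT  = 0x00000004,
    FT_HTTP_500 = 0x00000010,
    FT_HTTP_504 = 0x00000080,
    FT_HTTP_403 = 0x00000100,
    FT_HTTP_404 = 0x00000200,
    FT_NOLIVE   = 0x40000000
};

/* Returns the HTTP status sent to the client for a given failure type. */
int status_for_failure(unsigned int ft_type)
{
    switch (ft_type) {
    case FT_TIMEOUT:
    case FT_HTTP_504:
        return 504;
    case FT_HTTP_500:
        return 500;
    case FT_HTTP_403:
        return 403;
    case FT_HTTP_404:
        return 404;
    default:
        /* FT_NOLIVE has no case of its own, so it ends up here. */
        return 502;
    }
}
```

Under these assumptions a timeout maps to 504, which the error_page rule in the configuration would hand to @retry, while "no live upstreams" maps to 502 and goes straight back to the client.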
static void
ngx_http_upstream_finalize_request(ngx_http_request_t *r,
    ngx_http_upstream_t *u, ngx_int_t rc)
{
    ngx_uint_t  flush;

    ngx_log_debug1(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
                   "finalize http upstream request: %i", rc);

    if (u->cleanup == NULL) {
        /* the request was already finalized */
        ngx_http_finalize_request(r, NGX_DONE);
        return;
    }
    /* ... */
}
ngx_http_upstream_finalize_request logs "finalize http upstream request", terminates the upstream request, and ends by calling ngx_http_finalize_request.
void
ngx_http_finalize_request(ngx_http_request_t *r, ngx_int_t rc)
{
    ngx_connection_t          *c;
    ngx_http_request_t        *pr;
    ngx_http_core_loc_conf_t  *clcf;

    c = r->connection;

    ngx_log_debug5(NGX_LOG_DEBUG_HTTP, c->log, 0,
                   "http finalize request: %i, \"%V?%V\" a:%d, c:%d",
                   rc, &r->uri, &r->args, r == c->data, r->main->count);
    /* ... */
}
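ngx_http_finalize_request finally logs "http finalize request" and terminates the HTTP request.
So where does the "finalize http proxy request" line in the log come from? ngx_http_proxy_module.c installs a handler that registers a finalize callback, and that callback runs when the upstream request ends.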
static char *
ngx_http_proxy_pass(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
    /* ... */
    clcf->handler = ngx_http_proxy_handler;
    /* ... */
}

static ngx_int_t
ngx_http_proxy_handler(ngx_http_request_t *r)
{
    /* ... */
    u->finalize_request = ngx_http_proxy_finalize_request;
    /* ... */
}

static void
ngx_http_proxy_finalize_request(ngx_http_request_t *r, ngx_int_t rc)
{
    ngx_log_debug0(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
                   "finalize http proxy request");
    return;
}
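The whole path can be replayed with a small model that records the debug messages in the order they appear in the log. This is only a sketch under assumptions: `upstream_t`, `append`, `proxy_finalize`, `upstream_finalize`, and `simulate_no_live` are hypothetical stand-ins, and the point where the upstream code invokes the module's finalize_request callback is inferred from the log order rather than shown in the excerpts above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for ngx_http_upstream_t: only the finalize callback. */
typedef struct {
    void (*finalize_request)(char *trace, size_t len, int rc);
} upstream_t;

/* Appends msg to trace without overflowing the len-byte buffer. */
static void append(char *trace, size_t len, const char *msg)
{
    strncat(trace, msg, len - strlen(trace) - 1);
}

/* Plays the role of ngx_http_proxy_finalize_request. */
static void proxy_finalize(char *trace, size_t len, int rc)
{
    (void) rc;
    append(trace, len, "finalize http proxy request\n");
}

/* Plays the role of ngx_http_upstream_finalize_request. */
static void upstream_finalize(upstream_t *u, char *trace, size_t len, int rc)
{
    char line[64];

    snprintf(line, sizeof(line), "finalize http upstream request: %d\n", rc);
    append(trace, len, line);

    u->finalize_request(trace, len, rc);   /* module callback (assumed call site) */

    /* Plays the role of ngx_http_finalize_request. */
    snprintf(line, sizeof(line), "http finalize request: %d\n", rc);
    append(trace, len, line);
}

/* Replays the "no live upstreams" path: no retry, straight to 502. */
void simulate_no_live(char *trace, size_t len)
{
    upstream_t u = { proxy_finalize };

    trace[0] = '\0';
    append(trace, len, "http next upstream, 40000000\n");
    upstream_finalize(&u, trace, len, 502);
}
```

Calling `simulate_no_live` with a buffer of a few hundred bytes produces the four messages in the same order as the debug log, with no second upstream attempt in between.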
This post confirms that nginx does not retry when it hits a 502 here, but it has not found the root cause of the 502s, let alone fixed them. Stay tuned.




