posix_memalign(16, 16384) failed (12: Cannot allocate memory)
JohnCarne
2016-09-23 09:20:56 UTC
Hello,

[***@web1 ~]# nginx -v
nginx version: nginx/1.11.4

It has now been 13 days, and we keep seeing the following error appear
unexpectedly in the nginx logs. It causes nginx to reload, which slows down
the server: posix_memalign(16, 16384) failed (12: Cannot allocate memory)

This started after we upgraded to the latest nginx version through nDeploy.

I called in an nginx sysadmin, the nDeploy sysadmin too, and finally CloudLinux
support, who did an incredible job investigating the issue over 7 days by
enabling multiple kernel debug tools to find out what is going on.

All nginx/Linux settings have been tweaked and verified. The issue can't be
solved, and about 5 people have racked their brains over it without being able
to solve it. We know all the basics, even the advanced material, and experts
were involved.

CloudLinux support says this is the cause, and that we need an nginx expert to
find out why nginx behaves like this:

From the information we collected it appears that nginx is really changing
its ulimits:
# grep nginx /home/abackupnomem3.log | tail
nginx-792752 [009] 5438179.898678: setrlimit: (sys_setrlimit+0x63/0x70

Conclusion is that nginx manages those rlimits. This is not a solution, but a
pointer to where you should dig further.


This was added to /etc/init.d/nginx: ulimit -q unlimited

start() {
    echo -n $"Starting $prog: "
    ulimit -n 64000
    ulimit -q unlimited
    daemon --pidfile=${pidfile} ${nginx} -c ${conffile}
    RETVAL=$?
    echo
    [ $RETVAL = 0 ] && touch ${lockfile}
    return $RETVAL
}
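
One way to check which limits a running nginx process actually ended up with
is to read /proc/<pid>/limits. Below is a minimal Python sketch, assuming a
Linux /proc filesystem. The helper names parse_limits and read_limits and the
sample text are illustrative only; they are not part of nginx or CloudLinux.

```python
import re

# Illustrative excerpt in the layout of /proc/<pid>/limits (values are made up).
SAMPLE = """\
Limit                     Soft Limit           Hard Limit           Units
Max open files            64000                64000                files
Max address space         unlimited            unlimited            bytes
Max msgqueue size         unlimited            unlimited            bytes
"""


def parse_limits(text):
    """Parse the contents of /proc/<pid>/limits into {name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Columns are padded with runs of spaces; limit names themselves
        # contain only single spaces, so split on two or more.
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) >= 3:
            limits[parts[0]] = (parts[1], parts[2])
    return limits


def read_limits(pid):
    """Read and parse the limits of a live process (hypothetical helper)."""
    with open(f"/proc/{pid}/limits") as fh:
        return parse_limits(fh.read())
```

On the server, read_limits() with the PID of an nginx worker would show the
effective values. "Max msgqueue size" is the limit that ulimit -q changes,
while "Max address space" and "Max data size" are the per-process limits that
can make an allocation fail with error 12.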


Does anyone have a clue?

Posted at Nginx Forum: https://forum.nginx.org/read.php?2,269787,269787#msg-269787
JohnCarne
2016-09-23 10:42:18 UTC
Here is my config at the server level:

#Core Functionality

user nobody;
worker_processes auto;
worker_rlimit_nofile 50000;
thread_pool iopool threads=32 max_queue=65536;
pid /var/run/nginx.pid;
error_log /var/log/nginx/error_log;
#error_log /home/abackup/debug.log debug;

#Load Dynamic Modules
#include /etc/nginx/conf.d/dynamic_modules_custom.conf;

events {
worker_connections 2048;
use epoll;
multi_accept on;
accept_mutex off;
}

#Settings For other core modules like for example the stream module
include /etc/nginx/conf.d/main_custom_include.conf;

#Settings for the http core module
include /etc/nginx/conf.d/http_settings.conf;

***************************

http {

sendfile on;
sendfile_max_chunk 512k;
aio threads=iopool;
directio 50m; #Serve Large files like media files using directio

tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 60;
keepalive_disable msie6 safari;
types_hash_max_size 2048;

server_tokens off;
client_max_body_size 128m;
client_body_buffer_size 256k;
map_hash_bucket_size 128;
map_hash_max_size 2048;

#Tweak timeout settings below in case of a DOS attack
client_header_timeout 1m;
client_body_timeout 1m;
reset_timedout_connection on;

connection_pool_size 512;
client_header_buffer_size 4k;
large_client_header_buffers 4 32k;
request_pool_size 8k;
output_buffers 4 32k;
postpone_output 1460;

#FastCGI
fastcgi_buffers 16 16k;
fastcgi_buffer_size 32k;
# the below options depend on theoretical maximum of your PHP script run-time
fastcgi_read_timeout 300;
fastcgi_send_timeout 300;

server_names_hash_max_size 256000;
server_names_hash_bucket_size 2048;
include /etc/nginx/mime.types;
default_type application/octet-stream;

ssl_protocols TLSv1.2 TLSv1.1 TLSv1;

# Open File Cache
open_file_cache max=8192 inactive=5m;
open_file_cache_valid 5m;
open_file_cache_min_uses 2;
open_file_cache_errors on;

# Logging Settings
open_log_file_cache max=1000 inactive=20s valid=1m min_uses=2;
#Mapping $msec to $sec so that we dont break cPanel bandwidth calculator
map $msec $sec {
~^(?P<secres>.+)\. $secres;
}
log_format bytes_log "$sec $bytes_sent .";
log_not_found off;
access_log off;

# Micro-caching nginx
proxy_cache_path /var/cache/nginx/microcaching keys_zone=micro:20m
levels=1:2 inactive=900s max_size=2000m;
proxy_cache micro;
proxy_cache_lock on;
proxy_cache_valid 200 1s;
proxy_cache_use_stale updating;
proxy_cache_bypass $cookie_nocache $arg_nocache;

# GeoIP
# Uncomment to enable
#geoip_country /usr/share/GeoIP/GeoLiteCountry.dat;
#geoip_city /usr/share/GeoIP/GeoLiteCity.dat;

#Limit Request Zone conf
include /etc/nginx/conf.d/limit_request_custom.conf;
#
#CloudFare RealIP conf
include /etc/nginx/conf.d/cloudfare_realip.conf;
#
#FastCGI and PROXY cache config
include /etc/nginx/conf.d/nginx_cache.conf;
#
#Phusion Passenger Setting
include /etc/nginx/conf.d/passenger.conf;
#
#Custom Include File where you can include any custom settings
include /etc/nginx/conf.d/custom_include.conf;
#
# Virtual Host Configs
include /etc/nginx/conf.d/default_server.conf;
include /etc/nginx/sites-enabled/*.conf;
}

**********************************
vhost level
**********************************

location / {
limit_req zone=FLOODPROTECT burst=100;
limit_conn PERIP 125;
limit_conn PERSERVER 500;

proxy_send_timeout 900;
proxy_read_timeout 900;

proxy_buffer_size 32k;
proxy_buffers 16 32k;
proxy_busy_buffers_size 64k;
proxy_temp_file_write_size 64k;

proxy_connect_timeout 300s;

proxy_pass http://PROXYLOCATION;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_redirect off;
proxy_set_header Proxy "";
}
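
For a sense of scale, here is a rough worst-case estimate of the proxy buffer
memory these settings allow. This is a sketch under a loud assumption: every
one of the 2048 worker_connections slots is a proxied request that has filled
all of its proxy_buffers at the same moment. nginx is not expected to hold all
of these at once, so treat the result as an upper bound rather than a
measurement. The function names are made up for illustration.

```python
def parse_size(value):
    """Convert an nginx size such as '32k' or '128m' to bytes."""
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    value = value.strip().lower()
    if value[-1] in units:
        return int(value[:-1]) * units[value[-1]]
    return int(value)


def proxy_buffer_bytes(buffer_size, buffers_count, buffers_size):
    """Upper bound for one proxied request: proxy_buffer_size + proxy_buffers."""
    return parse_size(buffer_size) + buffers_count * parse_size(buffers_size)


# Values taken from the config above:
#   proxy_buffer_size 32k; proxy_buffers 16 32k; worker_connections 2048;
per_request = proxy_buffer_bytes("32k", 16, "32k")
per_worker = per_request * 2048  # assumption: every slot fully buffered

print(per_request // 1024, "KiB per request")
print(per_worker // 1024 ** 2, "MiB per worker (pessimistic upper bound)")
```

With worker_processes auto, this bound applies to each worker separately.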

Posted at Nginx Forum: https://forum.nginx.org/read.php?2,269787,269788#msg-269788
JohnCarne
2016-09-23 11:36:04 UTC
sysctl has already been tuned as far as it will go:

# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.

# Tweak for nginx workers/connections added 16/09/2016 for issue investigation on posix error in nginx logs
net.core.somaxconn = 512
net.core.netdev_max_backlog = 512
net.ipv4.tcp_max_syn_backlog = 20480

# Tweaks added 16/09/2016 for issue investigation on posix error in nginx logs
net.netfilter.nf_conntrack_max = 196608
net.nf_conntrack_max = 196608

# Tweaks added 19/09/2016 cloudlinux
vm.max_map_count=655300

# Tweaks added 20/09/2016
net.core.rmem_default=262144
net.core.wmem_default=262144
net.core.rmem_max=262144
net.core.wmem_max=262144

# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Decrease the time default value for tcp_fin_timeout connection
net.ipv4.tcp_fin_timeout = 15

# Decrease the time default value for tcp_keepalive_time connection
net.ipv4.tcp_keepalive_time = 1800

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

# Enable TCP SYN Cookie Protection
net.ipv4.tcp_syncookies = 1

# Increase the tcp-time-wait buckets pool size
net.ipv4.tcp_max_tw_buckets = 1440000

# Turn off the tcp_sack
net.ipv4.tcp_sack = 0

# Turn off the tcp_timestamps
net.ipv4.tcp_timestamps = 0

# Controls the default maximum size of a message queue
kernel.msgmnb = 65536

# Controls the maximum size of a message, in bytes
kernel.msgmax = 65536

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736

# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296

# Disable IPv6 autoconf
net.ipv6.conf.all.autoconf = 0
net.ipv6.conf.default.autoconf = 0
net.ipv6.conf.eth0.autoconf = 0
net.ipv6.conf.all.accept_ra = 0
net.ipv6.conf.default.accept_ra = 0
net.ipv6.conf.eth0.accept_ra = 0

# Various
vm.swappiness = 1
vm.disable_fs_reclaim=1
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10

#Disable CloudLinux ptrace
kernel.user_ptrace = 0

# Symlinks
fs.enforce_symlinksifowner = 1
fs.symlinkown_gid = 99

# CageFS
fs.proc_super_gid = 485
fs.proc_can_see_other_uid=0
fs.suid_dumpable=1

# SecureLinks Link Traversal Protection Allowed Group Id
fs.protected_symlinks_allow_gid = 487
fs.fs.protected_hardlinks_allow_gid = 487
fs.file-max = 1048576

fs.protected_symlinks_create = 0
fs.protected_hardlinks_create = 0

Posted at Nginx Forum: https://forum.nginx.org/read.php?2,269787,269792#msg-269792
JohnCarne
2016-09-23 11:39:02 UTC
nginx -V
nginx version: nginx/1.11.4
built by gcc 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC)
built with OpenSSL 1.0.2h 3 May 2016
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx
--modules-path=/etc/nginx/modules --with-openssl=./openssl-1.0.2h
--conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error_log
--http-log-path=/var/log/nginx/access_log --pid-path=/var/run/nginx.pid
--lock-path=/var/run/nginx.lock
--http-client-body-temp-path=/var/cache/nginx/client_temp
--http-proxy-temp-path=/var/cache/nginx/proxy_temp
--http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp
--http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp
--http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nobody
--group=nobody --with-http_ssl_module --with-http_realip_module
--with-http_addition_module --with-http_sub_module --with-http_dav_module
--with-http_flv_module --with-http_mp4_module --with-http_gunzip_module
--with-http_gzip_static_module --with-http_random_index_module
--with-http_secure_link_module --with-http_stub_status_module
--with-http_auth_request_module --add-dynamic-module=naxsi-http2/naxsi_src
--with-file-aio --with-threads --with-stream --with-stream_ssl_module
--with-http_slice_module --with-ipv6 --with-http_v2_module
--with-http_geoip_module=dynamic
--add-dynamic-module=ngx_pagespeed-release-1.11.33.3-beta
--add-dynamic-module=/usr/local/rvm/gems/ruby-2.3.0/gems/passenger-5.0.30/src/nginx_module
--add-module=ngx_cache_purge-2.3 --add-module=ngx_brotli --with-cc-opt='-O2
-g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong
--param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic'
--with-ld-opt=-Wl,-E

Posted at Nginx Forum: https://forum.nginx.org/read.php?2,269787,269793#msg-269793
Maxim Dounin
2016-09-23 16:07:28 UTC
Hello!
Post by JohnCarne
> Hello,
> nginx version: nginx/1.11.4
> We are now after 13 days we observer suddenly in nginx logs this in an
> intempestive manner, and causing nginx to reload, causing slow down on
> server : posix_memalign(16, 16384) failed (12: Cannot allocate memory)
> This happens after our upgrade to last nginx version through nDeploy.
> I called in nginx sysadmin, Ndeploy sysadmin too, and finally cloudlinux
> support which made an incredible job investigating the issue over 7 days by
> enabling multiple kernel debug tools to find out what is going on.
> All nginx/linux settings has been tweaked/verified. Issue can't be solved,
> and about 5 guys has broken their head on the issue, without being able to
> solve. We know all the basic, even advanced, and experts were in.

Just a basic hint, in case you haven't tried it yet: re-compile
nginx without any 3rd party modules, and check if it helps.

Post by JohnCarne
> Cloudlinux support says this is the cause, and you need nginx expert to find
> From the information we collected it appears that nginx is really changing
> # grep nginx /home/abackupnomem3.log | tail
> nginx-792752 [009] 5438179.898678: setrlimit: (sys_setrlimit+0x63/0x70
> Conclusion is that nginx manage those rlimits. This is not a solution, but a
> way for you where to dig more.

The setrlimit() call is used by nginx to manage the limits it
knows about and is configured to manage. In particular, it is used
for the

worker_rlimit_nofile 50000;

directive as seen in your config, and for the worker_rlimit_core
directive. Details about the directives can be found here:

http://nginx.org/r/worker_rlimit_core
http://nginx.org/r/worker_rlimit_nofile

They set RLIMIT_CORE and RLIMIT_NOFILE limits, nothing more, and
have nothing to do with the memory allocation errors you see.
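
That these are separate limits can be checked from any process. Below is a
small Python sketch using the standard resource module, assuming Linux. The
function name is made up for illustration, and the code only mimics the kind
of setrlimit() call described above; it is not taken from the nginx source.

```python
import resource


def nofile_change_leaves_address_space_alone():
    """Re-apply RLIMIT_NOFILE and report RLIMIT_AS before and after."""
    as_before = resource.getrlimit(resource.RLIMIT_AS)

    # Same syscall family as worker_rlimit_nofile: setrlimit(RLIMIT_NOFILE).
    # Re-applying the current values is always permitted, so this is safe
    # to run without extra privileges.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))

    as_after = resource.getrlimit(resource.RLIMIT_AS)
    return as_before, as_after


before, after = nofile_change_leaves_address_space_alone()
print("RLIMIT_AS before:", before, "after:", after)
```

The address-space limit, which is one of the limits that can actually make
malloc() or posix_memalign() fail with ENOMEM, is the same before and after
the call.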
--
Maxim Dounin
http://nginx.org/
JohnCarne
2016-09-23 18:03:54 UTC
Thank you for your feedback.

We have now reverted to 1.11.3 without the brotli and geoip modules (this
version had caused no issues).

Posted at Nginx Forum: https://forum.nginx.org/read.php?2,269787,269801#msg-269801
JohnCarne
2016-09-24 18:10:58 UTC
Maxim,

After 29 hours the error reappeared just once, which is much less often than
before.

I see a correlation in my monit monitoring at that exact time:
Apache traffic peaked, which corresponds to a big download peak.

I'm now thinking about nginx tweaks I have not made yet.

I am now raising, from 64m:
client_max_body_size 256m;

From 256k:
client_body_buffer_size 512k;

Added:
send_timeout 300s;

This one could be the issue:
sendfile_max_chunk 512k;
I set it to 0.

This is too large as well:
directio 50m;
I changed it to:
directio 4m;

Posted at Nginx Forum: https://forum.nginx.org/read.php?2,269787,269816#msg-269816
JohnCarne
2016-09-25 09:46:46 UTC
I confirm we have still not escaped the error, which appeared again just now:

2016/09/25 09:22:15 [emerg] 461680#461680: posix_memalign(16, 16384) failed (12: Cannot allocate memory)

Posted at Nginx Forum: https://forum.nginx.org/read.php?2,269787,269824#msg-269824
JohnCarne
2016-09-26 07:28:42 UTC
No errors for 24 hours now; running the nginx version without the modules was
one part of the solution.

Now I am tuning nginx aio with this:
directio_alignment 4k;

Posted at Nginx Forum: https://forum.nginx.org/read.php?2,269787,269843#msg-269843
JohnCarne
2016-09-26 08:24:00 UTC
Just now:
2016/09/26 10:18:52 [emerg] 5027#5027: malloc(4096) failed (12: Cannot allocate memory)
2016/09/26 10:18:53 [emerg] 5043#5043: malloc(4096) failed (12: Cannot allocate memory)
2016/09/26 10:18:54 [emerg] 5048#5048: malloc(4096) failed (12: Cannot allocate memory)
2016/09/26 10:18:54 [emerg] 5066#5066: malloc(4096) failed (12: Cannot allocate memory)
2016/09/26 10:18:55 [emerg] 5076#5076: malloc(4096) failed (12: Cannot allocate memory)
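
To line these failures up against traffic peaks, the error log can be bucketed
by hour. Below is a minimal Python sketch, assuming the [emerg] line format
shown in this thread with each entry on a single line, as it is in the real
log file. The names ALLOC_FAIL and failures_per_hour are illustrative.

```python
import re
from collections import Counter

# Matches lines such as:
# 2016/09/26 10:18:52 [emerg] 5027#5027: malloc(4096) failed (12: Cannot allocate memory)
ALLOC_FAIL = re.compile(
    r"^(?P<date>\d{4}/\d{2}/\d{2}) (?P<hour>\d{2}):\d{2}:\d{2} \[emerg\] "
    r"\d+#\d+: (?P<call>\w+)\([^)]*\) failed \(12: "
)


def failures_per_hour(lines):
    """Count ENOMEM allocation failures per (date, hour, allocator call)."""
    counts = Counter()
    for line in lines:
        match = ALLOC_FAIL.match(line)
        if match:
            counts[(match["date"], match["hour"], match["call"])] += 1
    return counts


def failures_in_file(path):
    """Convenience wrapper for a log file, e.g. /var/log/nginx/error_log."""
    with open(path, errors="replace") as fh:
        return failures_per_hour(fh)
```

Comparing the busiest buckets with the monitoring graphs for the same hours
would show whether the failures really coincide with download peaks.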

Posted at Nginx Forum: https://forum.nginx.org/read.php?2,269787,269846#msg-269846
JohnCarne
2016-09-26 08:27:32 UTC
Also:
2016/09/26 10:26:22 [emerg] 14146#14146: posix_memalign(16, 16384) failed (12: Cannot allocate memory)

Posted at Nginx Forum: https://forum.nginx.org/read.php?2,269787,269847#msg-269847
JohnCarne
2016-09-26 10:42:58 UTC
I'm now testing what the nginx sysadmin suggested:

worker_processes 1;
Try 1 first, and if the error is fixed you can increase it to 4 or 8.

Posted at Nginx Forum: https://forum.nginx.org/read.php?2,269787,269850#msg-269850