threads-check

Understanding pthread_create failures


Change log

item        note
20161102    First version

Conclusion

  • pthread_create fails with EAGAIN at boot
    The value below (the pids limit of init.scope) determines how many threads a single process can create at boot:

    [root@localhost ~]# cat /sys/fs/cgroup/pids/init.scope/pids.max 
    512
  • How to change it

    [root@localhost ~]# cat /etc/systemd/system.conf | grep DefaultTasksMax
    #DefaultTasksMax=512
    DefaultTasksMax=1024
    [root@localhost ~]#

DefaultTasksMax
Configure the default value for the per-unit TasksMax= setting
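
When a file contains both a commented-out default and an override, as in the grep output above, it is easy to misread which value is in effect. The following is a minimal sketch, assuming the file uses the usual `key=value` syntax with `#` or `;` comments; the function name `effective_default_tasks_max` is hypothetical and the sample text only mirrors the lines shown above.

```python
def effective_default_tasks_max(conf_text):
    """Return the last uncommented DefaultTasksMax value in conf_text, or None."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        key, sep, val = line.partition("=")
        if sep and key.strip() == "DefaultTasksMax":
            value = val.strip()  # a later assignment overrides an earlier one
    return value


# Sample text mirroring the grep output above (the [Manager] header is assumed).
sample = """\
[Manager]
#DefaultTasksMax=512
DefaultTasksMax=1024
"""
print(effective_default_tasks_max(sample))
```

The value is returned as a string on purpose, since the setting is not necessarily a plain integer.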

systemd

  • systemd is a system and service manager for Linux.

fedora 24

pthread_create fails at boot

  • error

    main<line:9854> @_@_@_@_@_@_@_@_@_@_@_@ init ok,start create thread...
    xxxx_netservice.c xxxx_client_listen<line:3042> Listen to 5566
    MFailed to connect with server0
    create xxx_client_recv thread 102 error, errno:11.
  • Errors: Linux System Errors

    #define EAGAIN          11      /* Try again */

EAGAIN (Resource temporarily unavailable)
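
The numeric `errno:11` in the log can be turned back into its symbolic name and message without looking at the header file. This is a small sketch using only the Python standard library; `describe_errno` is a hypothetical helper name, and the exact message text depends on the platform's C library.

```python
import errno
import os


def describe_errno(code):
    """Return 'NAME (message)' for a numeric errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


# On Linux, errno.EAGAIN is the value 11 seen in the log above.
print(describe_errno(errno.EAGAIN))
```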

  • pthread_create
    EAGAIN Insufficient resources to create another thread.

    EAGAIN A system-imposed limit on the number of threads was
    encountered. There are a number of limits that may trigger
    this error: the RLIMIT_NPROC soft resource limit (set via
    setrlimit(2)), which limits the number of processes and
    threads for a real user ID, was reached; the kernel's system-
    wide limit on the number of processes and threads,
    /proc/sys/kernel/threads-max, was reached (see proc(5)); or
    the maximum number of PIDs, /proc/sys/kernel/pid_max, was
    reached (see proc(5)).
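
The three limits named in the man page excerpt can be read in one place, which makes it quick to rule them out. The sketch below assumes a Linux-style /proc layout; `read_int` and `thread_limits` are hypothetical helper names. Note that, per the conclusion of these notes, the limit actually hit here was the cgroup pids.max value, which is not in this list.

```python
import resource
from pathlib import Path


def read_int(path):
    """Return the integer stored in a one-line file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def thread_limits(proc_root="/proc"):
    """Collect the limits listed in the pthread_create(3) EAGAIN description."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        "RLIMIT_NPROC soft": soft,  # resource.RLIM_INFINITY means unlimited
        "RLIMIT_NPROC hard": hard,
        "threads-max": read_int(f"{proc_root}/sys/kernel/threads-max"),
        "pid_max": read_int(f"{proc_root}/sys/kernel/pid_max"),
    }


for name, value in thread_limits().items():
    print(f"{name}: {value}")
```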

fedora22

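Tested on a self-built Fedora 22 DOM without any special environment tuning: both 36CH and 64CH come up normally.
It appears that Fedora 24 consumes more resources than Fedora 22.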

  • fedora22

    [root@localhost limits.d]# uname -a
    Linux localhost.localdomain 4.4.13-200.fc22.x86_64 #1 SMP Wed Jun 8 15:59:40 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
  • fedora24

    [root@localhost ~]# uname -a
    Linux localhost.localdomain 4.6.5-300.fc24.x86_64 #1 SMP Thu Jul 28 01:10:12 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
  • Fedora 22 ships the following files by default, but Fedora 24 does not. Why?
    Not enough threads or processes : “thread create failed”

    [root@localhost limits.d]# ls -l /etc/security/limits.d/
    total 8
    -rw-r--r--. 1 root root 191 Jun 26 2015 20-nproc.conf
    -rw-r--r--. 1 root root 151 May 12 2015 95-jack.conf

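Thread counts for different channel (CH) counts

Thread counts on Fedora 22 for different channel counts

Original Fedora 22 DOM
CH    threads    diff
16    528
36    656        128 more than 16CH
64    824        168 more than 36CH

Self-built Fedora 22 DOM
CH    threads    diff
16    549
36    672        123 more than 16CH
64    839        167 more than 36CH

startx           threads    diff
Before startx    131
After startx     338        207 more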
  • log
    fedora22 (self-built DOM)
    --------------------------------------------
    CH16
    [root@localhost ~]# ps -elfT | wc -l
    549

    CH36
    [root@localhost ~]# ps -elfT | wc -l
    672 (123 more than CH16)

    CH64
    [root@localhost ~]# ps -elfT | wc -l
    839 (167 more than CH36)

    Threads added by startx (207)
    Before startx
    [root@localhost ~]# ps -elfT | wc -l
    131

    After startx
    [root@localhost ~]# ps -elfT | wc -l
    338

    Original fedora22 DOM
    -------------------------------------
    CH16
    [root@localhost ~]# ps -elfT | wc -l
    528

    CH36
    [root@localhost disks]# ps -elfT | wc -l
    656 (128 more than CH16)

    CH64
    [root@localhost ~]# ps -elfT | wc -l
    824 (168 more than CH36)
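
The measurements above can be reduced to a per-channel cost, which comes out at roughly 6 threads per channel and matches the "x6" factor used later in these notes. This is only an arithmetic sketch over the numbers logged above; `threads_per_channel` is a hypothetical helper name.

```python
def threads_per_channel(samples):
    """Return (ch_from, ch_to, extra threads per added channel) for adjacent samples."""
    result = []
    for (ch_a, n_a), (ch_b, n_b) in zip(samples, samples[1:]):
        result.append((ch_a, ch_b, (n_b - n_a) / (ch_b - ch_a)))
    return result


# (channel count, `ps -elfT | wc -l` result) pairs taken from the log above.
self_built = [(16, 549), (36, 672), (64, 839)]
original = [(16, 528), (36, 656), (64, 824)]

for label, samples in (("self-built", self_built), ("original", original)):
    for ch_a, ch_b, per_ch in threads_per_channel(samples):
        print(f"{label}: {ch_a}CH -> {ch_b}CH: {per_ch:.2f} threads per channel")
```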

startx comparison

fedora      before startx    after startx    xgui
fedora22    131              338             207
fedora24    163              392             229
diff        32               54              22

  • fedora 24 startx
    [root@localhost ~]# ps -elfT | wc -l
    163
    [root@localhost ~]# ps -elfT | wc -l
    392

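Change history

Removing the client file_buf (allocated with malloc instead)

This allows 20 more threads to be created (the failure moves from thread 102 to thread 122)

Before the change
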
xxxx_netservice.c xxxx_client_listen<line:3042> Listen to 5566
MFailed to connect with server0
create xxx_client_recv thread 102 error, errno:11.

After the change

ldvr->core->ch_num:36
main<line:9854> @_@_@_@_@_@_@_@_@_@_@_@ init ok,start create thread...
xxx_netservice.c xxxx_client_listen<line:3054> Listen to 5566
create xxx_client_recv thread 122 error, errno:11.

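Removing packages

After boot, the count drops from the original 163 to 141 (22 fewer).
Even so, it looks like the program itself still has to be changed to reduce its thread and memory usage.
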
[root@localhost ~]# ps -elfT | wc -l
141

  • Resource temporarily unavailable
    EAGAIN here refers to a limit on threads, not processes, so it is no surprise that removing packages made no difference.

    The user fails to log in because an EAGAIN error occurs if the user's number of executing threads has reached the nproc resource limit.
    Note: Despite the name, this is a limit on threads, not processes.
  • 1. Remove nomachine
    After the change

    main<line:9854> @_@_@_@_@_@_@_@_@_@_@_@ init ok,start create thread...
    xxxx_netservice.c xxxx_client_listen<line:3054> Listen to 5566
    create xxxx_client_recv thread 122 error, errno:11.
  • 2. Remove colord
    dnf remove colord.x86_64
    After the change

    main<line:9854> @_@_@_@_@_@_@_@_@_@_@_@ init ok,start create thread...
    xxxx_netservice.c xxxx_client_listen<line:3054> Listen to 5566
    create xxxx_client_recv thread 122 error, errno:11.
  • 3. Remove firewalld
    firewalld.noarch
    After the change

    main<line:9854> @_@_@_@_@_@_@_@_@_@_@_@ init ok,start create thread...
    xxxx_netservice.c xxxx_client_listen<line:3054> Listen to 5566
    create xxxx_client_recv thread 122 error, errno:11.
  • 4. Remove upower.x86_64
    dnf remove upower.x86_64
    After the change

    main<line:9854> @_@_@_@_@_@_@_@_@_@_@_@ init ok,start create thread...
    xxxx_netservice.c xxxx_client_listen<line:3054> Listen to 5566
    create xxxx_client_recv thread 122 error, errno:11.
  • 5. Remove wpa
    wpa_supplicant.x86_64
    After the change

    main<line:9854> @_@_@_@_@_@_@_@_@_@_@_@ init ok,start create thread...
    xxxx_netservice.c xxxx_client_listen<line:3054> Listen to 5566
    create xxxx_client_recv thread 122 error, errno:11.
  • 6. Remove
    ModemManager.x86_64
    mlocate.x86_64
    After the change

    main<line:9854> @_@_@_@_@_@_@_@_@_@_@_@ init ok,start create thread...
    xxxx_netservice.c xxxx_client_listen<line:3054> Listen to 5566
    create xxxx_client_recv thread 122 error, errno:11.

Disabling services

  • systemctl disable packagekit.service
    No improvement
    main<line:9854> @_@_@_@_@_@_@_@_@_@_@_@ init ok,start create thread...
    xxxx_netservice.c xxxx_client_listen<line:3054> Listen to 5566
    create xxxx_client_recv thread 122 error, errno:11.

Increasing RAM to 8G

  • Increase RAM to 8G
    No noticeable improvement

    main<line:9854> @_@_@_@_@_@_@_@_@_@_@_@ init ok,start create thread...
    xxxx_netservice.c xxxx_client_listen<line:3050> Listen to 5566
    create xxxx_client_recv thread 122 error, errno:11.
  • Not related to insufficient memory: with 4G of RAM, there was still 3.xG free

    [root@localhost ~]# cat /disks/1/nvr.log 
             total    used    free    shared    buff/cache    available
    Mem:     7.7G     107M    7.3G    81M       240M          7.3G
    Swap:    0B       0B      0B
    ----------
    core file size (blocks, -c) unlimited
    data seg size (kbytes, -d) unlimited
    scheduling priority (-e) 0
    file size (blocks, -f) unlimited
    pending signals (-i) 31382
    max locked memory (kbytes, -l) 64
    max memory size (kbytes, -m) unlimited
    open files (-n) 1024
    pipe size (512 bytes, -p) 8
    POSIX message queues (bytes, -q) 819200
    real-time priority (-r) 0
    stack size (kbytes, -s) 8192
    cpu time (seconds, -t) unlimited
    max user processes (-u) 31382
    virtual memory (kbytes, -v) unlimited
    file locks (-x) unlimited
    ----------

Reducing the CMS clients from 128 to 64

CH    total ps    nvr threads
16    432         251
36    563         374

16x6 + 64x2 + 27(22?) = 251
36x6 + 64x2 + 30(22?) = 374 (251 + 120 + 3)
64x6 + 64x2 + ? = 374 + 168 = 542

  • 64CH
    main<line:9866> @_@_@_@_@_@_@_@_@_@_@_@ init ok,start create thread...
    xxxx_netservice.c xxxx_client_listen<line:3050> Listen to 5566
    create fuho_recv thread 38 error,errno:11.

If there are 128 CMS clients
16x6 + 128x2 + 27(22?) => 379 (251 + 128)
36x6 + 128x2+ 30(22?) => 502 ( 374 + 128)
64x6 + 128x2 + ? => 670 (542 + 128)
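
The estimates above follow one formula: 6 threads per channel, 2 threads per CMS client, plus a fixed overhead. The sketch below recomputes them and compares each against the 512 limit from pids.max. The overhead values 27 and 30 are the ones written in the notes; the 30 used for 64CH is only what the arithmetic 542 - 64x6 - 64x2 implies, and the function names are hypothetical. The limit covers every task in the cgroup, so the real headroom is smaller than this comparison suggests.

```python
THREADS_PER_CHANNEL = 6  # the "CH x 6" term in the notes
THREADS_PER_CLIENT = 2   # the "clients x 2" term in the notes


def nvr_threads(channels, cms_clients, overhead):
    """Estimate the nvr thread count from the formula used in the notes."""
    return (channels * THREADS_PER_CHANNEL
            + cms_clients * THREADS_PER_CLIENT
            + overhead)


def fits(channels, cms_clients, overhead, tasks_max=512):
    """True if the estimate alone stays within the assumed task limit."""
    return nvr_threads(channels, cms_clients, overhead) <= tasks_max


# Overheads: 27 and 30 are from the notes; 30 for 64CH is derived, not measured.
overheads = {16: 27, 36: 30, 64: 30}

for clients in (64, 128):
    for ch, extra in overheads.items():
        total = nvr_threads(ch, clients, extra)
        status = "ok" if fits(ch, clients, extra) else "over 512"
        print(f"{ch}CH, {clients} clients: {total} threads ({status})")
```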


Program

Program flow

  • main flow

    [main flow]
  • ldvr init

    [ldvr init]
  • var struct

    st.png
  • Thread count for 16CH

    128x2 = 256 (CMS send and recv)
    9
    16x6 = 96
    ----------
    about 400

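Environment settings

  • overcommit_memory
    Defines the conditions under which large memory requests are accepted or denied. This parameter takes one of three values: 0, 1, 2
    1 - the kernel performs no memory overcommit handling (it always overcommits)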
    # echo 1 > /proc/sys/vm/overcommit_memory

ulimit comparison

  • fedora22

    [root@localhost vm]# ulimit -a
    core file size (blocks, -c) 0
    data seg size (kbytes, -d) unlimited
    scheduling priority (-e) 0
    file size (blocks, -f) unlimited
    pending signals (-i) 15155
    max locked memory (kbytes, -l) 64
    max memory size (kbytes, -m) unlimited
    open files (-n) 1024
    pipe size (512 bytes, -p) 8
    POSIX message queues (bytes, -q) 819200
    real-time priority (-r) 0
    stack size (kbytes, -s) 8192
    cpu time (seconds, -t) unlimited
    max user processes (-u) 15155
    virtual memory (kbytes, -v) unlimited
    file locks (-x) unlimited
  • fedora 24

    [root@localhost ~]# ulimit -a
    core file size (blocks, -c) unlimited
    data seg size (kbytes, -d) unlimited
    scheduling priority (-e) 0
    file size (blocks, -f) unlimited
    pending signals (-i) 15254
    max locked memory (kbytes, -l) 64
    max memory size (kbytes, -m) unlimited
    open files (-n) 1024
    pipe size (512 bytes, -p) 8
    POSIX message queues (bytes, -q) 819200
    real-time priority (-r) 0
    stack size (kbytes, -s) 8192
    cpu time (seconds, -t) unlimited
    max user processes (-u) 15254
    virtual memory (kbytes, -v) unlimited
    file locks (-x) unlimited
  • ulimit parameters

    -a All current limits are reported
    -b The maximum socket buffer size
    -c The maximum size of core files created
    -d The maximum size of a process's data segment
    -e The maximum scheduling priority ("nice")
    -f The maximum size of files written by the shell and its children
    -i The maximum number of pending signals
    -l The maximum size that may be locked into memory
    -m The maximum resident set size (many systems do not honor this limit)
    -n The maximum number of open file descriptors (most systems do not allow this value to be set)
    -p The pipe size in 512-byte blocks (this may not be set)
    -q The maximum number of bytes in POSIX message queues
    -r The maximum real-time scheduling priority
    -s The maximum stack size
    -t The maximum amount of cpu time in seconds
    -u The maximum number of processes available to a single user
    -v The maximum amount of virtual memory available to the shell and, on some systems, to its children
    -x The maximum number of file locks
    -T The maximum number of threads
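
Comparing two `ulimit -a` listings by eye is error-prone, so the differences between the fedora22 and fedora 24 listings above can be extracted mechanically. This is a minimal sketch that assumes each line has the shape `name (unit, -flag) value`; `parse_ulimit` and `diff_ulimit` are hypothetical names, and the sample strings contain only a few lines copied from the listings.

```python
import re

# name, optional "unit, " inside the parentheses, the flag, then the value
LINE = re.compile(
    r"^(?P<name>.+?)\s+\((?:[^)]*,\s*)?(?P<flag>-\w)\)\s+(?P<value>\S+)$"
)


def parse_ulimit(text):
    """Map each flag (e.g. '-u') to a (name, value) tuple."""
    limits = {}
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            limits[m.group("flag")] = (m.group("name"), m.group("value"))
    return limits


def diff_ulimit(text_a, text_b):
    """Return {flag: (name, value_a, value_b)} for limits that differ."""
    a, b = parse_ulimit(text_a), parse_ulimit(text_b)
    return {
        flag: (a[flag][0], a[flag][1], b[flag][1])
        for flag in a
        if flag in b and a[flag][1] != b[flag][1]
    }


fc22 = """\
core file size (blocks, -c) 0
pending signals (-i) 15155
open files (-n) 1024
max user processes (-u) 15155
"""
fc24 = """\
core file size (blocks, -c) unlimited
pending signals (-i) 15254
open files (-n) 1024
max user processes (-u) 15254
"""

for flag, (name, old, new) in diff_ulimit(fc22, fc24).items():
    print(f"{flag} {name}: {old} -> {new}")
```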

ulimit with different RAM sizes

4G RAM (FC24)

[root@localhost ~]# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 15155
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 15155
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

8G RAM (FC24)
Did max user processes get smaller?

[root@localhost ~]# ulimit -a
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 31382
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 8096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited


Miscellaneous

  • Check how many threads are currently running (the count includes one ps header line)

    ps -elfT | wc -l
  • ps parameters

    -e Select all processes. Identical to -A.
    -T Show threads, possibly with SPID column.

    THREAD DISPLAY
    H Show threads as if they were processes.

    -L Show threads, possibly with LWP and NLWP columns.
  • Compare installed units
    To list all units installed on the system, use the list-unit-files command instead.

    systemctl list-unit-files
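
The thread count taken with `ps -elfT | wc -l` earlier in this list can also be obtained by walking /proc directly, which avoids counting the ps header line. The sketch below assumes the Linux layout in which every process has a /proc/<pid>/task directory with one entry per thread; `count_threads` is a hypothetical helper name.

```python
import os


def count_threads(proc_root="/proc"):
    """Count threads by listing <proc_root>/<pid>/task for every numeric pid."""
    total = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        try:
            total += len(os.listdir(os.path.join(proc_root, entry, "task")))
        except OSError:
            continue  # the process exited while scanning, or has no task dir
    return total


if os.path.isdir("/proc"):
    print(count_threads())
```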

Disabling unnecessary daemons

Compare the output of systemctl --all on fc22 and fc24 to decide which services to turn off on fc24

  • Check service status
    systemctl status --all

    [root@localhost etc]# systemctl status abrt-xorg.service 
    ● abrt-xorg.service - ABRT Xorg log watcher
    Loaded: loaded (/usr/lib/systemd/system/abrt-xorg.service; enabled; vendor preset: enabled)
    Active: active (running) since Wed 2016-11-02 14:10:22 CST; 1h 11min ago
    Main PID: 701 (abrt-dump-journ)
    Tasks: 1 (limit: 512)
    CGroup: /system.slice/abrt-xorg.service
    └─701 /usr/bin/abrt-dump-journal-xorg -fxtD
  • Stop a service

    systemctl stop abrt-xorg.service
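
The `Tasks: 1 (limit: 512)` line in the status output above shows the per-unit task limit directly, so it is a convenient place to spot units that are close to their cap. This is a small parsing sketch over saved status text; `tasks_usage` is a hypothetical helper name and the sample is copied from the output above.

```python
import re

TASKS = re.compile(r"^\s*Tasks:\s*(\d+)\s*\(limit:\s*(\d+)\)", re.MULTILINE)


def tasks_usage(status_text):
    """Return (current, limit) from a 'Tasks: N (limit: M)' line, or None."""
    m = TASKS.search(status_text)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


sample_status = """\
● abrt-xorg.service - ABRT Xorg log watcher
Active: active (running) since Wed 2016-11-02 14:10:22 CST; 1h 11min ago
Main PID: 701 (abrt-dump-journ)
Tasks: 1 (limit: 512)
CGroup: /system.slice/abrt-xorg.service
"""

usage = tasks_usage(sample_status)
if usage is not None:
    current, limit = usage
    print(f"{current} of {limit} tasks used")
```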

Fork bomb protection limit

  • nproc.conf
    Limit that protects against fork bombs
    [root@localhost limits.d]# cat 20-nproc.conf 
    # Default limit for number of user's processes to prevent
    # accidental fork bombs.
    # See rhbz #432903 for reasoning.

    * soft nproc 4096
    root soft nproc unlimited

systemctl

systemctl

start PATTERN...
Start (activate) one or more units specified on the command line.

stop PATTERN...
Stop (deactivate) one or more units specified on the command line.

status [PATTERN...|PID...]]
Show terse runtime status information about one or more units, followed by most recent log data from the journal.


disable NAME...
Disables one or more units. This removes all symlinks to the specified unit files from the unit configuration directory, and hence undoes the changes made by enable.

-a, --all
When listing units, show all loaded units, regardless of their state, including inactive units.

To list all units installed on the system, use the list-unit-files command instead.

References