here is a long run (1.5 year) matrix instance running official matrix repos

(other times I saw postgres: 11/main: synapse_user synapse ::1(38138) SELECT)
and here is in the dmesg (starts with the end of a OOM and starts with one of them, then you have an idea of the time between two) [0]
This started when I upgraded to 1.7.2 in 2019-12-21 02:40:35 [1], but the first memory leak started at 2019-12-21 12:29:04; then I decided to upgrade from stretch (debian oldstable) to buster (debian stable).
Then I downgraded to 1.7.1 and I still (?) have the memory leak
wget https://packages.matrix.org/debian/pool/main/m/matrix-synapse-py3/matrix-synapse-py3_1.7.1+buster1_amd64.deb
dpkg -i matrix-synapse-py3_1.7.1+buster1_amd64.deb
Then I downgraded to 1.7.0 and the memory leak continues, but in a slower time same timing, first oom requires a little more time, but then again as usual
wget https://packages.matrix.org/debian/pool/main/m/matrix-synapse-py3/matrix-synapse-py3_1.7.0+buster1_amd64.deb
dpkg -i matrix-synapse-py3_1.7.0+buster1_amd64.deb
My configuration overrides the default one (uses postgresql and ldap provider) [3]
Thanks Aaron and realitygaps in the synapse matrix channel for their initial support which was crucial to start this bug report
[0]
[Sat Dec 21 21:43:55 2019] Out of memory: Kill process 3547 (python) score 897 or sacrifice child
[Sat Dec 21 21:43:55 2019] Killed process 3547 (python) total-vm:4568952kB, anon-rss:3618040kB, file-rss:0kB, shmem-rss:0kB
[Sat Dec 21 21:43:55 2019] oom_reaper: reaped process 3547 (python), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[Sat Dec 21 21:46:51 2019] python invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
[Sat Dec 21 21:46:51 2019] python cpuset=/ mems_allowed=0
[Sat Dec 21 21:46:51 2019] CPU: 0 PID: 3757 Comm: python Not tainted 4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
[Sat Dec 21 21:46:51 2019] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.org 04/01/2014
[Sat Dec 21 21:46:51 2019] Call Trace:
[Sat Dec 21 21:46:51 2019] dump_stack+0x5c/0x80
[Sat Dec 21 21:46:51 2019] dump_header+0x6b/0x283
[Sat Dec 21 21:46:51 2019] ? do_try_to_free_pages+0x2ec/0x370
[Sat Dec 21 21:46:51 2019] oom_kill_process.cold.30+0xb/0x1cf
[Sat Dec 21 21:46:51 2019] ? oom_badness+0xe9/0x140
[Sat Dec 21 21:46:51 2019] out_of_memory+0x1a5/0x430
[Sat Dec 21 21:46:51 2019] __alloc_pages_slowpath+0xbd8/0xcb0
[Sat Dec 21 21:46:51 2019] __alloc_pages_nodemask+0x28b/0x2b0
[Sat Dec 21 21:46:51 2019] filemap_fault+0x3bd/0x780
[Sat Dec 21 21:46:51 2019] ? page_add_file_rmap+0x109/0x210
[Sat Dec 21 21:46:51 2019] ? alloc_set_pte+0xf2/0x560
[Sat Dec 21 21:46:51 2019] ? _cond_resched+0x15/0x30
[Sat Dec 21 21:46:51 2019] __xfs_filemap_fault+0x6d/0x200 [xfs]
[Sat Dec 21 21:46:51 2019] __do_fault+0x36/0x130
[Sat Dec 21 21:46:51 2019] __handle_mm_fault+0xe6c/0x1270
[Sat Dec 21 21:46:51 2019] handle_mm_fault+0xd6/0x200
[Sat Dec 21 21:46:51 2019] __do_page_fault+0x249/0x4f0
[Sat Dec 21 21:46:51 2019] ? async_page_fault+0x8/0x30
[Sat Dec 21 21:46:51 2019] async_page_fault+0x1e/0x30
[Sat Dec 21 21:46:51 2019] RIP: 0033:0x53cf39
[Sat Dec 21 21:46:51 2019] Code: Bad RIP value.
[Sat Dec 21 21:46:51 2019] RSP: 002b:00007fff6f061790 EFLAGS: 00010202
[Sat Dec 21 21:46:51 2019] RAX: 000000000079c910 RBX: 0000000000000002 RCX: 0000000000000001
[Sat Dec 21 21:46:51 2019] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00007fa53de76db8
[Sat Dec 21 21:46:51 2019] RBP: 00000000008a5780 R08: 00000000008a5780 R09: 0000000000000002
[Sat Dec 21 21:46:51 2019] R10: 00007fa56b2d0be0 R11: 0000000000000022 R12: 0000000000000001
[Sat Dec 21 21:46:51 2019] R13: 0000000000000002 R14: 0000000000598e40 R15: 00007fff6f061f28
[Sat Dec 21 21:46:51 2019] Mem-Info:
[Sat Dec 21 21:46:51 2019] active_anon:959454 inactive_anon:1978 isolated_anon:0
active_file:14 inactive_file:42 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
slab_reclaimable:7449 slab_unreclaimable:9459
mapped:35816 shmem:36953 pagetables:3418 bounce:0
free:21726 free_pcp:62 free_cma:0
[Sat Dec 21 21:46:51 2019] Node 0 active_anon:3837816kB inactive_anon:7912kB active_file:56kB inactive_file:268kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:143364kB dirty:0kB writeback:0kB shmem:147812kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 53248kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
[Sat Dec 21 21:46:51 2019] Node 0 DMA free:15840kB min:268kB low:332kB high:396kB active_anon:60kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Sat Dec 21 21:46:51 2019] lowmem_reserve[]: 0 2937 3896 3896 3896
[Sat Dec 21 21:46:51 2019] Node 0 DMA32 free:54128kB min:50748kB low:63432kB high:76116kB active_anon:2964716kB inactive_anon:16kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129200kB managed:3043412kB mlocked:0kB kernel_stack:464kB pagetables:8480kB bounce:0kB free_pcp:104kB local_pcp:0kB free_cma:0kB
[Sat Dec 21 21:46:51 2019] lowmem_reserve[]: 0 0 959 959 959
[Sat Dec 21 21:46:51 2019] Node 0 Normal free:16432kB min:16564kB low:20704kB high:24844kB active_anon:873040kB inactive_anon:7896kB active_file:288kB inactive_file:0kB unevictable:0kB writepending:0kB present:1048576kB managed:982132kB mlocked:0kB kernel_stack:1824kB pagetables:5192kB bounce:0kB free_pcp:136kB local_pcp:0kB free_cma:0kB
[Sat Dec 21 21:46:51 2019] lowmem_reserve[]: 0 0 0 0 0
[Sat Dec 21 21:46:51 2019] Node 0 DMA: 2*4kB (UM) 1*8kB (U) 1*16kB (U) 0*32kB 3*64kB (UM) 2*128kB (UM) 2*256kB (UM) 1*512kB (M) 2*1024kB (UM) 0*2048kB 3*4096kB (M) = 15840kB
[Sat Dec 21 21:46:51 2019] Node 0 DMA32: 564*4kB (UME) 444*8kB (UME) 103*16kB (UME) 74*32kB (UE) 124*64kB (UME) 54*128kB (UE) 36*256kB (UME) 19*512kB (UME) 7*1024kB (UME) 0*2048kB 1*4096kB (M) = 54880kB
[Sat Dec 21 21:46:51 2019] Node 0 Normal: 1307*4kB (UME) 781*8kB (UME) 238*16kB (UME) 47*32kB (UME) 6*64kB (ME) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 17172kB
[Sat Dec 21 21:46:51 2019] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Sat Dec 21 21:46:51 2019] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Sat Dec 21 21:46:51 2019] 37022 total pagecache pages
[Sat Dec 21 21:46:51 2019] 0 pages in swap cache
[Sat Dec 21 21:46:51 2019] Swap cache stats: add 0, delete 0, find 0/0
[Sat Dec 21 21:46:51 2019] Free swap = 0kB
[Sat Dec 21 21:46:51 2019] Total swap = 0kB
[Sat Dec 21 21:46:51 2019] 1048442 pages RAM
[Sat Dec 21 21:46:51 2019] 0 pages HighMem/MovableOnly
[Sat Dec 21 21:46:51 2019] 38079 pages reserved
[Sat Dec 21 21:46:51 2019] 0 pages hwpoisoned
[Sat Dec 21 21:46:51 2019] Tasks state (memory values in pages):
[Sat Dec 21 21:46:51 2019] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[Sat Dec 21 21:46:51 2019] [ 257] 0 257 6622 615 110592 0 0 systemd-journal
[Sat Dec 21 21:46:51 2019] [ 278] 0 278 5514 260 65536 0 -1000 systemd-udevd
[Sat Dec 21 21:46:51 2019] [ 372] 103 372 1706 107 53248 0 0 rpcbind
[Sat Dec 21 21:46:51 2019] [ 374] 100 374 23270 232 86016 0 0 systemd-timesyn
[Sat Dec 21 21:46:51 2019] [ 384] 0 384 4879 246 73728 0 0 systemd-logind
[Sat Dec 21 21:46:51 2019] [ 385] 0 385 56456 291 81920 0 0 rsyslogd
[Sat Dec 21 21:46:51 2019] [ 388] 105 388 2243 139 57344 0 -900 dbus-daemon
[Sat Dec 21 21:46:51 2019] [ 390] 0 390 581 17 40960 0 0 acpid
[Sat Dec 21 21:46:51 2019] [ 391] 0 391 2126 66 57344 0 0 cron
[Sat Dec 21 21:46:51 2019] [ 396] 0 396 20540 92 57344 0 0 irqbalance
[Sat Dec 21 21:46:51 2019] [ 403] 0 403 1372 44 53248 0 0 atd
[Sat Dec 21 21:46:51 2019] [ 627] 0 627 1403 29 45056 0 0 agetty
[Sat Dec 21 21:46:51 2019] [ 633] 0 633 3963 218 73728 0 -1000 sshd
[Sat Dec 21 21:46:51 2019] [ 646] 111 646 53362 3130 151552 0 -900 postgres
[Sat Dec 21 21:46:51 2019] [ 648] 0 648 30525 533 118784 0 0 nginx
[Sat Dec 21 21:46:51 2019] [ 649] 33 649 30664 742 126976 0 0 nginx
[Sat Dec 21 21:46:51 2019] [ 650] 33 650 30596 584 126976 0 0 nginx
[Sat Dec 21 21:46:51 2019] [ 652] 111 652 53416 7486 339968 0 0 postgres
[Sat Dec 21 21:46:51 2019] [ 653] 111 653 53395 33117 405504 0 0 postgres
[Sat Dec 21 21:46:51 2019] [ 654] 111 654 53362 1578 131072 0 0 postgres
[Sat Dec 21 21:46:51 2019] [ 655] 111 655 53496 1215 188416 0 0 postgres
[Sat Dec 21 21:46:51 2019] [ 656] 111 656 17198 542 114688 0 0 postgres
[Sat Dec 21 21:46:51 2019] [ 657] 111 657 53462 707 135168 0 0 postgres
[Sat Dec 21 21:46:51 2019] [ 764] 108 764 6988 2348 98304 0 0 lua5.2
[Sat Dec 21 21:46:51 2019] [ 1013] 104 1013 5013 205 77824 0 0 exim4
[Sat Dec 21 21:46:51 2019] [ 1431] 0 1431 4228 289 73728 0 0 sshd
[Sat Dec 21 21:46:51 2019] [ 1434] 0 1434 5254 302 73728 0 0 systemd
[Sat Dec 21 21:46:51 2019] [ 1435] 0 1435 5740 588 81920 0 0 (sd-pam)
[Sat Dec 21 21:46:51 2019] [ 1448] 0 1448 2283 671 57344 0 0 bash
[Sat Dec 21 21:46:51 2019] [ 1453] 0 1453 2103 273 57344 0 0 htop
[Sat Dec 21 21:46:51 2019] [ 1604] 0 1604 4228 296 73728 0 0 sshd
[Sat Dec 21 21:46:51 2019] [ 1610] 0 1610 2430 732 57344 0 0 bash
[Sat Dec 21 21:46:51 2019] [ 3290] 0 3290 4228 285 69632 0 0 sshd
[Sat Dec 21 21:46:51 2019] [ 3296] 0 3296 2316 713 49152 0 0 bash
[Sat Dec 21 21:46:51 2019] [ 3476] 0 3476 4368 798 69632 0 0 vi
[Sat Dec 21 21:46:51 2019] [ 3757] 110 3757 1120978 905539 7499776 0 0 python
[Sat Dec 21 21:46:51 2019] [ 3850] 111 3850 53924 19314 405504 0 0 postgres
[Sat Dec 21 21:46:51 2019] [ 3860] 111 3860 54039 15842 376832 0 0 postgres
[Sat Dec 21 21:46:51 2019] [ 3868] 111 3868 53871 15040 380928 0 0 postgres
[Sat Dec 21 21:46:51 2019] [ 3878] 111 3878 54022 17647 389120 0 0 postgres
[Sat Dec 21 21:46:51 2019] [ 3885] 111 3885 53850 14661 372736 0 0 postgres
[Sat Dec 21 21:46:51 2019] [ 3889] 111 3889 54058 19481 401408 0 0 postgres
[Sat Dec 21 21:46:51 2019] [ 3890] 111 3890 54064 16440 385024 0 0 postgres
[Sat Dec 21 21:46:51 2019] [ 3892] 111 3892 54157 17175 385024 0 0 postgres
[Sat Dec 21 21:46:51 2019] [ 3895] 111 3895 53927 24754 409600 0 0 postgres
[Sat Dec 21 21:46:51 2019] [ 3897] 111 3897 54120 35123 409600 0 0 postgres
[Sat Dec 21 21:46:51 2019] Out of memory: Kill process 3757 (python) score 898 or sacrifice child
[Sat Dec 21 21:46:51 2019] Killed process 3757 (python) total-vm:4483912kB, anon-rss:3622156kB, file-rss:0kB, shmem-rss:0kB
[Sat Dec 21 21:46:51 2019] oom_reaper: reaped process 3757 (python), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[1]
Start-Date: 2019-12-19 17:41:49
Commandline: apt upgrade
Upgrade: git-man:amd64 (1:2.11.0-3+deb9u4, 1:2.11.0-3+deb9u5), git:amd64 (1:2.11.0-3+deb9u4, 1:2.11.0-3+deb9u5), matrix-synapse-py3:amd64 (1.6.1+stretch1, 1.7.1+stretch1)
End-Date: 2019-12-19 17:42:50
Start-Date: 2019-12-21 02:40:35
Commandline: apt upgrade
Upgrade: libsasl2-modules-db:amd64 (2.1.27~101-g0780600+dfsg-3, 2.1.27~101-g0780600+dfsg-3+deb9u1), libsasl2-2:amd64 (2.1.27~101-g0780600+dfsg-3, 2.1.27~101-g0780600+dfsg-3+deb9u1), matrix-synapse-py3:amd64 (1.7.1+stretch1, 1.7.2+stretch1), libsasl2-modules:amd64 (2.1.27~101-g0780600+dfsg-3, 2.1.27~101-g0780600+dfsg-3+deb9u1)
End-Date: 2019-12-21 02:41:17
(from /var/log/apt/var/log/history.log)
[2]
https://github.com/matrix-org/synapse/pull/6576/files
which in debian packaging is here: /opt/venvs/matrix-synapse/lib/python3.5/site-packages/synapse/handlers/federation.py
[3]
#certificates
tls_certificate_path: "/etc/letsencrypt/live/example.com/fullchain.pem"
tls_private_key_path: "/etc/letsencrypt/live/example.com/privkey.pem"
# overridden: default is sqlite
database:
name: psycopg2
args:
user: synapse_user
password: synapse_user
database: synapse
host: localhost
cp_min: 5
cp_max: 10
password_providers:
- module: "ldap_auth_provider.LdapAuthProvider"
config:
enabled: true
uri: "ldaps://ldap.example.com"
start_tls: true
base: "o=webusers,dc=example,dc=com"
attributes:
uid: "uid"
mail: "mail"
name: "uid"
# overridden: default is false
allow_guest_access: True
# reverse proxy -> https://github.com/matrix-org/synapse#using-a-reverse-proxy-with-synapse
listeners:
# Main HTTPS listener
# For when matrix traffic is sent directly to synapse.
-
# The port to listen for HTTPS requests on.
port: 8448
# Local interface to listen on.
# The empty string will cause synapse to listen on all interfaces.
#bind_address: ''
# include IPv6 -> src https://github.com/matrix-org/synapse/issues/1886
bind_address: '::'
# This is a 'http' listener, allows us to specify 'resources'.
type: http
tls: true
# Use the X-Forwarded-For (XFF) header as the client IP and not the
# actual client IP.
x_forwarded: false
# List of HTTP resources to serve on this listener.
resources:
-
# List of resources to host on this listener.
names:
- client # The client-server APIs, both v1 and v2
# Should synapse compress HTTP responses to clients that support it?
# This should be disabled if running synapse behind a load balancer
# that can do automatic compression.
compress: true
- names: [federation] # Federation APIs
compress: false
# Unsecure HTTP listener,
# For when matrix traffic passes through loadbalancer that unwraps TLS.
- port: 8008
tls: false
bind_address: '127.0.0.1'
type: http
#x_forwarded: true
x_forwarded: false
resources:
- names: [client]
compress: true
- names: [federation]
compress: false
# Enable communities
enable_group_creation: True
here is a long run (1.5 year) matrix instance running official matrix repos
(other times I saw
postgres: 11/main: synapse_user synapse ::1(38138) SELECT)and here is in the dmesg (starts with the end of a OOM and starts with one of them, then you have an idea of the time between two) [0]
This started when I upgraded to 1.7.2 in 2019-12-21 02:40:35 [1], but the first memory leak started at 2019-12-21 12:29:04; then I decided to upgrade from stretch (debian oldstable) to buster (debian stable).
Then I downgraded to 1.7.1 and I still (?) have the memory leak
Then I downgraded to 1.7.0 and the memory leak continues,
but in a slower timesame timing, first oom requires a little more time, but then again as usualMy configuration overrides the default one (uses postgresql and ldap provider) [3]
Thanks Aaron and realitygaps in the synapse matrix channel for their initial support which was crucial to start this bug report
[0]
[1]
(from
/var/log/apt/var/log/history.log)[2]
https://github.com/matrix-org/synapse/pull/6576/files
which in debian packaging is here: /opt/venvs/matrix-synapse/lib/python3.5/site-packages/synapse/handlers/federation.py
[3]