Determining What Process Isn't Cleaning Semaphores

by Aaron Chamberlain   Last Updated July 11, 2019 17:00 PM

I was having the common issue of not being able to restart httpd because of a high number of locked semaphores. After clearing those and restarting httpd, I also doubled the semaphore limits.

However, I'm led to believe some Apache process has a leak and isn't releasing them, so I'll still run into the issue, just later.

Example ipcs -s:


------ Semaphore Arrays --------
key        semid      owner      perms      nsems     
0x00000000 13467648   apache     600        1         
0x00000000 13697025   apache     600        1         
0x00000000 13729794   apache     600        1         
0x00000000 13762563   apache     600        1         
0x00000000 13795332   apache     600        1         
0x00000000 14057477   apache     600        1         
0x00000000 14123014   apache     600        1         
0x00000000 14155783   apache     600        1         
0x00000000 14188552   apache     600        1         
0x00000000 14221321   apache     600        1         
0x00000000 14254090   apache     600        1         

So let's track down which process was managing those, using ipcs -s -i 13697025:

Semaphore Array semid=13697025
uid=48   gid=48  cuid=0  cgid=0
mode=0600, access_perms=0600
nsems = 1
otime = Not set                   
ctime = Thu Jul 11 03:41:01 2019  
semnum     value      ncount     zcount     pid       
0          1          0          0          18395

And finally, what corresponds to that pid, using ps --pid 18395:

  PID TTY          TIME CMD

So am I reading that right: the second semaphore in the list belongs to a process that already died and didn't clean up?

By contrast, running through the same steps with the second-to-last semaphore in the list showed that it actually did belong to a running process:

  PID TTY          TIME CMD
22331 ?        00:03:42 httpd
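
The manual check above (list the semids, look up the pid for each, then see whether that pid is still running) could be scripted. Below is a minimal sketch in Python, assuming the ipcs output has the same layout as the samples in this post. The function names are hypothetical. Note that the pid column is, as far as I understand it, the last process to operate on the semaphore, which is not necessarily the one that created it, so the result is only a hint.

```python
import os
import subprocess


def parse_semids(ipcs_s_output):
    """Return the semid column from `ipcs -s` output as a list of ints."""
    semids = []
    for line in ipcs_s_output.splitlines():
        fields = line.split()
        # Data rows start with a hex key such as 0x00000000.
        if len(fields) >= 5 and fields[0].startswith("0x"):
            semids.append(int(fields[1]))
    return semids


def parse_pids(ipcs_detail_output):
    """Return the pid column from `ipcs -s -i <semid>` output."""
    pids = []
    in_table = False
    for line in ipcs_detail_output.splitlines():
        fields = line.split()
        if fields[:1] == ["semnum"]:
            in_table = True  # rows after this header are per-semaphore
            continue
        if in_table and len(fields) == 5 and all(f.isdigit() for f in fields):
            pids.append(int(fields[4]))
    return pids


def pid_alive(pid):
    """Return True if a process with this pid currently exists."""
    if pid <= 0:
        return False  # 0 means no operation was ever recorded
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def run(cmd):
    """Run a command and return its stdout as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def find_orphans():
    """Return (semid, pids) pairs where none of the recorded pids is running."""
    orphans = []
    for semid in parse_semids(run(["ipcs", "-s"])):
        pids = parse_pids(run(["ipcs", "-s", "-i", str(semid)]))
        if pids and not any(pid_alive(p) for p in pids):
            orphans.append((semid, pids))
    return orphans
```

Calling find_orphans() as root (or as a user allowed to read the apache-owned arrays) should return pairs such as semid 13697025 with pid 18395 from the example above, which can then be matched against ctime and the Apache logs.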

It's obvious these are owned by Apache, but what would be the best way of debugging what's causing this? We have 50+ virtual hosts. Is there a way to log an actual server request and trace it back to the process that was spawned to handle it?

I have reason to believe it's the php-sqlsrv.x86_64 extension or something related to it, because the issue started happening around the time that extension was configured. I have the ability to roll back, but I want to learn how to go deep into debugging this, perhaps even to submit a patch for the bug.


