Hang Detection Policy Example

I wanted to try out the hang detection policy, so I created a sample application that deliberately hangs a thread and then used the WAS tools to debug the issue. I followed these steps:

  • Create a sample HangDetectionServlet as shown in the listing

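    import java.io.IOException;
    import java.text.DateFormat;
    import java.text.SimpleDateFormat;
    import java.util.Date;

    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class HangDetectionServlet extends HttpServlet {
        private static final long serialVersionUID = 1L;

        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            DateFormat d = new SimpleDateFormat("mm:ss:SS");
            response.getWriter().println("Entering HangDetectionServlet.doGet() " + d.format(new Date()));
            try {
                // Hold the request thread for 250 seconds to simulate a hang.
                Thread.sleep(250000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            response.getWriter().println("Exiting HangDetectionServlet.doGet() " + d.format(new Date()));
        }
    }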

    This servlet is very simple: when it receives an HTTP GET request, it puts the current thread to sleep for 250 seconds.

  • By default, WAS marks a thread as hung if it has been running for more than 10 minutes, so the thread running our HangDetectionServlet would not be flagged. In the next step we will change the hang detection policy on WAS so that a thread running for more than 60 seconds (1 minute) is considered hung.

  • You can set the hang detection policy using the WAS Admin Console. First log in to the admin console and go to Servers > Application Servers > server_name. Then, under Server Infrastructure, go to Administration > Custom Properties.


  • Set the hang detection properties as follows:

    Setting com.ibm.websphere.threadmonitor.threshold to 60 means that a thread running for more than 60 seconds is considered hung. Setting com.ibm.websphere.threadmonitor.dump.java to true means that when the application server detects a hung thread, it generates a thread dump in addition to writing a message to SystemOut.log. Setting com.ibm.websphere.threadmonitor.interval to 60 means that the thread monitor runs every 60 seconds to check for hung threads. After setting these values, restart the server for the changes to take effect.
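
To make the interplay of threshold and interval concrete, here is a minimal sketch of a periodic monitor pass. It is not the WebSphere ThreadMonitor implementation; the class ThreadMonitorSketch and its methods are hypothetical names invented for illustration, and timestamps are passed in explicitly so the behavior is easy to follow.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: NOT how WebSphere implements its thread monitor.
final class ThreadMonitorSketch {
    private final long thresholdMillis;
    // Thread name -> time (ms) at which the thread started its current request.
    private final Map<String, Long> activeSince = new LinkedHashMap<>();

    ThreadMonitorSketch(long thresholdSeconds) {
        this.thresholdMillis = thresholdSeconds * 1000L;
    }

    void requestStarted(String threadName, long nowMillis) {
        activeSince.put(threadName, nowMillis);
    }

    void requestFinished(String threadName) {
        activeSince.remove(threadName);
    }

    // One monitor pass; a real monitor would repeat this every "interval" seconds.
    List<String> check(long nowMillis) {
        List<String> suspects = new ArrayList<>();
        for (Map.Entry<String, Long> entry : activeSince.entrySet()) {
            if (nowMillis - entry.getValue() > thresholdMillis) {
                suspects.add(entry.getKey());
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        ThreadMonitorSketch monitor = new ThreadMonitorSketch(60);
        monitor.requestStarted("WebContainer : 0", 0L);
        System.out.println(monitor.check(60_000L));  // exactly at the threshold: not flagged
        System.out.println(monitor.check(120_000L)); // flagged on the following pass
    }
}
```

In this model a thread is only reported on the first pass after it crosses the threshold, so with both values set to 60 the reported active time can fall anywhere between 60 and 120 seconds.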

  • Now deploy the HangDetectionServlet on your server and access it. After a couple of minutes I could see this message in SystemOut.log:

    [7/3/09 13:33:56:375 PDT] 00000019 ThreadMonitor W WSVR0605W: Thread "WebContainer : 0" (00000023) has been active for 66156 milliseconds and may be hung. There is/are 1 thread(s) in total in the server that may be hung.

    This message shows that thread "WebContainer : 0" (00000023) may be hung, so let's look at what is causing it to hang.
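
If you want to pick these warnings out of SystemOut.log automatically, a small regular expression is enough. The following is a minimal sketch that assumes the message layout matches the sample line above; HungThreadLogParser is a hypothetical helper, not part of WAS.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the pattern is derived only from the sample line shown above.
final class HungThreadLogParser {
    private static final Pattern WSVR0605W = Pattern.compile(
            "WSVR0605W: Thread \"([^\"]+)\" \\(([0-9a-fA-F]+)\\) has been active for (\\d+) milliseconds");

    // Returns {threadName, threadId, activeMillis}, or null if the line is not a WSVR0605W warning.
    static String[] parse(String line) {
        Matcher m = WSVR0605W.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String line = "[7/3/09 13:33:56:375 PDT] 00000019 ThreadMonitor W WSVR0605W: "
                + "Thread \"WebContainer : 0\" (00000023) has been active for 66156 milliseconds "
                + "and may be hung. There is/are 1 thread(s) in total in the server that may be hung.";
        String[] fields = parse(line);
        System.out.println(fields[0] + " id=" + fields[1] + " activeMs=" + fields[2]);
    }
}
```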

  • Recall that we configured the hang detection policy to generate a thread dump when a thread is hung. So let's check the profiles\AppSrv01\logs\server1\native_stderr.log file to find out whether the thread dump was generated and, if so, where it was written.

    ************* End Display Current Environment *************
    JVMDUMP007I JVM Requesting Java Dump using 'C:\Cert\WebSphere\AppServer\profiles\AppSrv01\javacore.20090702.231210.3220.0001.txt'
    JVMDUMP010I Java Dump written to C:\Cert\WebSphere\AppServer\profiles\AppSrv01\javacore.20090702.231210.3220.0001.txt

    The native_stderr.log file gives the location of the javacore file.
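
The same lookup can be scripted. This is a minimal sketch that scans log lines for the JVMDUMP010I message shown above and collects the javacore paths; JavacoreLocator is a hypothetical name, and the pattern assumes the message wording matches the excerpt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; assumes the JVMDUMP010I wording seen in the excerpt above.
final class JavacoreLocator {
    private static final Pattern WRITTEN =
            Pattern.compile("JVMDUMP010I Java Dump written to (.+)");

    // Returns the path of every javacore reported as written, in log order.
    static List<String> javacorePaths(List<String> logLines) {
        List<String> paths = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = WRITTEN.matcher(line);
            if (m.find()) {
                paths.add(m.group(1).trim());
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "JVMDUMP007I JVM Requesting Java Dump using 'C:\\Cert\\WebSphere\\AppServer\\profiles\\AppSrv01\\javacore.20090702.231210.3220.0001.txt'",
                "JVMDUMP010I Java Dump written to C:\\Cert\\WebSphere\\AppServer\\profiles\\AppSrv01\\javacore.20090702.231210.3220.0001.txt");
        for (String path : javacorePaths(lines)) {
            System.out.println(path);
        }
    }
}
```

In practice the lines would come from native_stderr.log, for example via java.nio.file.Files.readAllLines.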

  • Open the javacore.20090702.231210.3220.0001.txt file in the Thread Dump Analyzer, which is part of the IBM Support Assistant, and look at the stack trace of the "WebContainer : 0" (00000023) thread.


    The stack trace shows that thread "WebContainer : 0" (00000023) is executing the HangDetectionServlet.doGet() method, which in turn is executing Thread.sleep(), so now we know what is causing the thread to hang.
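
You can reproduce the same kind of diagnosis outside WAS with plain Java. The sketch below is only an illustration under stated assumptions: SleepingThreadInspector and simulateHang are hypothetical names, and the thread name merely mimics the one in the log. It starts a thread that sleeps like the servlet does, waits until the thread is in the TIMED_WAITING state, and prints its stack frames.

```java
// Hypothetical stand-alone demo; it does not read a javacore file.
final class SleepingThreadInspector {

    // Stands in for HangDetectionServlet.doGet(): blocks the thread in Thread.sleep().
    static void simulateHang() {
        try {
            Thread.sleep(250_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Starts a sleeping thread, captures its stack trace, then interrupts it.
    static StackTraceElement[] captureSleepingStack() throws InterruptedException {
        Thread worker = new Thread(SleepingThreadInspector::simulateHang, "WebContainer : 0");
        worker.setDaemon(true);
        worker.start();
        // Poll for up to 5 seconds until the worker is actually sleeping.
        long deadline = System.currentTimeMillis() + 5_000;
        while (worker.getState() != Thread.State.TIMED_WAITING
                && System.currentTimeMillis() < deadline) {
            Thread.sleep(10);
        }
        StackTraceElement[] frames = worker.getStackTrace();
        worker.interrupt();
        return frames;
    }

    public static void main(String[] args) throws InterruptedException {
        for (StackTraceElement frame : captureSleepingStack()) {
            System.out.println("    at " + frame);
        }
    }
}
```

The printed frames should include a sleep call in java.lang.Thread above simulateHang, which is the same pattern to look for in the javacore. The exact frame names inside java.lang.Thread vary between JDK versions.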

1 comment:

Unknown said...

Can you please provide at least one hung thread file to analyze with the IBM Thread Analyzer tool?