Thursday, November 12, 2009

Load Test

Past few days i encountered one serious production issue, which the web service timeout before the backend process sent back the message to the SOAP response node in WMB flow. This was due to the high load in production that caused the delay in the backend process.

The whole process is: WMB flow extracts SOAP request message, turns it into a MQ Header message, puts the message to Java adaptor's request queue, then Java adaptor processes the content and puts a response message to the response queue, then WMB flow will read the message in the response queue and will turn it into a SOAP response.

In the progress of investigation, we decided to move one database query out from the java adaptor program to the WMB flow. The reason being is sometimes the db connection will just drop and the java adaptor program has to spend some time reinitiating the db connection. In fact according to the time recorded in the adaptor's log the process was still quite fast, it's even within miliseconds. Well we just gave it a try.

After making all the necessary changes we deployed the works to the testing environment. I then performed some load tests to the web service. The result was negative. The tps was around 3.5 but the average time taken was > 10 seconds and we had to achieve < 10 seconds! So still plenty of work to do :(

We performed a few cycles of investigation, trial and error & testing until we found the solution - increasing the WMB flow instance. This is a setting in the WMB Toolkit. So we increased the number of instances to 3. This time we managed to achieve an average time of 3 seconds! We were so happy and excited!

What about the cause? See below:

There were many messages piling at the response queue which is supposed to be read by the WMB flow. These messages were sent in by the Java Adaptor after it finished processing the data. Apparently these messages were not picked up by the WMB flow in time hence causing the timeout error. By increasing the WMB flow instance, concurrently there are 3 identical WMB flows reading the content of this queue, hence problem resolved!

I was really relieved!

Thursday, September 10, 2009

Classloading in Websphere Administrative Console


I have this web service server application which has been running fine at a Websphere Process Server. One day, my application's log was writing to another application B's log. That is really weird. I checked and found out that this B was recently deployed to the server only. The cause for my application to write to B's log is due to B's log4j jar file was put in the server's common directory ! If we do not change the classloading policy, which is default to PARENT_FIRST, then Wepshere's classloader will load that common directory first only will load the jars in our application's WEB-INF/lib directory.

Hence i logged on to Websphere Administrative Console, went to my application's war file and changed the classloader policy to PARENT_LAST. By doing this Websphere will find and load the neccessary classes from your web module's WEB-INF/lib(web module classloader) first, also you must ensure that the jars you put under your application's WEB-INF/lib folder do not clash with the one at server's common directory, else you will get a java.lang.LinkageError exception due to classloader is trying to load duplicate class.

Friday, July 24, 2009

NullPointerException in SecureRandom.nextBytes

 

I was facing this issue where a web service client which had always been running fine suddenly stopped functioning properly due to an error NullPointerException in SecureRandom.nextBytes

AxisFault
faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.userException
faultSubcode:
faultString: java.lang.NullPointerException
faultActor:
faultNode:
faultDetail:
{http://xml.apache.org/axis/}stackTrace:java.lang.NullPointerException
at java.security.SecureRandom.nextBytes(SecureRandom.java:406)
at org.apache.axis.utils.SessionUtils.generateSessionId(SessionUtils.java:62)
As you can see the exception was thrown when some class in Axis is calling the nextBytes function from SecureRandom Class. I spent quite some time in this and i think that is a bug for IBM version of JRE 1.4.2. If you want to know more, read here.
Therefore i changed the script which starts my web service client by using the java command from Java 5 instead of java 1.4.2 and finally it worked!

Saturday, May 30, 2009

Busy with Web Services on WPS

For the past two weeks i had been very busy with building web services using java 1.4. Bad thing was i had to deploy them on Websphere Process Server 6.0,

I had no problem building the web services using Rational Software Architect(RSA). I tested my work by deploying to Websphere Application Server 6.1 runtime that comes together with the RSA. Everything was working fine.

However i started to have a lot of headaches while trying to deploy these web services by using Websphere Admin Consle to Websphere Process Server 6.0.2 which is running on a remote server. My web service application just could not start!

After trying out several solutions, in the end i migrated my source codes to Websphere Integration Developer 6.0(WID). I regenerated every web service related deployment configuration files and proxy classes by using WID.

This time my web service application started successfully! Well the next problem was it was unable to get database connection! It took me quite some time to figure out what's wrong with it and finally i realized that i had to get the JDBC connection from the WPS itself instead of creating my own OracleDataSource cache.

I created all JDBC connection that the web service applications needs by using the Websphere Admin Console. I also changed my code to get the JDBC connection via JNDI.
The method is simple. Just create a default initialContext, then within the context, get your JDBC connection by name like "JDBC/MyDB" will do.

Saturday, March 21, 2009

Run java program at backend on Windows

Normally, if you want to start a Java program to run as back end process, just type the command follow by a '&'. However, how do you do the same on Windows?

Yesterday my colleague was facing this problem and i remembered i have gone through the same problem before in my previous company. After we went through the windows command list i found something, which is the 'start' command. Then i remembered i might have used this command before when i had faced the same problem last time.

So we tried the 'start' command and it was successful.

Friday, March 20, 2009

Why IT Project Fails!

I have come across an article talking about why IT project fails!

Take a look on it!

My experience told me that business users often do not want to spend much time with the developers during the development of a system. They do not realize that it is very very critical and useful to the final result of the system by spending time reviewing developers work to find out whether something is missing at the early phase of the project! Well i do agree that agile methodology approach is very very good provided the business users are willing to allocate resource to examine the system.

Friday, March 6, 2009

Working on wrong direction

One day the client raised an issue to me. That issue would not happen if the Websphere Message Broker (WMB) flow that i deployed on the server is running fine.

I checked the queue and channel status by using a tool called MQJ explorer and found out both the sender and receiver channels' status were 'inactive'. I then turned it to 'running' again. However their status will still be 'inactive' after some time. I had gone mad about this and spent a whole day trying to find out what went wrong.


In the end i know i went into the wrong direction. It is normal that after some time the status of the channels go to 'inactive'. As long as the transmission trigger is on then there will be no problem. When the next message comes in then the sender and receiver channels will be activated again.

So what actually went wrong. Argh it is because the portal team did not set the correct queue name to the JMS header of the XML message and thus no response from the whole flow!