JUnit 5 tags, Maven Failsafe plugin and system integration tests

Automated system integration tests are useful for testing user acceptance criteria as a whole. In addition, we also use the tests to verify environment set-up, such as file, database and external web service access. We run these tests on production systems to verify deployments. Some of the automated tests in our integration test suites are not safe to run on production. This blog post explains how I use the JUnit 5 @Tag annotation and the Failsafe plugin to separate system integration tests into those that can run in all environments and those that must not run on production.

JUnit 5 allows test classes and methods to be tagged via the @Tag annotation. The tags can then be used to filter test discovery and execution. I used the tag NonProductionOnly for test classes that should not be run in a production environment.

@Tag("NonProductionOnly")
public class PlaceOrderTestsIT {
   ....
}

JUnit tests that are safe to run in all environments are not annotated.
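
For illustration, an untagged test class might look like the minimal sketch below (the class name and test body are hypothetical); with no @Tag, it is never excluded by the filtering described later.

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertTrue;

// No @Tag annotation: this class runs in every environment, including production.
public class CheckOrderStatusTestsIT {

    @Test
    void readOnlyCheckIsSafeEverywhere() {
        // hypothetical read-only assertion; a real test would query the system under test
        assertTrue(true);
    }
}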

The Failsafe Plugin is designed to run integration tests, while the Surefire Plugin is designed to run unit tests. It decouples failing the build on test failures from actually running the integration tests. For integration tests to be picked up by the Failsafe plugin, end the test class names with IT. To filter execution with JUnit 5 @Tag annotations, the plugin can be configured in the project’s pom.xml.

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-failsafe-plugin</artifactId>
      <version>2.22.2</version>
      <executions>
        <execution>
          <goals>
            <goal>integration-test</goal>
            <goal>verify</goal>
          </goals>
        </execution>
      </executions>
      <configuration>
        <excludedGroups>${failsure.excludedGroups}</excludedGroups>
      </configuration>
    </plugin>
  </plugins> 
</build>

This configures the plugin to exclude groups defined in the pom property ${failsure.excludedGroups}. We set the property up using Maven profiles. The tag NothingToExclude does not correspond to any @Tag in the test suite. It is needed in the pom because the property cannot be left empty.

<profiles>
  <profile>
    <id>uat</id>
    <properties>
      <failsure.excludedGroups>NothingToExclude</failsure.excludedGroups>
    </properties>
  </profile>
  <profile>
    <id>production</id>
    <properties>
      <failsure.excludedGroups>NonProductionOnly</failsure.excludedGroups>
    </properties>
  </profile>
</profiles>

The appropriate profile is passed in from the Jenkins configuration for each environment.
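
For illustration, each Jenkins job could then invoke Maven with the matching profile; the commands below are a sketch using the profile ids defined above.

# UAT job: nothing meaningful is excluded, so every integration test runs
mvn clean verify -P uat

# production job: tests tagged NonProductionOnly are skipped
mvn clean verify -P production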

Combining multiple git repositories into subdirectories of a single repository

In a few projects I worked on, the legacy code base was split into multiple git repositories even though they were interdependent. What I mean is that to compile the microservice deployable, you need to build the libraries contained in these multiple git repositories in a specific sequence. In addition, these libraries are not used anywhere except in the microservice. The worst culprit I encountered was a microservice that was split into six different git repositories: five Java libraries and one Java WAR file. It makes a developer’s life much easier if these six repositories are restructured as a single Maven multi-module project in a single git repository.

Obviously, you can just copy all the code to one directory and commit that. However, git allows you to copy the content from one repository into another, preserving the commit history at the same time. No one wants to lose all those change histories, right?

As an example, the code base for project x is split into three repositories called project-x-client, project-x-core and project-x-ws. You want to combine them all in project-x-ws, with each individual repository in its own subdirectory. (Each will later be converted into a Maven module.) After the migration, you want the code from project-x-client to sit in a subdirectory called client under project-x-ws, project-x-core in core, and project-x-ws itself in webservice.

First, you need to move all the files in project-x-core into a subdirectory called core. By doing this, when you later copy the files from this repository, they will all be neatly located inside the core directory instead of under the root directory. We will use the master branch for the move.

cd /home/me/git-stuff/project-x-core
git checkout master
mkdir core
git mv src core
git mv README.md core
...
git commit -am "preparing project-x-core for migration"

You don’t need to push the change to remote. The copy can be done entirely using the local repository. To copy the project-x-core repository into project-x-ws:

cd /home/me/git-stuff/project-x-ws
git remote add r /home/me/git-stuff/project-x-core
git fetch r
git merge r/master --allow-unrelated-histories
git remote rm r

You have now pulled all the files and their commit histories into the existing repository for project-x-ws. Repeat this for project-x-client, and move project-x-ws's own files into the webservice subdirectory in the same way (the commands for project-x-client are sketched below). When you finish, push project-x-ws to remote to share the change with your team.
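
As a sketch, repeating the merge for project-x-client looks like this (assuming its files have already been moved into a client subdirectory on its master branch, just as was done for core):

cd /home/me/git-stuff/project-x-ws
git remote add r /home/me/git-stuff/project-x-client
git fetch r
git merge r/master --allow-unrelated-histories
git remote rm r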

Memory exhaustion, long garbage collection time and the Hibernate query plan cache

Recently, we released a new Spring Boot web service. It uses Hibernate for database queries and persistence (Spring Boot 2.1.10 and Hibernate 5.3.13). Within a week of release, Dynatrace alerted us to memory exhaustion and long garbage collection times on all the deployments. The app was started with the -XX:+HeapDumpOnOutOfMemoryError command line option, so heap dumps were generated automatically. While the heap dumps were analysed for the root cause of the memory problem, the apps were restarted with double the memory limits to keep things running.
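
For illustration, the relevant start-up options look something like this (the dump path and jar name are made-up examples):

java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/dumps -jar our-webservice.jar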

I loaded the heap dump into Eclipse Memory Analyzer and ran its Leak Suspects Report. The tool identified one problem suspect, which used 321.5MB out of 360MB of heap memory.

One instance of “org.hibernate.internal.SessionFactoryImpl” loaded by “org.springframework.boot.loader.LaunchedURLClassLoader @ 0xe0035958” occupies 337,113,320 (89.29%) bytes. The memory is accumulated in one instance of “org.hibernate.internal.util.collections.BoundedConcurrentHashMap$Segment[]” loaded by “org.springframework.boot.loader.LaunchedURLClassLoader @ 0xe0035958”.

When I looked at the ‘Accumulated Objects in the Dominator Tree’, I noticed a large number of BoundedConcurrentHashMap entries. Clicking on an individual hash map entry and using the ‘list objects with incoming references’ option, I could see that the segments of the hash map were query plan cache objects.

To look at what is stored inside the query plan cache, I used the OQL feature in the memory analyzer with the following query.

SELECT l.query.toString() 
FROM INSTANCEOF
org.hibernate.engine.query.spi.QueryPlanCache$HQLQueryPlanKey l

By inspecting the strings returned by the query, I spotted the problem. The Hibernate query plan cache was caching a large number of otherwise identical queries that differed only in the numeric primary key in the where clause, like this:

generatedAlias0.numberDetails as generatedAlias59 left join fetch
generatedAlias59.ddiRange6 as generatedAlias60 where generatedAlias0.id=500685

Reading the Hibernate documentation, I learned that by default Criteria queries use bind parameters only for literals that are not numeric. My application used only the numeric primary key for access, so Hibernate was caching a new plan for every individual row in the database. As a result, the application was performing worse than if it had no query plan cache at all!

To fix this, set the property hibernate.criteria.literal_handling_mode to BIND. This instructs Hibernate to use bind variables for any literal value. In Spring Boot, the property is spring.jpa.properties.hibernate.criteria.literal_handling_mode.
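
For a Spring Boot application this amounts to a single line in application.properties (or the equivalent YAML entry):

spring.jpa.properties.hibernate.criteria.literal_handling_mode=BIND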

To confirm that the query plan cache problem was fixed, I used jconsole to take another heap dump.

Running the OQL query again, I could see that the query plans had changed to use a bind parameter in the where clause.

generatedAlias0.numberDetails as generatedAlias59 left join fetch
generatedAlias59.ddiRange6 as generatedAlias60 where generatedAlias0.id=:param0

Indexed computed column for frequently used XML values

If you query an XML element or attribute value frequently, you can promote the value into a column and create an index on the promoted column. This makes the code easier to read and queries faster.

As an example, you have an XML column called requestXML in the table Orders. Within the requestXML is an element at the XPath /Order/accountId which your application queries frequently. If you try to create a computed column for the element value as follows, SQL Server returns the error message ‘Xml data type methods are not supported in computed column definitions. Create a scalar user-defined function to wrap the method invocation’.

ALTER TABLE dbo.Orders
  ADD accountId AS requestXML.value('(/Order/accountId)[1]', 'varchar(6)')

This is SQL Server’s way of saying you need to create a function:

create function fnGetAccountIdFromRequestXML(@request xml)
 returns varchar(6)
 with schemabinding
 begin
     declare @accountId varchar(6)
     select @accountId = @request.value('(/Order/accountId)[1]', 'varchar(6)')
     return @accountId 
 end

Don’t forget the ‘with schemabinding’ part if you want to create an index on the computed column. Now you are ready to create the computed column and then the index:

ALTER TABLE dbo.Orders
ADD accountId AS dbo.fnGetAccountIdFromRequestXML(requestXML);

CREATE NONCLUSTERED INDEX IX_Orders_accountId  ON  dbo.Orders(accountId)

The accountId element in requestXML has now been promoted to an indexed computed column. It can be used in queries just like any normal table column, as in the example below.
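
For example, a lookup by account id can now use the index instead of shredding the XML on every query (the account id value below is made up):

SELECT *
FROM dbo.Orders
WHERE accountId = 'ABC123'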

Computing the prefix from an alphanumeric varchar column in SQL Server

Recently, I needed to do some data analysis for a project I was working on. The order references are alphanumeric strings, with a variable length prefix that indicates the origin system. For example, order references AB123456789 and MOB98765 have prefixes AB and MOB.

I pulled the data into a SQL database table for analysis. I wanted to use the prefix in SQL group by and aggregate functions. To accomplish this, I created a computed column for the order prefix.

ALTER TABLE datatable
ADD order_ref_prefix 
AS (stuff(order_ref, patindex('%[0-9]%', order_ref), 10000, ''))

This computed column makes use of two SQL Server functions, patindex and stuff. Patindex returns the starting position of the first occurrence of a pattern in a string; in this example, it returns the position of the first digit in the order reference. Stuff replaces part of a string with another. Here, we replace the substring from the first digit in the order reference up to position 10000 with an empty string. This effectively truncates the order reference at the first digit found.
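
To make this concrete, here is what the two functions do to the example reference MOB98765 from earlier:

-- patindex finds the 1-based position of the first digit, which is 4
select patindex('%[0-9]%', 'MOB98765')   -- returns 4

-- stuff deletes up to 10000 characters from that position onwards, leaving the prefix
select stuff('MOB98765', 4, 10000, '')   -- returns 'MOB'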

Once I have the order prefix in its own column, I can use it in queries like these:

select order_ref_prefix, count(*) as occurrence
from datatable
group by order_ref_prefix
order by occurrence desc

select *
from datatable
where order_ref_prefix = 'MOB' and <other predicates>

Super handy!!

Logging outbound HTTP requests from JAX-RS client

In order to track down a bug, I needed to log HTTP requests sent from one of our web services to another third party web service. (We hosted the service, but the software was not developed in house).

Our web service was written with RESTEasy, a framework I was not especially familiar with. (I prefer the Spring stack, and always create new web services with Spring Boot.) The code to call the third party web service looked like this:

import javax.ws.rs.client.Invocation.Builder;
builder.buildPost(Entity.form(form)).invoke();
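
For context, builder in the snippet above is a standard JAX-RS Invocation.Builder. A fuller, hypothetical call sequence (the target URL and form field are made up) looks roughly like this:

import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;
import javax.ws.rs.client.Entity;
import javax.ws.rs.client.Invocation;
import javax.ws.rs.core.Form;
import javax.ws.rs.core.Response;

Client client = ClientBuilder.newClient();
// hypothetical third party endpoint
Invocation.Builder builder = client.target("https://thirdparty.example.com/orders").request();

Form form = new Form().param("orderRef", "12345");
Response response = builder.buildPost(Entity.form(form)).invoke();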

Surprisingly, there wasn’t an obvious way to get at the request body being sent. According to various Stack Overflow Q&As, the way to log outbound JAX-RS client requests is to create an implementation of ClientRequestFilter and register it as a provider in the container.

@Provider
public class MyClientRequestLoggingFilter implements ClientRequestFilter {
    private static final Logger LOG = LoggerFactory.getLogger(MyClientRequestLoggingFilter.class);

    @Override
    public void filter(ClientRequestContext requestContext) throws IOException {
        LOG.info(requestContext.getEntity().toString());
    }
}

You then configure your web.xml to scan for providers:

<context-param>
  <param-name>resteasy.scan.providers</param-name>
  <param-value>true</param-value>
</context-param>

There are quite a few warnings that, because ClientRequestContext.getEntity() returns an Object, the default toString() may not produce anything useful. The entity has to be marshalled to log the request body.

After banging my head against a wall for an afternoon, I decided to take a completely different approach to the problem and googled how to enable request logging in Apache httpd instead. This turned out to be a much more straightforward way to achieve what I needed. The module mod_dumpio can be used to dump all input to and output from the server into a log file. You need mod_dumpio present in the Apache httpd installation. (On Windows, check whether mod_dumpio.so is in c:\apache-install-dir\modules.) Stop the service and edit the httpd.conf file to include the following lines:

LoadModule dumpio_module modules/mod_dumpio.so

ErrorLog "logs/error.log"
LogLevel debug
DumpIOInput On
DumpIOOutput On
LogLevel dumpio:trace7

The ErrorLog and LogLevel lines were already present in my httpd.conf. I changed the LogLevel to debug and added the other lines above to turn on the dumpio module. After a server restart, all HTTP requests and responses were successfully logged to the file logs/error.log.

Lesson learnt here: if an approach turns out to be more complicated than expected, it’s worth taking a step back and rethinking.

Using two datasources in a Spring Boot application

Using only one datasource in a Spring Boot application is very straightforward. However, using multiple datasources in an application is anything but! It took me quite a bit of googling and fiddling to find a solution that worked.

To use two datasources, you need to set one up as the primary; the other then becomes the secondary. You mark a datasource as the primary using the primary attribute. Below is an example using XML-based configuration:

<bean id="greenDataSource" primary="true" class="org.apache.commons.dbcp2.BasicDataSource">
    <property name="driverClassName" value="com.microsoft.sqlserver.jdbc.SQLServerDriver"/>
    <property name="url" value="${db.green.url}"/>
    <property name="username" value="${db.green.username}"/>
    <property name="password" value="${db.green.password}"/>
</bean>

Then define the secondary datasource like this:

   
<bean id="purpleDataSource" class="org.apache.commons.dbcp2.BasicDataSource">
    <property name="driverClassName" value="com.microsoft.sqlserver.jdbc.SQLServerDriver"/>
    <property name="url" value="${db.purple.url}"/>
    <property name="username" value="${db.purple.username}"/>
    <property name="password" value="${db.purple.password}"/>
</bean>

You can then wire them into your Java classes using the @Autowired and @Primary annotations:

@Repository
public class AwesomeDaoImpl implements AwesomeDao {
    private JdbcTemplate greenJdbcTemplate;
    private JdbcTemplate purpleJdbcTemplate;

    @Autowired
    @Primary
    public void setGreenDataSource(DataSource greenDataSource) {
        this.greenJdbcTemplate = new JdbcTemplate(greenDataSource);
    }

    @Autowired
    public void setPurpleDataSource(DataSource purpleDataSource) {
        this.purpleJdbcTemplate = new JdbcTemplate(purpleDataSource);
    }
}

I haven’t figured out how to plumb in more than two datasources without using JNDI. If JNDI is available, your Spring Boot application can access all the JNDI datasources using the @Resource annotation.

@Repository
public class ColourDaoImpl implements ColourErrorDao {
    private JdbcTemplate jdbcTemplate;

    @Resource(mappedName = "java:jboss/datasources/Green")
    public void setGreenDataSource(DataSource greenDataSource) {
        this.jdbcTemplate = new JdbcTemplate(greenDataSource);
    }
}

Dropping unknown default constraints in SQL Server

For releases, we had to provide SQL scripts to install database changes. These had to be run outside of SQL Server Management Studio, on multiple environments. Occasionally, I had to drop unnamed default constraints in a script. (Most developers wrote their install scripts with named constraints to avoid this difficulty.) Every time I encountered the problem, I googled and followed this excellent blog post by Rob Farley. It gives the SQL commands to query for the name of an unnamed constraint, given the table and column names, but stops short of giving you the SQL code to actually drop the constraint.

So here is my take:

-- look up the name of the default constraint on the column
declare @df_name varchar(max)
select @df_name = d.name from sys.tables t
    join sys.default_constraints d
        on d.parent_object_id = t.object_id
    join sys.columns c
        on c.object_id = t.object_id
        and c.column_id = d.parent_column_id
    where t.name = 'some_db_table'
    and c.name = 'some_column_in_table'

-- drop the constraint using dynamic SQL, since its name is only known at runtime
if @df_name is not null
begin
    declare @sql varchar(max) = 'alter table some_db_table drop constraint ' + @df_name
    exec (@sql)
end