Learning Plan, Motivation and Time

In the last few months, I have been in multiple conversations around finding the time to learn. (In case you have not noticed, creating and following a learning plan is part of your personal objectives). A common suggestion I hear is to set aside half a day or a day for learning.

Earlier this week, I heard a podcast by Swizec Teller on Becoming a Senior Engineer. At the beginning of the podcast, he touched on learning. What he said consolidated ideas in my head and motivated me to write this blog.

Continue reading “Learning Plan, Motivation and Time”

The Principle of Least Privilege

This blog post was originally written for a company internal blog. I have removed references and screenshots from internal applications. I have also refactored parts of the blog to be more generic.

We still deploy many of our applications on Windows VM. Part of this blog is Windows server specific.

What is it?

The principle of least privilege states that only the minimum necessary rights should be assigned to a subject that requests access to a resource and should be in effect for the shortest duration necessary. It prevents users from obtaining or changing information in unwanted ways. This is important because it helps organisations reduce risk by reducing the potential damage that excessive privilege can cause accidentally or maliciously.

As developers, we need to be aware that this principle not only applies to us not running everything as root or local admins on our development machines. It also applies to service users our applications run as, database users our applications access data with and API users our application access third party APIs with. It even applies to our application design, where we should have appropriate roles for users identified in use cases.

Continue reading “The Principle of Least Privilege”

Spring data JPA and SQL server stored procedures

According to the Spring data JPA reference documentation, stored procedures can be called from a JPA repository using the @Procedure annotation.

<pre class="wp-block-syntaxhighlighter-code">public interface OrderRepository extends Repository<Order, Long> {
  Order save(Order fr);
  Optional findById(Long id);

  @Procedure("spGetByComplicatedQuery")
    Order findByComplicatedQuery();<br />}
</pre>

In the above code, when the repository method findByComplicatedQuery() is called, the stored procedure spGetByComplicatedQuery is called instead. This allows for complex query optimisation using direct SQL. However, with MS SQL Server, a IllegalArgumentException is thrown during runtime

2021-05-28 10:25:12,460 ERROR o.a.juli.logging.DirectJDKLog http-nio-94-exec-3 Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is org.springframework.dao.InvalidDataAccessApiUsageException: Type cannot be null; nested exception is java.lang.IllegalArgumentException: Type cannot be null] with root cause connection id=10.6.12.118:49537 java.lang.IllegalArgumentException: Type cannot be null
at org.hibernate.query.procedure.internal.ProcedureParameterImpl.setHibernateType(ProcedureParameterImpl.java:130)

The problem is that many SQL Server stored procedures return results as a result set, and not output parameters. The @Procedure annotation cannot handle this. The work around is to use the @Query annotation with the native query flag.

<pre class="wp-block-syntaxhighlighter-code">public interface OrderRepository extends Repository<Order, Long> {
  Order save(Order fr);
  Optional findById(Long id);

  @Query(nativeQuery=true, value="exec spGetByComplicatedQuery")
    Order findByComplicatedQuery();<br />}
</pre>

Junit 5 parameterized test with Class @ValueSource

This is a follow up to a previous blog post I wrote about Junit 5 parameterized tests. The @ValueSource annotation allows you to specify an array of literal values where a single value is used per invocation of the test. @ValueSource handles all the primitives, String and Class object. The Junit 5 user guide does not have an example of how to use a java.lang.Class @ValueSource.

I have used it to test exception handling. For example, in the following test, it verifies that both JPA and Spring DAO exception for no matching database row are mapped to the OrderNotFoundException. (The exception is annotated with Spring web’s @ResponseStatus to map to the HTTP status 404 NOT FOUND).

@ParameterizedTest
@ValueSource(classes = {NoResultException.class, EmptyResultDataAccessException.class})
  public void testOrderNotFound(Class<? extends Exception> clazz) throws Exception {
    when(orderDao.getOrder(ORDER_ID)).thenThrow(clazz);
    assertThrows(OrderNotFoundException.class,
        () -> provisionService.addNumbers(ORDER_ID, USER));
}


Using awaitility for asynchronous operations in Java integration tests

Often in integration tests, we would like to check the outcome of asynchronous operations. For example, a test to check the automation read a spreadsheet attachement in an email and update the order status correctly. In legacy systems, there are two common reasons for integration with external systems to fail. A library security upgrade could bring in a newer version of the library at compile time, leading to runtime exceptions. (The infamous NoClassFoundException in Java). A system upgrade of the external system can also change the interface and break the integration.

To test the execution of aysnchronous operations, we need the ability to poll at regular intervals, and time out after waiting for long enough. This can be easily achieved with the awaitility library.

To include the library in your tests, add the following in pom.xml

<dependency>
  <groupId>org.awaitility</groupId>
  <artifactId>awaitility</artifactId>
  <version>4.0.3</version>
  <scope>test</scope>
</dependency>

You can then poll for the expected test result like this

with().pollInterval(RESULT_POLL_INTERVAL, TimeUnit.MINUTES)
  .and().with().pollDelay(RESULT_POLL_DELAY, TimeUnit.MINUTES)
  .atMost(RESULT_WAIT, TimeUnit.MINUTES)
  .until(new YourCallable(orderId, jdbcTemplate));

The code starts polling after RESULT_POLL_DELAY minutes, at an interval of RESULT_POLL_INTERVAL minutes. It will try for at most RESULT_WAIT minutes before declaring a failure. The function YourCallable(a, b) is used to determine if the test condition is met.

public class YourCallable implements Callable<Boolean> {	
  private final int orderId;
  private final JdbcTemplate jdbcTemplate;
	
  public YourCallable(final int orderId, final JdbcTemplate jdbcTemplate) {
    this.orderId = orderId;
    this.jdbcTemplate = jdbcTemplate;
  }

  @Override
  public Boolean call() throws Exception {
    boolean accepted = false;
    try {
      Integer acceptanceId = 
        jdbcTemplate().queryForObject("select o.acceptance_id from order o where o.id = ?", Integer.class, orderId);
      if (acceptanceId != null && acceptanceId > 0) {
        accepted = true;
      }
    }
    catch (IncorrectResultSizeDataAccessException e) {
      // SQL null means no acceptance id
      accepted = false;
    }
    return accepted;
  }	
}

By using awaitility, asynchronous testing code becomes a lot more readable than a DIY approach.

Junit 5 parameterized tests

My favourite feature of Junit 5 is its improvement to parameterized tests. Previously, each parameterized test must be written in its own class. In the test class, you define a method for the test and another method for its inputs and outputs. Junit 5 provides a much simpler way to define parameterized tests. You can now annotate individual test methods as parameterized, with parameters supplied via annotations.

For example, the following test verifies the method under test throws an InvalidRequestException with the specified list of ISO dates.

@ParameterizedTest
@ValueSource(strings= {"1752-12-31T21:45:00", "10000-01-01T21:45:00"})
public void invalidDate(String testdate) {
    testRequest.setDate(testdate);
    assertThrows(InvalidRequestException.class, 
        () -> testService.testMethod(ORDER_ID, testRequest));
}

With the enum value source and exclude mode, it’s now very easy to test status validation to make sure all statuses except a few are allowed. For example, the code below checks an order can be cancelled only in the PREPROCESS status.

@ParameterizedTest
@EnumSource(value = OrderStatus.class, mode = Mode.EXCLUDE, 
            names = {"PREPROCESS"})
public void testOrderStatusCancellation(OrderStatus status) throws Exception {
    testOrder.setStatus(status);
    assertThrows(InvalidRequestException.class
        () -> testService.cancel(testOrder));
}

If your parameterized test inputs and outputs are more complicated, and cannot be easily supplied inside annotation, you can use the @MethodSource annotation. This allows you to define your test method inputs and outputs with another method, similar to how things were before Junit 5.

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.Arguments;
import org.junit.jupiter.params.provider.MethodSource;

@ParameterizedTest
@MethodSource("inputOutputProvider")
public void testCodeParsing(String input, Set<Integer> expected) {
    assertEquals(expected, codeReader.getRejectionCodes(comment));
}

private static Stream<Arguments> inputOutputProvider() {
    return Stream.of(
        Arguments.arguments("0010,0015,0041", newHashSet(10,15,41)),
        Arguments.arguments("C22", newHashSet(22)),
        Arguments.arguments("10", newHashSet(10)));
}

Dropping unknown foreign key constraints in SQL server

A few years ago, I wrote a blog post about scripting the drop of unknown default constraints in SQL server. Today, I came across a similar problem but with unnamed foreign key constraint. Below is the SQL command needed for find the name of a foreign key constraint given the table and column names.

declare @fc_name varchar(max)
select @fc_name = o.name from sys.tables t
    join sys.foreign_key_columns fk
		on fk.parent_object_id = t.object_id
    join sys.columns c
	    on c.object_id = t.object_id
		and c.column_id = fk.parent_column_id
	join sys.objects o
		on o.object_id = fk.constraint_object_id
	 where t.name = 'some_db_table'
    and c.name = 'some_column_in_table'

if @fc_name is not null
begin
   declare @sql varchar(max) = 'alter table some_db_table ' + ' drop constraint ' + @fc_name 
   exec (@sql)
end

Junit 5 tags, maven Failsafe plugin and system integration tests

Automated system integration tests are useful for testing user acceptance criteria as a whole. In addition, we also use the tests to verify environment set up such as file, database and external web service access. We run these tests on production systems to verify deployment. Some of the automated tests in our integration test suites are not safe to be run on production. This blog posts explain how I use Junit 5 @Tag annotation and the Failsafe plugin to separate system integration tests into all enviroment and non production only.

Junit 5 allows test classes and methods to be tagged via the @Tag annotation. The tests can be used to filter test discovery and execution. I used the tag NonProductionOnly for test classes that should not be run on a production environment.

@Tag("NonProductionOnly")
public class PlaceOrderTestsIT {
   ....
}

Junit tests that are safe to be run on all environments are not annotated.

The Failsafe Plugin is designed to run integration tests while the Surefire Plugin is designed to run unit tests. It decouples failing the build if there are test failures from actually running integration tests. For unit tests to be handled by the Failsafe plugin, end the class name with IT. To filter executions with Junit 5 @Tag annotations, the plugin can be configured in the project’s pom.xml.

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-failsafe-plugin</artifactId>
      <version>2.22.2</version>
      <executions>
        <execution>
          <goals>
            <goal>integration-test</goal>
            <goal>verify</goal>
          </goals>
        </execution>
      </executions>
      <configuration>
        <excludedGroups>${failsure.excludedGroups}</excludedGroups>
      </configuration>
    </plugin>
  </plugins> 
</build>

This configures the plugin to exclude groups defined in the pom property ${failsure.excludedGroups}. We set up the property using maven profiles. The tag NothingToExclude does not correspond to any @Tag in the test suite. It is needed in the pom because the property cannot be left empty.

<profiles>
  <profile>
    <id>uat</id>
      <properties>
	<failsure.excludedGroups>NothingToExclude</failsure.excludedGroups>
      </properties>
  </profile>	
  <profile>
    <id>production</id>
    <properties>
      <failsure.excludedGroups>NonProductionOnly</failsure.excludedGroups>
    </properties>
  </profile>
</profiles>

The profiles are passed in from Jenkins configuration for the different environments.

Combining multiple git repositories into subdirectories of a single repository

In a few projects I worked with, the legacy code base were split into multple git repositories when they were interdependent. What I mean is that to compile the microservice deployable, you need to create the libraries contained in these multiple git repositories in a specific sequence. In addition, these libraries are not used anywhere except in the microservice. The worst culprit I encountered was a microservice that was split into six different git repositories – five Java libraries and one Java WAR file. It makes a developer’s life much easier if these six repositories are structured as a single Java maven multimodule project in a single git repository.

Obviously, you can just copy all the code to one directory and commit that. However, git allows you to copy the content from one repository into another, preserving the commit history at the same time. No one wants to lose all those change histories, right?

As an example, the code base for project x is split into three repositories called project-x-client, project-x-core and project-x-ws. You want to combine them all in project-x-ws, with each individual repository in a subdirectory. (This will later be converted into a maven submodule). After the migration, you want the code in project-x-client to be in a subdirectory called client under project-x-ws, project-x-core core, and project-x-ws in webservice.

First, you need to move all the files in project-x-core into a subdirectory called core. By doing this, when you copy all the files from this repository, the files will all be neatly located inside the directory core, instead of being under the root directory. We will use the master branch for the move.

cd /home/me/git-stuff/project-x-core
git checkout master
mkdir core
git mv src core
git mv README.md core
...
git commit -am "preparing project-x-core for migration"

You don’t need to push the change to remote. The copy can be done entirely using the local repository. To copy the project-x-core repository into project-x-ws:

cd /home/me/git-stuff/project-x-ws
git remote add r /home/me/git-stuff/project-x-core
git fetch r
git merge r/master --allow-unrelated-histories
git remote rm r

You have now pulled all the files and their commit histories into the existing repository for project-x-ws. Repeat this for client and ws. When you finish. push project-x-ws to remote to share this change with your team.

Memory exhaustion, long garbage collection time and the Hibernate query plan cache

Recently, we released a new Spring Boot webservice. It uses Hibernate for database queries and persistence. (Spring Boot 2.1.10 and Hibernate 5.3.13). Within a week of release, Dynatrace alerted us to memory exhaustion and long garbage collection time on all the deployments. The app was started with the -XX:+HeapDumpOnOutOfMemoryError command line option so heap dumps were generated automatically. While the heap dumps were analysed for root cause of the memory problem, the apps were restarted with double the memory limits to keep things running.

I loaded the heap dump with Eclipse memory analyzer and its Leak Suspects Report. The tool identified one problem suspect, which used 321.5MB out of 360MB of heap memory.

One instance of “org.hibernate.internal.SessionFactoryImpl” loaded by “org.springframework.boot.loader.LaunchedURLClassLoader @ 0xe0035958” occupies 337,113,320 (89.29%) bytes. The memory is accumulated in one instance of “org.hibernate.internal.util.collections.BoundedConcurrentHashMap$Segment[]” loaded by “org.springframework.boot.loader.LaunchedURLClassLoader @ 0xe0035958”.

When I looked at the ‘Accumulated Objects in the Dominator Tree’, I noticed there were a large number of Bounded Concurrent Hash Map entries. Clicked on an individual hash map entry and use the list objects with incoming reference option, I could see that the segments of the hash map are query plan cache objects.

To look at what is stored inside the query plan cache, I used the OQL function in the memory analyzer with the follow query.

SELECT l.query.toString() 
FROM INSTANCEOF
org.hibernate.engine.query.spi.QueryPlanCache$HQLQueryPlanKey l

By inpsecting the strings returned by the query, I spotted the problem. The hibernate query cache plan cached a lot of identical queries with the numeric primary key parameter in the where clause. Like this

generatedAlias0.numberDetails as generatedAlias59 left join fetch
generatedAlias59.ddiRange6 as generatedAlias60 where generatedAlias0.id=500685

Reading the hibernate documentation, I learned that by default Criteria queries use bind parameters for literals that are not numeric only. My application used only the numeric primary key for access. Hibernate was therefore caching a new plan for every individual row in the database. As a result, the application was performing worse than not having a query cache plan at all!

To fix this, set the parameter hibernate.criteria.literal_handling_mode to BIND. This instructs hibernate to use bind variables for any literal value. In Spring Boot, the variable is spring.jpa.properties.hibernate.criteria.literal_handling_mode.

To confirm that the query cache plan problem was fixed, I used jconsole to take a heap dump

Running the OQL query again and I can see the query plans have changed to using bind parameters in the where clause.

generatedAlias0.numberDetails as generatedAlias59 left join fetch
generatedAlias59.ddiRange6 as generatedAlias60 where generatedAlias0.id=:param0