Dependency Injection and Loose Coupling: How They Affect Your Ability to Test

Even though the concepts of “dependency injection” and “loose coupling” have been advocated and written about for at least the past decade, I’m still routinely finding examples that make me think, on the whole, we developers may be underestimating the power of these simple ideas and their related practices. In this post, I’ll describe a real-world, simple scenario I recently encountered and point out how a few very slight changes in thinking could have significantly improved the design and avoided a lot of challenges down the road. I’ll provide some sample code that illustrates why ignoring these principles can get us into trouble, as well as sample code that illustrates how we can clean up our design and make our lives so much easier.

I recently ran into this issue on a project… I found a need to update a Java class that represented a scheduled “job” that retrieved a very large file (contained hundreds of millions lines/records) from a remote URL and processed that file. In a nutshell, here are the primary responsibilities for this class:

  1. Download file from a URL to the local filesystem.
  2. Parse each line contained in the downloaded file, checking to see whether the line represents an entry the application is interested in.
  3. For each line of interest, update the application’s database – depending on the line’s content, that line would either be ignored, or would flag the class to delete existing records or add new records to several tables in the application’s database.

It was these last two functions that caused us to update the class’ algorithm, because when processing files containing such a huge number of records to be parsed and processed, the job would typically run for at least a couple of days. That’s the background, but the real point of this post lies with function number one above – download the file from a remote URL. With all the changes I’d made to optimize the algorithm, I needed to verify the class’ behavior – there was no unit test and I wanted to write one. The idea of unit testing is to isolate the unit (almost always a single class – the “class under test”). However, I ran into some issues:

  • The class under test called a private method to download the file. Why is this a problem? With this implementation, I cannot control what an external web site will return to me at any given time, so how can I possibly write a test that verifies my class correctly processes the returned file? Besides, it’s really a bad practice to have unit tests interact with any external system, especially if it’s a production system. And, if that system is down or broken, my unit tests will break, even though my code could be just fine.
  • The class under test directly instantiated the Data Access Objects (DAOs) it used to update the application’s database. Why is this a problem? While I do care that the data parsed from the downloaded file gets updated correctly in the application’s database, that is not the purpose of the unit test – I may write a more comprehensive integration test later to test a more complete “flow”, but that is not the scope of my unit test. I do not want to really update my database – only verify that my class under test calls the DAOs when and with the parameters I expect. By instantiating the DAOs directly, the class under test has interfered with my ability to control this.

The below code snippet is an oversimplification intended to demonstrate the original problematic approach:

public class BadSyncDBWithExternalSiteJob {
	private static final String REMOTE_SITE_URL = "http://mysite.com/update-file.txt";

	private static final String DOWNLOADED_FILE_LOCATION = "./downloadedFileToParse";

	private SiteSyncDAOImpl siteSyncDAO = new SiteSyncDAOImpl();

	public void execute() throws Exception {
		File downloadedFileToParse = retrieveRemoteFile();
		// Parse file, update local database via the siteSyncDAO
	}

	private File retrieveRemoteFile() throws Exception {
		File downloadedFile = null;
		BufferedInputStream in = null;
		FileOutputStream fout = null;
		// Yes, there are better ways to do this - e.g. Apache's FileUtils
		try {
			in = new BufferedInputStream(new URL(REMOTE_SITE_URL).openStream());
			fout = new FileOutputStream(DOWNLOADED_FILE_LOCATION);

			byte data[] = new byte[1024];
			int count;
			while ((count = in.read(data, 0, 1024)) != -1) {
				fout.write(data, 0, count);
			}

			downloadedFile = new File(DOWNLOADED_FILE_LOCATION);
		} finally {
			if (in != null)
				in.close();
			if (fout != null)
				fout.close();
		}
		return downloadedFile;
	}

}

As you can see, the class that parses the file and updates the database also decides which URL to retrieve the file from the remote site and actually downloads the file. It is as tightly bound to these functions as one can get. So, in order to test this class, my tests will always interact and be dependent upon the external URL, which means it depends on both the current state of that site, as well as the contents of the retrieved file at any given time. This is totally unacceptable. You can also see that it instantiates its own DAO. This interferes with ability to unit test this class, since it will always use this DAO and update the database – which I do not want.

So, what could the developer have done to avoid these issues? Below are some modified code snippets that shows an implementation of the above class, with the issues eliminated by decoupling it from the implementation details it should not care about and allowing dependency injection:

public class GoodSyncDBWithExternalSiteJob {
	private ExternalFileDownloader externalSiteFileDownloader = null;

	private SiteSyncDAO siteSyncDAO = null;

	public void setSiteSyncDAO(SiteSyncDAO siteSyncDAO) {
		this.siteSyncDAO = siteSyncDAO;
	}

	public void setExternalSiteFileDownloader(ExternalFileDownloader externalSiteFileDownloader) {
		this.externalSiteFileDownloader = externalSiteFileDownloader;
	}

	public void execute() throws Exception {
		File downloadedFileToParse = externalSiteFileDownloader.retrieveRemoteFile();
		// Parse file, update local database via the siteSyncDAO

	}
}

 

public interface ExternalFileDownloader {
	public void setRemoteSiteURL(String remoteSiteURL);

	public void setDownloadedFileLocation(String downloadedFileLocation);

	public File retrieveRemoteFile() throws Exception;
}

 

public class ExternalURLFileDownloader implements ExternalFileDownloader {
	private String remoteSiteURL = null;

	private String downloadedFileLocation = null;

	public void setRemoteSiteURL(String remoteSiteURL) {
		this.remoteSiteURL = remoteSiteURL;
	}

	public void setDownloadedFileLocation(String downloadedFileLocation) {
		this.downloadedFileLocation = downloadedFileLocation;
	}

	public File retrieveRemoteFile() throws Exception {
		File downloadedFile = null;
		BufferedInputStream in = null;
		FileOutputStream fout = null;
		// Yes, there are better ways to do this - e.g. Apache's FileUtils
		try {
			in = new BufferedInputStream(new URL(remoteSiteURL).openStream());
			fout = new FileOutputStream(downloadedFileLocation);

			byte data[] = new byte[1024];
			int count;
			while ((count = in.read(data, 0, 1024)) != -1) {
				fout.write(data, 0, count);
			}

			downloadedFile = new File(downloadedFileLocation);
		} finally {
			if (in != null)
				in.close();
			if (fout != null)
				fout.close();
		}
		return downloadedFile;
	}

}

Note the following about the GoodSyncDBWithExternalSiteJob class:

  • It now requires injection of an ExternalFileDownloader interface instance to handle all the details of the file download, including the where and how. In fact, GoodSyncDBWithExternalSiteJob no longer knows anything about the file, not even whether it was downloaded from a remote location. Our unit tests can inject a “mock” version of ExternalFileDownloader – one that returns a “canned” file, containing the same entries each time we execute our unit tests.
  • It also requires injection of the DAO. I won’t focus on the DAO – the same concepts discussed for the “file downloader” applies to the DAO. The job class should not decide which implementation of the DAO it should use and instantiate that – we should be able to “wire” the implementation of the DAO we want it to use at runtime.

NOTE: In order to keep the example code brief, I’ve ignored good exception handling practices. We all know it’s not a good practice to have our methods throw generic “Exception” instances. We do, don’t we? Please say “yes” 🙂

Aside from enabling testability, these simple changes provide other powerful benefits:

  • Loose coupling: The class that parses and decides what needs to be updated in the database is now completely isolated from the details of where the file is retrieved from and hot it’s retrieved. Now, I have control of the class’ “collaborators”. I can inject mock instances – my own mock objects or objects created via a “mock” framework, such as EasyMock – that behave according to my testing needs and return objects to my class under test in a predictable way every time I run my tests.
  • Cohesion: Classes now have well-defined, narrow responsibilities. As applications grow larger, the value of this attribute cannot be overstated for purposes of maintainability and comprehensibility.
  • Cleaner, simpler design: Related to the above, this is just an easier design to understand, modify and maintain.

This is an example of where following a Test-Driven Design (TDD) philosophy can really pay off. Had the developer been writing his/her unit tests first, or even in tandem with the implementation of the production code, such radical refactoring would have been avoided and this class would have been unit tested from the start. You might argue that a developer can always refactor later to achieve this goal. However, the problems with this perspective, in my experience are:

  • Developers are rarely given the time required once a given feature has been implemented to refactor existing code, much less to write unit tests for code that is perceived to be “working OK right now”.
  • It is generally much more expensive to figure out how to refactor and unit test existing code than to take some simple steps to make the design clean and write the tests from the beginning.
  • In many cases, the class under test will be used by several application components, which could mean that the developer now has to consider all the ways the class is being used. This can result in the developer getting into the “this change works for this use case and breaks another use case” scenario. I’ve personally experienced cases in which trying to refactor an existing class that seemed very simple required days of effort.

I’ll address some of these topics – loose coupling, TDD, mock frameworks, etc. – in future blog posts. For now, it’s enough for us to start simply by remembering to design our classes in such a way that their responsibilities are fairly narrow and that their collaborators/dependencies can be injected at runtime and are defined via interfaces instead of concrete implementations. Once we’re doing that as part of our standard practice, our designs will be cleaner and more flexible and our unit testing will be easier and more effective.

Advertisements
Posted in Java Programming, Object-Oriented Design, Software Development

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: