Split and Parallelize

October 3, 2014 Roy Osherove

Other Names:

Split Work

Symptoms:

The build is taking too long

Problem:

It could be that one or more tasks or build steps that are comprised of many actions are taking too much time. For example: running 1000 regression tests (each taking 1-4 minutes) is taking 24 hours, and this is slowing down the feedback rate from the build.

Solution:

Split the step into several parallel running steps, each running a part of the task. The build step will then theoretically be able to be no longer than the longest time it takes to run just one of the split parts in parallel.

The solution requires: - having multiple build agents, or workers, to be able to run things in Parallel - The split parts need to be somewhat similar in size to gain speed improvement - the amount of split parts should be at least as the amount of how many agents will be able to run each split part.

Example

Say we have 1000 regression tests and one build step called "Run Regression Tests".

Step 1: Split the tests to runnable parts

We split the regression tests into separate runnable "ranges" or categories that can be executed separately from the command line. There are many ways to do this: Split the tests to multiple assemblies with names ending with running numbers, put separate "categories" on different tests, so you can tell the test runner to run only a specific category, and more.

It is important that the size of the "chunks" you split into will be somewhat similar, or the build speed gains will be less optimal. In our case, we have 1000 tests, and 10 agents the test chunks can run on. So we create 10 test categories named "chunk1", "chunk2" etc..

An even better scheme would have been if our tests could be split across 10 different logical ideas. Say we are testing our product with 10 different languages, approx. 100 tests per language. It would have been perfect to split them based on language and name each chunk based on that language. Later on, this also makes things more readable at the build server level. If you can only find 2 or 3 logical "categories" for the tests to split based on, (say "db tests", "ui tests" and "perf tests") you might want to go ahead and split each category into chunks to make things faster if you have more than 3 agents. For example "ui-chunk1", "ui-chunk2" until you reach at least the number of agents you have.

Step 2: Create a parallel test run hierarchy in the CI server

If you're using TeamCity, you would now create the following: - new sub project called "run regression tests" - 10 steps (assuming you have 10 build agents that can run them) , one for each test chunk:

Run UI Tests Chunk 1 (A command line step calling the test runner command with the name of the category to be run)
Run UI Tests Chunk 2 (same command line, different category)
Run UI Tests Chunk 2
Run DB Tests Chunk 1
...
Trigger all Regression Tests

Note the last build step. This build step is merely an empty build step that has a snapshot dependency to all the listed "chunk" steps. when triggered, it should (assuming you have multiple agents enabled and ready to run these) run all the chunks in parallel, which will finish as quickly as it takes the longest running chunk to finish.