Using code dependency analysis to decide what to test
===================

By [Patrick Kusebauch](https://github.com/patrickkusebauch)

> [!IMPORTANT]
> Find out how to save 90+% of your test runtime and resources by eliminating 90+% of your tests while keeping your test
> coverage and confidence. Save over 40% of your CI pipeline runtime overall.

## Introduction

Tests are expensive to run, and the larger the code base, the more expensive it becomes to run them all. At some point
your test runtime might even become so long that it is impossible to run all the tests on every commit, as your rate of
incoming commits might be higher than your ability to test them. But how else can you have confidence that your
introduced changes have not broken some existing code?

Even if your situation is not that dire yet, the time it takes to run tests makes it hard to get fast feedback on your
changes. It might even force you to compromise on other development techniques: lumping several changes into larger
commits because there is no time to test each small individual change (like type fixing, refactoring, documentation,
etc.). You might like to do trunk-based development, but have feature branches instead, so that you can open PRs and
test a whole slew of changes all at once. Your DORA metrics are compromised by your slow rate of development. Instead of
being reactive to customer needs, you have to plan your projects and releases months in advance, because that is how
often you are able to fully test all the changes.

Slow testing can have huge consequences for what the whole development process looks like. While speeding up test
execution per se is a very individual problem in every project, there is another technique that can be applied
everywhere: becoming more picky about what tests to run. So how do you decide what to test?
## Theory

### What is code dependency analysis?

Code dependency analysis is the process of (usually statically) analysing the code to determine what code is used by
other code. The most common example of this is analysing the specified dependencies of a project to determine potential
vulnerabilities. This is what tools like [OWASP Dependency Check](https://owasp.org/www-project-dependency-check/) do.
Another use case is to generate a Software Bill of Materials (SBOM) for a project.

There is one other use case that not many people talk about: using code dependency analysis to create a Directed
Acyclic Graph (DAG) of the various components/modules/domains of a project. This DAG can then be used to determine how
changes to one component will affect other components.
Imagine you have a project with the following structure of components:



The `Supportive` component depends on the `Analyser` and `OutputFormatter` components. The `Analyser` in turn depends on
3 other components - `Ast`, `Layer` and `References`. Lastly, `References` depends on the `Ast` component.

If you make a change to the `OutputFormatter` component, you will want to run the **contract tests**
for `OutputFormatter` and the **integration tests** for `Supportive`, but no tests for `Ast`. If you make changes
to `References`, you will want to run the **contract tests** for `References` and the **integration tests** for `Analyser`
and `Supportive`, but no tests for `Layer` or `OutputFormatter`. In fact, there is no single component that you can change
that would require you to run all the tests.

> [!NOTE]
> By **contract tests** I mean tests that exercise the defined API of the component - in other words, what the component
> promises (by contract) to outside users to always hold true about its usage. Such a test mocks out
> all interaction with any other component.
>
> By contrast, **integration tests** in this context mean tests that verify that the interaction with a dependent
> component is properly programmed. For that reason, the underlying (dependent) component is not mocked out.

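The example above can be checked mechanically. The following sketch hard-codes an adjacency list for the diagram's components (a real tool would derive it from the code) and walks the graph upwards from a changed component to find everything whose integration tests must run:

```bash
#!/usr/bin/env bash
# Hand-written adjacency list for the example DAG: "component:its direct dependencies"
deps="Supportive:Analyser OutputFormatter
Analyser:Ast Layer References
References:Ast"

direct_dependents() { # components that list $1 as a direct dependency
  printf '%s\n' "$deps" | awk -F: -v c="$1" \
    '{ n = split($2, a, " "); for (i = 1; i <= n; i++) if (a[i] == c) print $1 }'
}

changed="References"
frontier="$changed"
affected=""
while [ -n "$frontier" ]; do # walk the DAG upwards, level by level
  next=""
  for c in $frontier; do next="$next $(direct_dependents "$c")"; done
  next=$(echo $next) # collapse whitespace
  affected="$affected $next"
  frontier="$next"
done

echo "contract tests:    $changed"
echo "integration tests:$affected"
```

For a change to `References` this prints `Analyser` and `Supportive` as the components needing integration tests, and never visits `Layer` or `OutputFormatter`, matching the reasoning above.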
### How do you create the dependency DAG?

There are very few tools that can do this as of today, even though the concept is very simple. So simple, in fact, that
you can do it yourself if there is no tool available for your language of choice.

You need to lex and parse the code to create an Abstract Syntax Tree (AST) and then walk the AST of every file to find
the dependencies. This is the same functionality your IDE uses any time you "Find references..." and what your language
server sends over [LSP (Language Server Protocol)](https://en.wikipedia.org/wiki/Language_Server_Protocol).

You then group the dependencies by predefined components/modules/domains and combine them all into a single
graph.

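As a toy illustration of the grouping step (the file names and the path-to-component rule are made up), plain path matching already mirrors a directory-based component layout:

```bash
# Toy grouping step: map file paths to component names by their top-level
# directory under src/ (a real tool would walk the AST instead; paths are made up)
printf 'src/Analyser/Runner.php\nsrc/Ast/Parser.php\nsrc/Analyser/Config.php\n' |
  sed -n 's|^src/\([^/]*\)/.*|\1|p' | sort -u
# → Analyser
# → Ast
```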
### How do you use the DAG to decide what to test?

Once you have the DAG, there is a 4-step process to run your testing:

1. Get the list of changed files (for example by running `git diff`).
2. Feed the list to the dependency analysis tool to get the list of changed components (and optionally the list of
   depending components as well, for integration testing).
3. Feed the list to your testing tool of choice to run the test-suites corresponding to each changed component.
4. Revel in how much time you have saved on testing.

## Practice

This is not just some theoretical idea, but rather something you can try out yourself today. If you are lucky, there is
already an open-source tool in your language of choice that lets you do it today. If you are not, the following
demonstration will give you enough guidance to write it yourself. If you do, please let me know - I would love to see it.

The tool that I have used for this demonstration is [deptrac](https://qossmic.github.io/deptrac/); it is written in
PHP, for PHP.

All you have to do to create a DAG is to specify the modules/domains:

```yaml
# deptrac.yaml
deptrac:
  paths:
    - src

  layers:
    - name: Analyser
      collectors:
        - type: directory
          value: src/Analyser/.*
    - name: Ast
      collectors:
        - type: directory
          value: src/Ast/.*
    - name: Layer
      collectors:
        - type: directory
          value: src/Layer/.*
    - name: References
      collectors:
        - type: directory
          value: src/References/.*
    - name: Contract
      collectors:
        - type: directory
          value: src/Contract/.*
```

### The 4-step process

Once you have the DAG, you can combine it with the list of changed files to determine what modules/domains to test. A
simple git command will give you the list of changed files:

```bash
git diff --name-only
```
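In CI you usually want the diff against the base branch rather than the uncommitted working tree. The following self-contained demo (branch and file names are made up) shows git's three-dot form, which lists everything changed since the merge base:

```bash
# Self-contained demo in a throwaway repository (names are made up);
# in a real pipeline you would diff against your actual base branch.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.email demo@example.com && git config user.name demo
echo one > base.txt && git add . && git commit -qm "base"
git branch -m main
git checkout -qb feature
echo two > changed.txt && git add . && git commit -qm "change"
# three-dot diff: files changed since the merge base with main
changed_files=$(git diff --name-only main...HEAD)
echo "$changed_files"
# → changed.txt
```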

You can then use this list to find the components that have changed, and then use the DAG to find the components that
depend on them.

```bash
# to get the list of changed components
git diff --name-only | xargs php deptrac.php changed-files

# to get the list of changed components together with the components depending on them
git diff --name-only | xargs php deptrac.php changed-files --with-dependencies
```

If you pick the popular PHPUnit framework for your testing and
follow [their recommendation for organizing tests](https://docs.phpunit.de/en/10.5/organizing-tests.html), it will be
very easy for you to create a test-suite per component. To run the tests for a component, you just have to pass the
parameter `--testsuite {componentName}` to the PHPUnit executable:

```bash
git diff --name-only |\
xargs php deptrac.php changed-files |\
sed 's/;/ --testsuite /g; s/^/--testsuite /g' |\
xargs ./vendor/bin/phpunit
```
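You can sanity-check the `sed` transformation in isolation. Assuming the changed-files output is a single semicolon-separated line of component names (the names here are hypothetical), it expands into one `--testsuite` flag per component:

```bash
# Turn a semicolon-separated component list into PHPUnit --testsuite flags
echo "Analyser;Supportive" | sed 's/;/ --testsuite /g; s/^/--testsuite /g'
# → --testsuite Analyser --testsuite Supportive
```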

Or, if you have integration tests for the dependent components and decide to name your integration test-suites
`{componentName}Integration`:

```bash
git diff --name-only |\
xargs php deptrac.php changed-files --with-dependencies |\
sed '1s/;/ --testsuite /g; 2s/;/Integration --testsuite /g; /./ { s/^/--testsuite /; 2s/$/Integration/; }' |\
sed ':a;N;$!ba;s/\n/ /g' |\
xargs ./vendor/bin/phpunit
```
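The `sed` script here assumes two input lines - the changed components on line 1 and their dependents on line 2, which is how the pipeline above consumes the `--with-dependencies` output (the component names below are hypothetical). Line 2 gets the `Integration` suffix, and the second `sed` joins the lines into one argument list:

```bash
# Line 1: changed components; line 2: components depending on them
printf 'References\nAnalyser;Supportive\n' |
  sed '1s/;/ --testsuite /g; 2s/;/Integration --testsuite /g; /./ { s/^/--testsuite /; 2s/$/Integration/; }' |
  sed ':a;N;$!ba;s/\n/ /g'
# → --testsuite References --testsuite AnalyserIntegration --testsuite SupportiveIntegration
```

Note that the `:a;N;$!ba` line-joining idiom relies on GNU sed; BSD sed needs the label on its own line.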

### Real life comparison results

I have run the following script on a set of changes to compare what the savings were:

```shell
# Compare timing
iterations=10

total_time_with=0
for ((i = 1; i <= iterations; i++)); do
  # Run the full test suite and capture its wall-clock time
  runtime=$(
    TIMEFORMAT='%R'
    time (./vendor/bin/phpunit >/dev/null 2>&1) 2>&1
  )

  # TIMEFORMAT '%R' prints seconds; normalize the decimal separator
  seconds=$(echo "$runtime" | tr ',' '.')
  total_time_with=$(echo "$total_time_with + $seconds * 1000" | bc)
done

average_time_with=$(echo "$total_time_with / $iterations" | bc)
echo "Average time (not using deptrac): $average_time_with ms"

# Compare test coverage
tests_with=$(./vendor/bin/phpunit | grep -oP 'OK \(\K\d+')
echo "Executed tests (not using deptrac): $tests_with tests"

echo ""

total_time_without=0
for ((i = 1; i <= iterations; i++)); do
  # Run only the affected test-suites and capture the wall-clock time
  runtime=$(
    TIMEFORMAT='%R'
    time (
      git diff --name-only |
        xargs php deptrac.php changed-files --with-dependencies |
        sed '1s/;/ --testsuite /g; 2s/;/Integration --testsuite /g; /./ { s/^/--testsuite /; 2s/$/Integration/; }' |
        sed ':a;N;$!ba;s/\n/ /g' |
        xargs ./vendor/bin/phpunit >/dev/null 2>&1
    ) 2>&1
  )

  seconds=$(echo "$runtime" | tr ',' '.')
  total_time_without=$(echo "$total_time_without + $seconds * 1000" | bc)
done

average_time_without=$(echo "$total_time_without / $iterations" | bc)
echo "Average time (using deptrac): $average_time_without ms"
tests_execution_without=$(git diff --name-only |
  xargs php deptrac.php changed-files --with-dependencies |
  sed '1s/;/ --testsuite /g; 2s/;/Integration --testsuite /g; /./ { s/^/--testsuite /; 2s/$/Integration/; }' |
  sed ':a;N;$!ba;s/\n/ /g' |
  xargs ./vendor/bin/phpunit)
tests_without=$(echo "$tests_execution_without" | grep -oP 'OK \(\K\d+')
tests_execution_without_time=$(echo "$tests_execution_without" | grep -oP 'Time: 00:\K\d+\.\d+')
echo "Executed tests (using deptrac): $tests_without tests"

execution_time=$(echo "$tests_execution_without_time * 1000" | bc | awk '{gsub(/\.?0+$/, ""); print}')
echo "Time to find tests to execute (using deptrac): $(echo "$average_time_without - $tests_execution_without_time * 1000" | bc | awk '{gsub(/\.?0+$/, ""); print}') ms"
echo "Time to execute tests (using deptrac): $execution_time ms"

echo ""

percentage=$(echo "scale=3; $tests_without / $tests_with * 100" | bc | awk '{gsub(/\.?0+$/, ""); print}')
echo "Percentage of tests not needing execution given the changed files: $(echo "100 - $percentage" | bc)%"
percentage=$(echo "scale=3; $execution_time / $average_time_with * 100" | bc | awk '{gsub(/\.?0+$/, ""); print}')
echo "Time saved on testing: $(echo "$average_time_with - $execution_time" | bc) ms ($(echo "100 - $percentage" | bc)%)"
percentage=$(echo "scale=3; $average_time_without / $average_time_with * 100" | bc | awk '{gsub(/\.?0+$/, ""); print}')
echo "Time saved overall: $(echo "$average_time_with - $average_time_without" | bc) ms ($(echo "100 - $percentage" | bc)%)"
```

with the following results:

```
Average time (not using deptrac): 984 ms
Executed tests (not using deptrac): 721 tests

Average time (using deptrac): 559 ms
Executed tests (using deptrac): 21 tests
Time to find tests to execute (using deptrac): 491 ms
Time to execute tests (using deptrac): 68 ms

Percentage of tests not needing execution given the changed files: 97.1%
Time saved on testing: 916 ms (93.1%)
Time saved overall: 425 ms (43.2%)
```

Some interesting observations:

- Only **3% of the tests** that normally run on the PR needed to be run to cover the change with tests. That is a
  **saving of 700 tests** in this case.
- **Test execution time has decreased by 93%**. You are mostly left with the constant cost of set-up and tear-down of
  the testing framework.
- **Pipeline overall time has decreased by 43%**. Since the analysis time grows orders of magnitude more slowly than test
  runtime (it is not completely constant - more files still means more to statically analyse), this number is only bound
  to get better the larger the codebase is.

And these savings apply to arguably the worst possible SUT (System Under Test):

- It is a **small application**, so it is hard to get the savings of skipping the testing of a vast number of components,
  as would be the case for large codebases.
- It is a **CLI script**, so it has no database, no external APIs to call, and minimal slow I/O tests. Those are the tests
  you most want to skip, and they are barely present here.

## Conclusion

Code dependency analysis is a very useful tool for deciding what to test. It is not a silver bullet, but it can help you
reduce the number of tests you run and the time it takes to run them. It can also help you decide what tests to run in
your CI pipeline. It is not a replacement for a good test suite, but it can help you make your test suite more
efficient.
## References

- [deptrac](https://qossmic.github.io/deptrac/)
- [deptracpy](https://patrickkusebauch.github.io/deptracpy/)

See you on [Day 36](day36.md).