Using code dependency analysis to decide what to test
===================

By [Patrick Kusebauch](https://github.com/patrickkusebauch)

> [!IMPORTANT]
> Find out how to save 90+% of your test runtime and resources by eliminating 90+% of your tests while keeping your test
> coverage and confidence. Save over 40% of your CI pipeline runtime overall.

## Introduction

Tests are expensive to run, and the larger the code base, the more expensive it becomes to run them all. At some point
your test runtime might even become so long that it is impossible to run all the tests on every commit, because commits
might be arriving faster than your ability to test them. But how else can you be confident that your changes have not
broken existing code?

Even if your situation is not that dire yet, the time it takes to run the tests makes it hard to get fast feedback on
your changes. It might even force you to compromise on other development techniques: to lump several changes into larger
commits, because there is no time to test each small individual change (like type fixing, refactoring, documentation,
etc.). You might like to do trunk-based development, but use feature branches instead, so that you can open PRs and
test a whole slew of changes all at once. Your DORA metrics are compromised by your slow rate of development. Instead of
being reactive to customer needs, you have to plan your projects and releases months in advance, because that is how
often you are able to fully test all the changes.

Slow testing can have huge consequences for what the whole development process looks like. While speeding up test
execution per se is a very individual problem in every project, there is another technique that can be applied
everywhere: become more picky about which tests to run. So how do you decide what to test?

## Theory

### What is code dependency analysis?

Code dependency analysis is the process of (usually statically) analysing the code to determine what code is used by
other code. The most common example of this is analysing the specified dependencies of a project to determine potential
vulnerabilities. This is what tools like [OWASP Dependency Check](https://owasp.org/www-project-dependency-check/) do.
Another use case is to generate a Software Bill of Materials (SBOM) for a project.

There is one other use case that not many people talk about: using code dependency analysis to create a Directed
Acyclic Graph (DAG) of the various components/modules/domains of a project. This DAG can then be used to determine how
changes to one component will affect other components.
Imagine you have a project with the following structure of components:



The `Supportive` component depends on the `Analyser` and `OutputFormatter` components. The `Analyser` in turn depends on
3 other components: `Ast`, `Layer`, and `References`. Lastly, `References` depends on the `Ast` component.

If you make a change to the `OutputFormatter` component, you will want to run the **contract tests**
for `OutputFormatter` and the **integration tests** for `Supportive`, but no tests for `Ast`. If you make changes
to `References`, you will want to run the **contract tests** for `References` and the **integration tests**
for `Analyser` and `Supportive`, but no tests for `Layer` or `OutputFormatter`. In fact, there is no single module you
can change that would require you to run all the tests.

> [!NOTE]
> By **contract tests** I mean tests that exercise the defined API of the component; in other words, what the component
> promises (by contract) to its outside users to always hold true. Such a test mocks out all interaction with any other
> component.
>
> By contrast, **integration tests** in this context means tests that verify that the interaction with a component it
> depends on is properly programmed. For that reason, the underlying component is not mocked out.

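The decision logic in the example above is mechanical, so it can be sketched in a few lines. Here is an illustrative
bash sketch (bash 4+ for associative arrays): the component names come from the diagram, but everything else is made up
for the example, not deptrac output.

```bash
# Reversed edges of the example DAG: component -> components that depend on it.
declare -A dependents=(
  [Analyser]="Supportive"
  [OutputFormatter]="Supportive"
  [Ast]="Analyser References"
  [Layer]="Analyser"
  [References]="Analyser"
)

changed="References"
queue=("$changed")
seen=""
# Breadth-first walk: every transitive dependant needs its integration tests run.
while ((${#queue[@]})); do
  current=${queue[0]}
  queue=("${queue[@]:1}")
  for dep in ${dependents[$current]:-}; do
    case " $seen " in
      *" $dep "*) ;;                          # already collected
      *) seen="$seen $dep"; queue+=("$dep") ;;
    esac
  done
done

echo "contract tests:    $changed"
echo "integration tests:$seen"
```

Running it for a change to `References` collects `Analyser` directly and `Supportive` transitively, matching the
scenario described above.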
### How do you create the dependency DAG?

There are very few tools that can do this as of today, even though the concept is very simple. So simple that you can
do it yourself if there is no tool available for your language of choice.

You need to lex and parse the code to create an Abstract Syntax Tree (AST), and then walk the AST of every file to find
its dependencies. This is the same functionality your IDE performs any time you "Find references...", and what your
language server provides over the [LSP (Language Server Protocol)](https://en.wikipedia.org/wiki/Language_Server_Protocol).

You then group the dependencies by predefined components/modules/domains and combine them all into a single graph.

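As an illustration of that grouping step, here is a hedged bash sketch. It assumes components live
under `src/<Component>/`, and the file-level dependency pairs in `/tmp/file_deps.txt` are invented for the example; a
real tool would extract them from the AST.

```bash
# Derive the component of a file from its path under src/.
component_of() { sed -n 's#^src/\([^/]*\)/.*#\1#p' <<<"$1"; }

# Invented file-level dependency pairs: "<file> <file it uses>".
cat > /tmp/file_deps.txt <<'EOF'
src/References/ReferenceExtractor.php src/Ast/AstMap.php
src/Analyser/LayerAnalyser.php src/References/ReferenceExtractor.php
src/Analyser/LayerAnalyser.php src/Analyser/Result.php
EOF

# Collapse file-level edges into unique component-level edges, dropping self-edges.
edges=$(
  while read -r from to; do
    echo "$(component_of "$from") -> $(component_of "$to")"
  done </tmp/file_deps.txt | awk '$1 != $3' | sort -u
)
echo "$edges"
```

The three file-level edges collapse into just two component-level ones (`Analyser -> References`
and `References -> Ast`), because edges inside a component are irrelevant to the DAG.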
### How do you use the DAG to decide what to test?

Once you have the DAG, there is a 4-step process to run your testing:

1. Get the list of changed files (for example by running `git diff`).
2. Feed the list to the dependency analysis tool to get the list of changed components (and optionally the list of
   depending components as well, for integration testing).
3. Feed the list to your testing tool of choice to run the test-suites corresponding to each changed component.
4. Revel in how much time you have saved on testing.

## Practice

This is not just some theoretical idea, but something you can try out yourself today. If you are lucky, there is
already an open-source tool in your language of choice that lets you do it. If not, the following demonstration will
give you enough guidance to write it yourself. If you do, please let me know, I would love to see it.

The tool I have used for this demonstration is [deptrac](https://qossmic.github.io/deptrac/); it is written in PHP, for
PHP.

All you have to do to create a DAG is to specify the modules/domains:

```yaml
# deptrac.yaml
deptrac:
  paths:
    - src

  layers:
    - name: Analyser
      collectors:
        - type: directory
          value: src/Analyser/.*
    - name: Ast
      collectors:
        - type: directory
          value: src/Ast/.*
    - name: Layer
      collectors:
        - type: directory
          value: src/Layer/.*
    - name: References
      collectors:
        - type: directory
          value: src/References/.*
    - name: Contract
      collectors:
        - type: directory
          value: src/Contract/.*
```

### The 4-step process

Once you have the DAG, you can combine it with the list of changed files to determine what modules/domains to test. A
simple git command will give you the list of changed files:

```bash
git diff --name-only
```
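
Note that a plain `git diff` only lists uncommitted changes; in a CI pipeline you will more likely want everything that
changed relative to the target branch. A self-contained sketch in a throwaway repository (the `main`/`feature` branch
names are assumptions for the example; `git init -b` needs git 2.28+):

```bash
# Throwaway repo just to demonstrate the merge-base diff.
tmp=$(mktemp -d)
cd "$tmp"
git init -q -b main .
git -c user.email=demo@example.com -c user.name=demo commit -q --allow-empty -m 'init'
git switch -qc feature
mkdir -p src/References
echo '<?php' > src/References/Foo.php
git add .
git -c user.email=demo@example.com -c user.name=demo commit -qm 'change'

# Everything that changed on this branch relative to main:
changed=$(git diff --name-only "$(git merge-base main HEAD)" HEAD)
echo "$changed"
```

Here the diff reports `src/References/Foo.php`, the one file added on the branch.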

You can then use this list to find the modules/domains that have changed, and then use the DAG to find the modules that
depend on those modules.

```bash
# to get the list of changed components
git diff --name-only | xargs php deptrac.php changed-files

# to get the list of changed components and the components depending on them
git diff --name-only | xargs php deptrac.php changed-files --with-dependencies
```

If you pick the popular PHPUnit framework for your testing and
follow [their recommendation for organizing tests](https://docs.phpunit.de/en/10.5/organizing-tests.html), it will be
very easy for you to create a test-suite per component. To run the tests for a component, you just have to pass the
parameter `--testsuite {componentName}` to the PHPUnit executable:

```bash
git diff --name-only |\
xargs php deptrac.php changed-files |\
sed 's/;/ --testsuite /g; s/^/--testsuite /g' |\
xargs ./vendor/bin/phpunit
```
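
To see what the `sed` step does, you can feed it a sample line by hand; the `References;Analyser` input below is a
made-up stand-in for the deptrac output:

```bash
# ';'-separated component list in, PHPUnit arguments out.
args=$(echo 'References;Analyser' | sed 's/;/ --testsuite /g; s/^/--testsuite /g')
echo "$args"   # --testsuite References --testsuite Analyser
```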

Or, if you have integration tests for the depending modules and decide to name your integration test-suites
`{componentName}Integration`:

```bash
git diff --name-only |\
xargs php deptrac.php changed-files --with-dependencies |\
sed '1s/;/ --testsuite /g; 2s/;/Integration --testsuite /g; /./ { s/^/--testsuite /; 2s/$/Integration/; }' |\
sed ':a;N;$!ba;s/\n/ /g' |\
xargs ./vendor/bin/phpunit
```
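
Again, the transformation is easiest to understand on a made-up sample. With two input lines (changed components, then
depending components), the first `sed` adds the `Integration` suffix on the second line only, and the second `sed`
(GNU sed syntax) joins the lines into one argument list:

```bash
args=$(
  printf 'References\nAnalyser;Supportive\n' |
    sed '1s/;/ --testsuite /g; 2s/;/Integration --testsuite /g; /./ { s/^/--testsuite /; 2s/$/Integration/; }' |
    sed ':a;N;$!ba;s/\n/ /g'
)
echo "$args"
```

This yields `--testsuite References --testsuite AnalyserIntegration --testsuite SupportiveIntegration`, i.e. contract
tests for the changed component and integration tests for its dependants.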

### Real life comparison results

I have run the following script on a set of changes to compare what the savings were:

```shell
# Compare timing
iterations=10

total_time_baseline=0
for ((i = 1; i <= $iterations; i++)); do
  # Time a full PHPUnit run
  runtime=$(
    TIMEFORMAT='%R'
    time (./vendor/bin/phpunit >/dev/null 2>&1) 2>&1
  )

  milliseconds=$(echo "$runtime" | tr ',' '.')
  total_time_baseline=$(echo "$total_time_baseline + $milliseconds * 1000" | bc)
done

average_time_baseline=$(echo "$total_time_baseline / $iterations" | bc)
echo "Average time (not using deptrac): $average_time_baseline ms"

# Compare test coverage
tests_baseline=$(./vendor/bin/phpunit | grep -oP 'OK \(\K\d+')
echo "Executed tests (not using deptrac): $tests_baseline tests"

echo ""

total_time_deptrac=0
for ((i = 1; i <= $iterations; i++)); do
  # Time the deptrac-filtered run
  runtime=$(
    TIMEFORMAT='%R'
    time (
      git diff --name-only |
        xargs php deptrac.php changed-files --with-dependencies |
        sed '1s/;/ --testsuite /g; 2s/;/Integration --testsuite /g; /./ { s/^/--testsuite /; 2s/$/Integration/; }' |
        sed ':a;N;$!ba;s/\n/ /g' |
        xargs ./vendor/bin/phpunit >/dev/null 2>&1
    ) 2>&1
  )

  milliseconds=$(echo "$runtime" | tr ',' '.')
  total_time_deptrac=$(echo "$total_time_deptrac + $milliseconds * 1000" | bc)
done

average_time_deptrac=$(echo "$total_time_deptrac / $iterations" | bc)
echo "Average time (using deptrac): $average_time_deptrac ms"
tests_execution_deptrac=$(git diff --name-only |
  xargs php deptrac.php changed-files --with-dependencies |
  sed '1s/;/ --testsuite /g; 2s/;/Integration --testsuite /g; /./ { s/^/--testsuite /; 2s/$/Integration/; }' |
  sed ':a;N;$!ba;s/\n/ /g' |
  xargs ./vendor/bin/phpunit)
tests_deptrac=$(echo "$tests_execution_deptrac" | grep -oP 'OK \(\K\d+')
tests_execution_deptrac_time=$(echo "$tests_execution_deptrac" | grep -oP 'Time: 00:\K\d+\.\d+')
echo "Executed tests (using deptrac): $tests_deptrac tests"

execution_time=$(echo "$tests_execution_deptrac_time * 1000" | bc | awk '{gsub(/\.?0+$/, ""); print}')
echo "Time to find tests to execute (using deptrac): $(echo "$average_time_deptrac - $tests_execution_deptrac_time * 1000" | bc | awk '{gsub(/\.?0+$/, ""); print}') ms"
echo "Time to execute tests (using deptrac): $execution_time ms"

echo ""

percentage=$(echo "scale=3; $tests_deptrac / $tests_baseline * 100" | bc | awk '{gsub(/\.?0+$/, ""); print}')
echo "Percentage of tests not needing execution given the changed files: $(echo "100 - $percentage" | bc)%"
percentage=$(echo "scale=3; $execution_time / $average_time_baseline * 100" | bc | awk '{gsub(/\.?0+$/, ""); print}')
echo "Time saved on testing: $(echo "$average_time_baseline - $execution_time" | bc) ms ($(echo "100 - $percentage" | bc)%)"
percentage=$(echo "scale=3; $average_time_deptrac / $average_time_baseline * 100" | bc | awk '{gsub(/\.?0+$/, ""); print}')
echo "Time saved overall: $(echo "$average_time_baseline - $average_time_deptrac" | bc) ms ($(echo "100 - $percentage" | bc)%)"
```

with the following results:

```
Average time (not using deptrac): 984 ms
Executed tests (not using deptrac): 721 tests

Average time (using deptrac): 559 ms
Executed tests (using deptrac): 21 tests
Time to find tests to execute (using deptrac): 491 ms
Time to execute tests (using deptrac): 68 ms

Percentage of tests not needing execution given the changed files: 97.1%
Time saved on testing: 916 ms (93.1%)
Time saved overall: 425 ms (43.2%)
```
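
The reported percentages follow directly from the raw numbers above, which makes for an easy sanity check:

```bash
# 21 of 721 tests ran; 68 ms of 984 ms test time; 559 ms total pipeline vs 984 ms.
awk 'BEGIN { printf "tests skipped:      %.1f%%\n", 100 - 21/721*100 }'
awk 'BEGIN { printf "test time saved:    %.1f%%\n", 100 - 68/984*100 }'
awk 'BEGIN { printf "overall time saved: %.1f%%\n", 100 - 559/984*100 }'
```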

Some interesting observations:

- Only **3% of the tests** that normally run on the PR needed to be run to cover the change with tests. That is a
  **saving of 700 tests** in this case.
- **Test execution time has decreased by 93%**. You are mostly left with the constant cost of the set-up and tear-down
  of the testing framework.
- **Pipeline overall time has decreased by 43%**. Since the analysis time grows orders of magnitude more slowly than
  the test runtime (it is not completely constant: more files still means more to statically analyse), this number is
  only bound to get better the larger the codebase is.

And these savings apply to arguably the worst possible SUT (System Under Test):

- It is a **small application**, so it is hard to get the savings of skipping the testing of a vast number of
  components, as would be the case for large codebases.
- It is a **CLI script**, so it has no database and no external APIs to call, and has minimal slow I/O tests. Those are
  the tests you want to skip the most, and they are barely present here.

## Conclusion

Code dependency analysis is a very useful tool for deciding what to test. It is not a silver bullet, but it can help
you reduce the number of tests you run in your CI pipeline and the time it takes to run them. It is not a replacement
for a good test suite, but it can make the test suite you have more efficient.

## References

- [deptrac](https://qossmic.github.io/deptrac/)
- [deptracpy](https://patrickkusebauch.github.io/deptracpy/)

See you on [Day 16](day16.md).