Skip to content

WIP: NETOBSERV-2812: Remove origin dependancy#2806

Open
Amoghrd wants to merge 53 commits into
netobserv:mainfrom
Amoghrd:netobserv-2812
Open

WIP: NETOBSERV-2812: Remove origin dependancy#2806
Amoghrd wants to merge 53 commits into
netobserv:mainfrom
Amoghrd:netobserv-2812

Conversation

@Amoghrd

@Amoghrd Amoghrd commented Jun 15, 2026

Copy link
Copy Markdown
Member

Description

Move away from openshift/origin dependency in integration-tests

Dependencies

n/a

Checklist

  • Does the changes in PR need specific configuration or environment set up for testing?
    • if so please describe it in PR description.
  • I have added thorough unit tests for the change.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

Summary by CodeRabbit

  • New Features

    • Expanded integration-test support for deploying, checking, and cleaning up more resources directly through the Kubernetes API.
    • Added broader cloud and cluster detection helpers for AWS, Azure, GCP, OpenStack, and workload identity environments.
    • Introduced new test utilities for processing templates and managing storage buckets and containers.
  • Bug Fixes

    • Improved alert, metric, and readiness checks with more reliable polling and error handling.
    • Updated several end-to-end flows to use direct resource lookups, reducing dependence on CLI-based operations.

@openshift-ci-robot

openshift-ci-robot commented Jun 15, 2026

Copy link
Copy Markdown
Collaborator

@Amoghrd: This pull request references NETOBSERV-2812 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the spike to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Description

Move away from openshift/origin dependency in integration-tests

Dependencies

n/a

Checklist

  • Does the changes in PR need specific configuration or environment set up for testing?
    • if so please describe it in PR description.
  • I have added thorough unit tests for the change.
  • QE requirements (check 1 from the list):
  • Standard QE validation, with pre-merge tests unless stated otherwise.
  • Regression tests only (e.g. refactoring with no user-facing change).
  • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci

openshift-ci Bot commented Jun 15, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign oliviercazade for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai

coderabbitai Bot commented Jun 15, 2026

Copy link
Copy Markdown

Important

Review skipped

Ignore keyword(s) in the title.

⛔ Ignored keywords (1)
  • WIP

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: dc304719-38dd-4ac0-b501-9ad610a6183f

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 41.07% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title uses vague language ('Remove origin dependancy') without specifying the scope or main change in the changeset. The PR migrates multiple test files to a new K8sClient implementation, but the title doesn't convey this primary change clearly. Use a more specific title that describes the main change, such as 'Refactor integration tests to use K8sClient instead of exutil.CLI' or 'Migrate integration tests from OpenShift CLI to Kubernetes dynamic client'.
✅ Passed checks (3 passed)
Check name Status Explanation
Description check ✅ Passed The description covers the core change (removing openshift/origin dependency), notes no additional dependencies, marks it as regression testing, and completes the required checklist sections.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 15

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
integration-tests/backend/util.go (1)

539-563: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pod readiness check can pass before containers are initialized.

When ContainerStatuses is empty, ready stays true and polling exits early. This can mark pods ready while they are still pending.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/util.go` around lines 539 - 563, The readiness
check in the WaitForPodsReadyWithLabel function has a logic flaw where an empty
ContainerStatuses list causes the ready variable to remain true because the
inner for loop never executes, incorrectly marking pods as ready before
containers are initialized. Fix this by explicitly checking if the
ContainerStatuses slice is empty and setting ready to false in that case,
ensuring the function only reports pods as ready when containers have been
initialized and their status shows they are ready.
🧹 Nitpick comments (3)
integration-tests/backend/operator.go (2)

217-217: 💤 Low value

Remove or restore commented code.

WaitForDeploymentPodsToBeReady is commented out. Either remove it if no longer needed, or uncomment if the wait is still required for reliable operator deployment.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/operator.go` at line 217, The call to
WaitForDeploymentPodsToBeReady is currently commented out in the operator
initialization code. Determine whether this wait is still necessary for reliable
operator deployment: if it is required, uncomment the line to restore the wait
functionality; if it is no longer needed, remove the commented line entirely to
keep the code clean.

653-663: ⚖️ Poor tradeoff

Hardcoded JSON patch path and CSV name are fragile.

The patch path /spec/install/spec/deployments/0/spec/template/spec/containers/0/env/4/value and CSV name netobserv-operator.v0.0.0-sha-main will break silently if the CSV structure changes. Consider extracting the env var by name or documenting the expected CSV structure.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/operator.go` around lines 653 - 663, The hardcoded
JSON patch path with array indices (index 0 for deployments, index 0 for
containers, index 4 for env) and the hardcoded CSV name
`netobserv-operator.v0.0.0-sha-main` in the Patch call are fragile and will
silently break if the CSV structure changes. Instead of using hardcoded indices,
fetch the ClusterServiceVersion resource first, locate the specific deployment
and container by name rather than position, find the target environment variable
by name instead of using index 4, and then dynamically construct the patch path
based on the actual resource structure. This will make the test resilient to
changes in the CSV structure while maintaining clarity about which specific
resources and variables are being modified.
integration-tests/backend/k8s_client_test.go (1)

11-40: ⚡ Quick win

Add missing edge-case coverage for this _test.go file.

Current tests cover valid and not-found paths, but not empty/nil-style edges (for example empty namespace name and constructor behavior when kubeconfig input is invalid/missing). Please add those cases in this file’s test suite.

As per coding guidelines: **/*_test.go: “Use Ginkgo/Gomega patterns for unit tests - include tests for valid cases, invalid input, and edge cases (empty, nil values)”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/k8s_client_test.go` around lines 11 - 40, The test
suite in the k8s_client_test.go file is missing edge-case coverage for both the
NewK8sClient constructor and the GetNamespace method. Add a test case that calls
NewK8sClient with invalid or missing kubeconfig input to verify the constructor
handles this scenario appropriately. Additionally, add a test case that calls
GetNamespace with an empty string as the namespace name to verify it handles
this edge case correctly. These tests should follow the Ginkgo/Gomega patterns
referenced in the coding guidelines and ensure empty/nil values and invalid
inputs are properly tested alongside the existing valid and not-found test
cases.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@integration-tests/backend/alerts.go`:
- Around line 71-79: The type assertion on alertStatus[0] at line 79 is
unchecked and will panic if amtool returns an unexpected payload shape. Replace
the unchecked type assertion with a checked one using the ok pattern, and return
a proper error if the assertion fails instead of allowing a panic. The same
issue exists at line 91 and needs the same fix applied there as well.

In `@integration-tests/backend/flowcollectorslice.go`:
- Around line 67-73: The readiness polling in wait.PollUntilContextTimeout
treats all GET errors from
client.dynamic.Resource(gvr).Namespace(flowSlice.Namespace).Get(ctx,
flowSlice.Name, metav1.GetOptions{}) as retryable by returning false, nil, which
causes permanent API failures to wait the full 600 seconds instead of failing
fast. Fix this by checking the error type to distinguish between permanent
errors (like permission denied, not found, invalid request) and transient errors
- for permanent errors return true with the error to fail immediately, and for
transient errors continue returning false, nil to retry the polling.

In `@integration-tests/backend/go.mod`:
- Line 3: The go.mod file declares Go 1.26.3, but the Dockerfile is using an
older version specification (golang:1.25), causing a version mismatch that will
break the build. Update the Docker image version specification in the Dockerfile
from golang:1.25 to golang:1.26 (or a more specific 1.26.x version) to align
with the declared Go version in go.mod.

In `@integration-tests/backend/k8s_client.go`:
- Around line 343-354: The current error handling in the resource creation block
does not distinguish between different types of errors from the Get() call.
Instead of attempting to Create() on any error, you need to explicitly check if
the error is a NotFound error using apierrors.IsNotFound(). If the error is not
a NotFound error (such as RBAC errors, transport errors, or timeouts), return
that error immediately. Only proceed with the Create() attempt when the Get()
call returns a NotFound error. This prevents masking real errors with failed
create attempts.

In `@integration-tests/backend/loki_storage.go`:
- Around line 958-983: The Resource.clear method is incomplete after migration
to use K8sClient. Add the missing resource kinds `secret` and
`objectbucketclaims` to the switch statement in the clear method with their
appropriate schema.GroupVersionResource mappings (secret belongs to the core API
group with Version "v1" and Resource "secrets", while objectbucketclaims belongs
to the "objectbucket.io" group with Version "v1alpha1" and Resource
"objectbucketclaims"). Additionally, update all call sites to the clear method
to pass the correct K8sClient parameter instead of the old oc parameter,
ensuring consistency with the updated method signature.

In `@integration-tests/backend/metrics.go`:
- Around line 190-203: The code treats optional ServiceMonitor endpoint fields
as required and returns errors when they are absent. Update the logic where
`scheme` is retrieved from the endpoint map to default to "http" instead of
returning an error when the field is not found. Similarly, apply the same
approach to `tlsConfig.serverName` extraction by only using it when present
rather than erroring when absent. This ensures both optional fields are handled
gracefully with appropriate defaults.

In `@integration-tests/backend/operator.go`:
- Around line 359-368: The polling callback functions used in
wait.PollUntilContextTimeout (affecting the code at lines 359-368 and 376-384)
are using o.Expect to assert on errors returned from client.GetSubscription,
which causes immediate test failure on any transient error instead of allowing
the poll to retry. Remove the o.Expect assertion and instead return the error
directly from the callback function (similar to how CheckOperatorStatus handles
errors), so the polling mechanism can properly retry on transient errors before
failing the test.

In `@integration-tests/backend/template_processor.go`:
- Around line 107-116: The function applyResourceFromTemplateByAdmin creates a
temporary file through the processTemplate call and applies it via
client.ApplyYAML, but never removes the file afterward, causing repeated runs to
leak temporary files. Add cleanup code after the ApplyYAML call to delete the
temporary configFile using os.Remove. Consider using a defer statement to ensure
the cleanup happens regardless of whether ApplyYAML succeeds or fails, placing
it immediately after the configFile is successfully created.
- Around line 81-86: The parameter replacement loop in template_processor.go
iterates over an unordered map, causing nondeterministic replacement order and
potential corruption when overlapping keys exist (e.g., "NAME" and "NAMESPACE").
Fix this by extracting all keys from the params map, sorting them by length in
descending order (longest first) to prevent shorter keys from matching within
longer keys, then iterate over the sorted keys to perform replacements in a
deterministic and safe order.

In `@integration-tests/backend/test_flowcollectorslice.go`:
- Around line 81-84: The variable err is declared at line 81 in the BeforeEach
block and then redeclared again at line 118-120, which causes a compilation
error. At line 118-120, remove the redeclaration (change the `:=` operator to
`=` for assignment) instead of declaring a new variable, since err is already
declared at line 81 in the same BeforeEach block. This will allow both uses of
err in the function to reference the same variable without conflict.
- Around line 82-84: The K8sClient created at line 82 with NewK8sClient()
retains the original user identity and is reused for CreateFlowcollectorSlice,
DeleteFlowcollectorSlice, and WaitForFlowcollectorSliceReady operations after
oc.ChangeUser() is called at line 480, which bypasses RBAC checks since these
operations execute with the original identity rather than the switched user.
After the oc.ChangeUser() call, create a new K8sClient by calling NewK8sClient()
again and use this newly created client for all flowcollectorslice operations
that follow the user switch to ensure RBAC checks execute with the correct
switched user identity.

In `@integration-tests/backend/util.go`:
- Around line 246-249: The function ApplyResourceFromFile accepts a namespace
parameter ns but does not use it when calling client.ApplyYAML, breaking the
namespace override contract for existing callers. Either pass the ns parameter
to the ApplyYAML method call (if the method signature supports a namespace
argument), or if the namespace parameter is no longer needed, remove it from the
ApplyResourceFromFile function signature and update all call sites accordingly
to ensure namespace handling is applied correctly where needed.
- Around line 137-138: The applyFromTemplate function currently ignores errors
from the WaitForResourceToAppear method call by using the blank identifier, then
unconditionally returns nil. This masks resource readiness failures and falsely
reports success. Capture the error returned by r.WaitForResourceToAppear(client)
instead of discarding it, check if the error is not nil, and return that error.
Only return nil if WaitForResourceToAppear succeeds without error.
- Around line 301-307: Remove the unused `gvr` variable declaration in the
`getSAToken` function. The variable is declared with the GroupVersionResource
for authentication.k8s.io tokenrequests but is never actually used in the
function (the code uses `saGvr` instead), which causes a Go compilation failure.
Delete the entire unused `gvr` variable declaration block.
- Around line 82-97: The getRouteAddress function signature was changed to
accept a *K8sClient parameter instead of its previous type, but this change was
not propagated to all callers. Update all callers of getRouteAddress to pass the
correct *K8sClient type instead of *exutil.CLI. There are three callers that
need to be updated (identified in loki_storage.go, test_flowcollectorslice.go,
and test_flowcollector.go), while the caller in metrics.go is already correct
and requires no changes.

---

Outside diff comments:
In `@integration-tests/backend/util.go`:
- Around line 539-563: The readiness check in the WaitForPodsReadyWithLabel
function has a logic flaw where an empty ContainerStatuses list causes the ready
variable to remain true because the inner for loop never executes, incorrectly
marking pods as ready before containers are initialized. Fix this by explicitly
checking if the ContainerStatuses slice is empty and setting ready to false in
that case, ensuring the function only reports pods as ready when containers have
been initialized and their status shows they are ready.

---

Nitpick comments:
In `@integration-tests/backend/k8s_client_test.go`:
- Around line 11-40: The test suite in the k8s_client_test.go file is missing
edge-case coverage for both the NewK8sClient constructor and the GetNamespace
method. Add a test case that calls NewK8sClient with invalid or missing
kubeconfig input to verify the constructor handles this scenario appropriately.
Additionally, add a test case that calls GetNamespace with an empty string as
the namespace name to verify it handles this edge case correctly. These tests
should follow the Ginkgo/Gomega patterns referenced in the coding guidelines and
ensure empty/nil values and invalid inputs are properly tested alongside the
existing valid and not-found test cases.

In `@integration-tests/backend/operator.go`:
- Line 217: The call to WaitForDeploymentPodsToBeReady is currently commented
out in the operator initialization code. Determine whether this wait is still
necessary for reliable operator deployment: if it is required, uncomment the
line to restore the wait functionality; if it is no longer needed, remove the
commented line entirely to keep the code clean.
- Around line 653-663: The hardcoded JSON patch path with array indices (index 0
for deployments, index 0 for containers, index 4 for env) and the hardcoded CSV
name `netobserv-operator.v0.0.0-sha-main` in the Patch call are fragile and will
silently break if the CSV structure changes. Instead of using hardcoded indices,
fetch the ClusterServiceVersion resource first, locate the specific deployment
and container by name rather than position, find the target environment variable
by name instead of using index 4, and then dynamically construct the patch path
based on the actual resource structure. This will make the test resilient to
changes in the CSV structure while maintaining clarity about which specific
resources and variables are being modified.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2ef5ef27-20b4-4824-b6ed-94fd26c86200

📥 Commits

Reviewing files that changed from the base of the PR and between 5ac0631 and c22cb9a.

⛔ Files ignored due to path filters (1)
  • integration-tests/backend/go.sum is excluded by !**/*.sum
📒 Files selected for processing (14)
  • integration-tests/backend/alerts.go
  • integration-tests/backend/flowcollector_utils.go
  • integration-tests/backend/flowcollectorslice.go
  • integration-tests/backend/go.mod
  • integration-tests/backend/k8s_client.go
  • integration-tests/backend/k8s_client_test.go
  • integration-tests/backend/loki.go
  • integration-tests/backend/loki_storage.go
  • integration-tests/backend/metrics.go
  • integration-tests/backend/operator.go
  • integration-tests/backend/template_processor.go
  • integration-tests/backend/test_flowcollectorslice.go
  • integration-tests/backend/util.go
  • integration-tests/backend/version_checker.go

Comment thread integration-tests/backend/alerts.go Outdated
Comment thread integration-tests/backend/flowcollectorslice.go
Comment thread integration-tests/backend/go.mod Outdated
Comment thread integration-tests/backend/k8s_client.go Outdated
Comment thread integration-tests/backend/loki_storage.go
Comment thread integration-tests/backend/test_flowcollectorslice.go Outdated
Comment thread integration-tests/backend/util.go
Comment thread integration-tests/backend/util.go Outdated
Comment thread integration-tests/backend/util.go
Comment thread integration-tests/backend/util.go Outdated
@Amoghrd Amoghrd changed the title WIP: NETOBSERV-2812: Remove origin dependancy in operator.go WIP: NETOBSERV-2812: Remove origin dependancy Jun 16, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
integration-tests/backend/util.go (2)

107-143: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Same naive pluralization issue as template_processor.go.

Line 130: strings.ToLower(gvk.Kind) + "s" breaks for irregular plurals. See comment on template_processor.go.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/util.go` around lines 107 - 143, The
Resource.applyFromTemplate function uses naive string concatenation to pluralize
the Kubernetes resource kind by simply appending "s" to the lowercased kind
name. This approach fails for irregular plurals and should be replaced with
proper pluralization logic. Locate the line in applyFromTemplate where the
GroupVersionResource is constructed with `strings.ToLower(gvk.Kind) + "s"` and
replace this naive pluralization with the same pluralization solution that was
applied to the similar issue in template_processor.go, ensuring the resource
name is correctly pluralized according to Kubernetes conventions.

127-131: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Naive Kind-to-Resource pluralization in both files. Both locations use strings.ToLower(gvk.Kind) + "s" which fails for irregular plurals (e.g., Ingressingresss, NetworkPolicynetworkpolicys).

  • integration-tests/backend/util.go#L127-L131: Use a proper pluralization helper or lookup map in applyFromTemplate.
  • integration-tests/backend/template_processor.go#L126-L131: Apply same fix in processTemplateFile if this code path is kept.

Consider using k8s.io/apimachinery/pkg/api/meta.UnsafeGuessKindToResource or a custom mapping for known CRDs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/util.go` around lines 127 - 131, The
Kind-to-Resource conversion at integration-tests/backend/util.go#L127-L131 in
the applyFromTemplate function uses naive pluralization with
strings.ToLower(gvk.Kind) + "s", which fails for irregular plurals like Ingress
and NetworkPolicy. Replace this logic with a proper pluralization approach such
as using k8s.io/apimachinery/pkg/api/meta.UnsafeGuessKindToResource or a custom
mapping for known CRDs. Apply the identical fix to
integration-tests/backend/template_processor.go#L126-L131 in the
processTemplateFile function where the same naive pluralization pattern exists.
♻️ Duplicate comments (3)
integration-tests/backend/util.go (1)

250-254: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

ns parameter is unused.

The ns parameter is accepted but client.ApplyYAML is called without it. Either use client.ApplyYAMLWithNamespace(ctx, file, ns) or remove the parameter.

-func ApplyResourceFromFile(client *K8sClient, ns, file string) {
+func ApplyResourceFromFile(client *K8sClient, file string) {
     ctx := context.Background()
     err := client.ApplyYAML(ctx, file)

Or if namespace override is needed:

-    err := client.ApplyYAML(ctx, file)
+    err := client.ApplyYAMLWithNamespace(ctx, file, ns)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/util.go` around lines 250 - 254, The
`ApplyResourceFromFile` function accepts a `ns` namespace parameter but does not
use it when calling the client method. Either pass the namespace to the client
by replacing the `client.ApplyYAML(ctx, file)` call with
`client.ApplyYAMLWithNamespace(ctx, file, ns)` to apply the YAML within the
specified namespace, or remove the unused `ns` parameter from the function
signature if namespace handling is not required.
integration-tests/backend/loki_storage.go (1)

988-997: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Add missing GVR mappings used by current call sites.

Resource.clear does not handle secret and objectbucketclaims, so current migrated paths fail or silently skip cleanup.

Proposed fix
 	switch r.Kind {
 	case "lokistack":
 		gvr = schema.GroupVersionResource{Group: "loki.grafana.com", Version: "v1", Resource: "lokistacks"}
 	case "catalogsource":
 		gvr = schema.GroupVersionResource{Group: "operators.coreos.com", Version: "v1alpha1", Resource: "catalogsources"}
 	case "subscription":
 		gvr = schema.GroupVersionResource{Group: "operators.coreos.com", Version: "v1alpha1", Resource: "subscriptions"}
+	case "secret":
+		gvr = schema.GroupVersionResource{Group: "", Version: "v1", Resource: "secrets"}
+	case "objectbucketclaims":
+		gvr = schema.GroupVersionResource{Group: "objectbucket.io", Version: "v1alpha1", Resource: "objectbucketclaims"}
 	default:
 		return fmt.Errorf("unsupported resource kind: %s", r.Kind)
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/loki_storage.go` around lines 988 - 997, Add
missing case statements to the switch block on r.Kind to handle "secret" and
"objectbucketclaims" resource types. Currently the switch statement only handles
"lokistack", "catalogsource", and "subscription", but the Resource.clear method
needs to support these additional resource kinds to properly clean up resources
at call sites. Insert the two new cases before the default case, each with the
appropriate GroupVersionResource mapping (you will need to determine the correct
API group, version, and resource name for each type based on your Kubernetes
cluster configuration).
integration-tests/backend/test_flowcollectorslice.go (1)

518-520: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Recreate K8sClient after oc.ChangeUser(...) in the multi-tenancy flow.

oc.ChangeUser(testUserName) occurs at Line 480, but client-backed create/wait/delete calls at Lines 518-520 and 594-596 still use the client initialized at Line 82. That can execute API calls with the wrong identity and invalidate RBAC assertions.

Also applies to: 594-596

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/test_flowcollectorslice.go` around lines 518 - 520,
The K8sClient initialized at line 82 is being reused after
oc.ChangeUser(testUserName) is called at line 480, causing API calls to execute
with the wrong user identity. Recreate the K8sClient immediately after the
oc.ChangeUser(testUserName) call to ensure the client reflects the new user
context. This recreation must happen before the subsequent calls to
flowSliceA.CreateFlowcollectorSlice(client),
flowSliceA.WaitForFlowcollectorSliceReady(client), and
flowSliceA.DeleteFlowcollectorSlice(client) at lines 518-520, and before any
equivalent calls at lines 594-596. Update all affected locations to use the
newly created client instead of the one from line 82.
🧹 Nitpick comments (3)
integration-tests/backend/template_processor.go (1)

28-40: 💤 Low value

Unused client parameter in processTemplate.

The client *K8sClient parameter is never used in this function or passed to processTemplateFile. Either remove it or document why it's reserved for future use.

-func processTemplate(client *K8sClient, parameters ...string) (string, error) {
+func processTemplate(parameters ...string) (string, error) {

This would require updating callers accordingly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/template_processor.go` around lines 28 - 40, Remove
the unused client parameter from the processTemplate function signature since it
is never referenced within the function body or passed to processTemplateFile.
After removing the client *K8sClient parameter from the function definition,
identify and update all call sites of processTemplate to remove the
corresponding argument being passed to this function.
integration-tests/backend/util.go (2)

908-935: ⚡ Quick win

New K8sClient created inside getClientServerInfo instead of accepting it as parameter.

This function creates its own K8sClient (lines 910-911) while other refactored functions accept *K8sClient as a parameter. This is inconsistent and prevents the caller from reusing an existing client.

-func getClientServerInfo(oc *exutil.CLI, serverNS, clientNS, ipStackType string) (map[string]map[string]string, error) {
-    client, err := NewK8sClient()
-    o.Expect(err).NotTo(o.HaveOccurred())
+func getClientServerInfo(oc *exutil.CLI, client *K8sClient, serverNS, clientNS, ipStackType string) (map[string]map[string]string, error) {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/util.go` around lines 908 - 935, The
getClientServerInfo function creates its own K8sClient instance inside the
function body (lines 910-911) instead of accepting it as a parameter, which is
inconsistent with other refactored functions and prevents callers from reusing
an existing client. Modify the function signature to accept a *K8sClient
parameter, remove the NewK8sClient call and error check inside the function, and
use the passed client parameter instead. Update all callers of
getClientServerInfo to pass their existing K8sClient instance.

461-488: 💤 Low value

Hard-coded GVR mappings will require maintenance as new resource types are added.

Both WaitForResourceToAppear and WaitUntilResourceIsGone have separate switch statements mapping r.Kind to GVRs. Consider extracting a shared kindToGVR helper to reduce duplication and ensure consistency.

func kindToGVR(kind string) (schema.GroupVersionResource, error) {
    switch kind {
    case "lokistack":
        return schema.GroupVersionResource{Group: "loki.grafana.com", Version: "v1", Resource: "lokistacks"}, nil
    // ... other cases
    default:
        return schema.GroupVersionResource{}, fmt.Errorf("unsupported resource kind: %s", kind)
    }
}

Also applies to: 490-521

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/util.go` around lines 461 - 488, Extract the
hard-coded GVR mapping logic from both WaitForResourceToAppear and
WaitUntilResourceIsGone functions into a shared helper function kindToGVR.
Create a new kindToGVR helper function that accepts a kind string parameter and
returns a schema.GroupVersionResource and error, containing the switch statement
that maps resource kinds like "objectbucketclaims", "objectbuckets", and
"catalogsource" to their corresponding GVRs. Then replace the duplicate switch
statement in WaitForResourceToAppear (lines 461-488) with a call to this helper
function, and do the same for the duplicate switch statement in
WaitUntilResourceIsGone (lines 490-521) to eliminate duplication and ensure
consistent GVR mappings across both functions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@integration-tests/backend/template_processor.go`:
- Around line 113-132: The current code treats each rawObj as a single YAML
document when it may contain multiple documents. When rawObj.Raw contains
multi-document YAML, only the first document gets processed by the parameter
substitution. Instead of passing the entire objYAML to substituteParameters as a
single block, use a YAML decoder loop (such as
k8s.io/apimachinery/pkg/util/yaml.NewYAMLOrJSONDecoder) to iterate through all
YAML documents within objYAML, apply parameter substitution to each document
individually, and append each processed document to processedYAMLs.

In `@integration-tests/backend/test_flowcollector.go`:
- Around line 386-388: After the migration to use *K8sClient for runtime
operator operations, the deferred cleanup calls for operator uninstallation
still reference the old `oc` parameter instead of the new `client` parameter.
Fix all four locations where this occurs: In
integration-tests/backend/test_flowcollector.go at lines 386-388 (anchor site in
the deferred cleanup call following ensureOperatorDeployed), change the
`g.DeferCleanup(LO.uninstallOperator, oc)` call to pass `client` instead of
`oc`. Apply the same fix in integration-tests/backend/test_flowcollector.go at
lines 379-380 in the !Lokiexisting branch. Similarly, fix
integration-tests/backend/test_flowcollectorslice.go at lines 132-133 in the
!Lokiexisting branch and at lines 139-141 in the reinstall branch after channel
mismatch, replacing `oc` with `client` in all deferred uninstallOperator calls.

In `@integration-tests/backend/test_flowmetrics.go`:
- Around line 74-76: Uncomment the AfterEach block (currently commented on lines
74-76) that contains the flow.DeleteFlowcollector(oc) call to re-enable proper
FlowCollector teardown after each test specification. This will prevent
FlowCollector state from leaking across test specs and causing subsequent tests
to become unstable.

---

Outside diff comments:
In `@integration-tests/backend/util.go`:
- Around line 107-143: The Resource.applyFromTemplate function uses naive string
concatenation to pluralize the Kubernetes resource kind by simply appending "s"
to the lowercased kind name. This approach fails for irregular plurals and
should be replaced with proper pluralization logic. Locate the line in
applyFromTemplate where the GroupVersionResource is constructed with
`strings.ToLower(gvk.Kind) + "s"` and replace this naive pluralization with the
same pluralization solution that was applied to the similar issue in
template_processor.go, ensuring the resource name is correctly pluralized
according to Kubernetes conventions.
- Around line 127-131: The Kind-to-Resource conversion at
integration-tests/backend/util.go#L127-L131 in the applyFromTemplate function
uses naive pluralization with strings.ToLower(gvk.Kind) + "s", which fails for
irregular plurals like Ingress and NetworkPolicy. Replace this logic with a
proper pluralization approach such as using
k8s.io/apimachinery/pkg/api/meta.UnsafeGuessKindToResource or a custom mapping
for known CRDs. Apply the identical fix to
integration-tests/backend/template_processor.go#L126-L131 in the
processTemplateFile function where the same naive pluralization pattern exists.

---

Duplicate comments:
In `@integration-tests/backend/loki_storage.go`:
- Around line 988-997: Add missing case statements to the switch block on r.Kind
to handle "secret" and "objectbucketclaims" resource types. Currently the switch
statement only handles "lokistack", "catalogsource", and "subscription", but the
Resource.clear method needs to support these additional resource kinds to
properly clean up resources at call sites. Insert the two new cases before the
default case, each with the appropriate GroupVersionResource mapping (you will
need to determine the correct API group, version, and resource name for each
type based on your Kubernetes cluster configuration).

In `@integration-tests/backend/test_flowcollectorslice.go`:
- Around line 518-520: The K8sClient initialized at line 82 is being reused
after oc.ChangeUser(testUserName) is called at line 480, causing API calls to
execute with the wrong user identity. Recreate the K8sClient immediately after
the oc.ChangeUser(testUserName) call to ensure the client reflects the new user
context. This recreation must happen before the subsequent calls to
flowSliceA.CreateFlowcollectorSlice(client),
flowSliceA.WaitForFlowcollectorSliceReady(client), and
flowSliceA.DeleteFlowcollectorSlice(client) at lines 518-520, and before any
equivalent calls at lines 594-596. Update all affected locations to use the
newly created client instead of the one from line 82.

In `@integration-tests/backend/util.go`:
- Around line 250-254: The `ApplyResourceFromFile` function accepts a `ns`
namespace parameter but does not use it when calling the client method. Either
pass the namespace to the client by replacing the `client.ApplyYAML(ctx, file)`
call with `client.ApplyYAMLWithNamespace(ctx, file, ns)` to apply the YAML
within the specified namespace, or remove the unused `ns` parameter from the
function signature if namespace handling is not required.

---

Nitpick comments:
In `@integration-tests/backend/template_processor.go`:
- Around line 28-40: Remove the unused client parameter from the processTemplate
function signature since it is never referenced within the function body or
passed to processTemplateFile. After removing the client *K8sClient parameter
from the function definition, identify and update all call sites of
processTemplate to remove the corresponding argument being passed to this
function.

In `@integration-tests/backend/util.go`:
- Around line 908-935: The getClientServerInfo function creates its own
K8sClient instance inside the function body (lines 910-911) instead of accepting
it as a parameter, which is inconsistent with other refactored functions and
prevents callers from reusing an existing client. Modify the function signature
to accept a *K8sClient parameter, remove the NewK8sClient call and error check
inside the function, and use the passed client parameter instead. Update all
callers of getClientServerInfo to pass their existing K8sClient instance.
- Around line 461-488: Extract the hard-coded GVR mapping logic from both
WaitForResourceToAppear and WaitUntilResourceIsGone functions into a shared
helper function kindToGVR. Create a new kindToGVR helper function that accepts a
kind string parameter and returns a schema.GroupVersionResource and error,
containing the switch statement that maps resource kinds like
"objectbucketclaims", "objectbuckets", and "catalogsource" to their
corresponding GVRs. Then replace the duplicate switch statement in
WaitForResourceToAppear (lines 461-488) with a call to this helper function, and
do the same for the duplicate switch statement in WaitUntilResourceIsGone (lines
490-521) to eliminate duplication and ensure consistent GVR mappings across both
functions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c1af67c2-dea7-4d3b-8a7e-8defe30cbed1

📥 Commits

Reviewing files that changed from the base of the PR and between c22cb9a and e1eac74.

📒 Files selected for processing (18)
  • integration-tests/backend/alerts.go
  • integration-tests/backend/azure_utils.go
  • integration-tests/backend/backend_suite_test.go
  • integration-tests/backend/custom_metrics.go
  • integration-tests/backend/flowcollectorslice.go
  • integration-tests/backend/ip_utils.go
  • integration-tests/backend/k8s_client.go
  • integration-tests/backend/loki.go
  • integration-tests/backend/loki_storage.go
  • integration-tests/backend/metrics.go
  • integration-tests/backend/operator.go
  • integration-tests/backend/template_processor.go
  • integration-tests/backend/test_exporters.go
  • integration-tests/backend/test_flowcollector.go
  • integration-tests/backend/test_flowcollectorslice.go
  • integration-tests/backend/test_flowmetrics.go
  • integration-tests/backend/udn.go
  • integration-tests/backend/util.go
🚧 Files skipped from review as they are similar to previous changes (5)
  • integration-tests/backend/flowcollectorslice.go
  • integration-tests/backend/k8s_client.go
  • integration-tests/backend/alerts.go
  • integration-tests/backend/metrics.go
  • integration-tests/backend/operator.go

Comment thread integration-tests/backend/template_processor.go
Comment thread integration-tests/backend/test_flowcollector.go
Comment thread integration-tests/backend/test_flowmetrics.go Outdated
@github-actions github-actions Bot added the integration-tests Changes that are only for integration-tests. label Jun 29, 2026
Amoghrd added 27 commits June 30, 2026 15:46
The previous implementation tried to use username/password authentication
directly in kubeconfig, which doesn't work in OpenShift. The correct
approach is to use the OAuth password grant flow.

This implementation replicates what 'oc login -u kubeadmin -p <password>' does:
1. Make HTTP request to /oauth/authorize with basic auth credentials
2. Extract access_token from the 302 redirect Location header
3. Return the token for use in API calls

The OAuth endpoint uses the 'openshift-challenging-client' client_id with
response_type=token, which is the standard OpenShift OAuth flow for CLI clients.

Fixes: All 41 tests were failing in BeforeEach at util.go:1008 because
the previous implementation couldn't authenticate to list secrets.

Related: netobserv#2806
Migrated 17 compat_otp helper function calls to use native Ginkgo
and our existing client-go implementations:

Template Operations (10 calls):
- Replace 5x compat_otp.ProcessTemplate with processTemplate(client, ...)
- Replace 3x compat_otp.ApplyClusterResourceFromTemplateWithError
  with applyResourceFromTemplateByAdmin(client, ...)
- Replace 2x compat_otp.ApplyNsResourceFromTemplate
  with applyResourceFromTemplateByAdmin(client, ...)

Test Helpers (12 calls):
- Replace 12x compat_otp.By with g.By (native Ginkgo)

Files modified:
- loki.go: Removed compat_otp import, added g import for g.By
- test_exporters.go: 1 ApplyNsResourceFromTemplate replacement
- test_flowcollector.go: 5 ProcessTemplate + 3 ApplyCluster + 1 ApplyNs + 11 By

All template operations now use our processTemplate and
applyResourceFromTemplateByAdmin functions with client-go instead of oc.

Remaining compat_otp usage: ~100 calls (mostly pod operations)
Added helper functions in util.go:
- getAllPods(client, namespace) - lists all pod names
- getAllPodsWithLabel(client, namespace, label) - lists pods matching label
- getPodNodeName(client, namespace, podName) - gets node name for pod

Replaced 24 compat_otp helper calls in test_flowcollector.go:
- 5x GetAllPods → getAllPods
- 11x GetAllPodsWithLabel → getAllPodsWithLabel
- 3x GetPodNodeName → getPodNodeName
- 4x DebugNodeWithChroot → debugNodeWithCommand
  (converted variadic args to single command string)

All pod operations now use client.clientset.CoreV1().Pods() instead of oc.

Remaining compat_otp usage: ~76 calls
Added assertAllPodsToBeReady function in util.go:
- Polls with 10s interval, 4min timeout
- Checks all pods have Ready condition = True
- Skips completed pods (PodSucceeded phase)
- Uses client.clientset.CoreV1().Pods().List()

Replaced 33 compat_otp.AssertAllPodsToBeReady calls:
- 25 in test_flowcollector.go
- 8 in test_flowcollectorslice.go

All pod readiness checks now use native Kubernetes client.

Remaining compat_otp usage: ~43 calls
Added helper functions in util.go:
- waitAndGetSpecificPodLogs(client, namespace, podName, filter)
  Polls for logs containing filter string (20s interval, 10min timeout)
- getFirstWorkerNode(client) - returns first worker node
- addLabelToNode(client, node, label, value) - adds label to node
- deleteLabelFromNode(client, node, label) - removes label from node

Replaced 8 compat_otp helper calls:
- 2x WaitAndGetSpecificPodLogs → waitAndGetSpecificPodLogs
- 3x GetFirstWorkerNode → getFirstWorkerNode
- 1x AddLabelToNode → addLabelToNode
- 1x DeleteLabelFromNode → deleteLabelFromNode
- Fixed label name in defer (was 'test', should be 'netobserv-agent')

All node operations now use client.clientset.CoreV1().Nodes().

Remaining compat_otp usage: ~35 calls
Added helper functions in util.go:
- setNamespacePrivileged(client, namespace) - sets pod security labels
- getNodeLabelValue(client, node, label) - gets label value from node
- isHypershiftHostedCluster(client) - checks if cluster is hypershift
- isTechPreviewNoUpgrade(client) - checks TechPreviewNoUpgrade feature set

Replaced 13 compat_otp helper calls:
- 2x SetNamespacePrivileged → setNamespacePrivileged
- 2x GetResourceSpecificLabelValue → getNodeLabelValue
- 2x IsHypershiftHostedCluster → isHypershiftHostedCluster
- 1x CheckNetworkType → checkNetworkType (existing function)
- 1x IsTechPreviewNoUpgrade → isTechPreviewNoUpgrade
- 4x AssertWaitPollNoErr → o.Expect().NotTo(o.HaveOccurred())
- 1x GetRandomString → getRandomString (already migrated earlier)

All cluster detection and assertions now use client-go.

Remaining compat_otp usage: ~22 calls (mostly cloud provider operations)
Replaced remaining compat_otp.GetRandomString() with getRandomString().

All test helper operations now use client-go.

Remaining compat_otp usage (22 total):
- 4x NewCLI - initialization of oc variable in test files
- 4x KubeConfigPath - used with NewCLI
- 14x cloud provider operations (OpenStack, Azure, GCS)
  for Loki storage backend testing - these wrap third-party SDKs
Migrated performManagedIdentityAndSecretSetupForAzureWIF to use client-go:
- Replaced oc call to get serviceAccountIssuer with getOIDC(client)
- Removed oc parameter from function signature
- Removed exutil import from azure_utils.go

This function now uses only client-go for Kubernetes operations.

Remaining oc usage in loki_storage.go and multitenants.go is for:
- Cloud provider SDK operations (OpenStack, Azure, GCS)
- OAuth/HTPasswd secret operations with --from-file
- Subscription patching operations
Added helper functions in util.go:
- getSubscriptionByPackageName(client, namespace, packageName)
  Finds subscription metadata.name by spec.name (package name)
- patchSubscriptionRemoveConfig(client, namespace, subscriptionName)
  Removes spec.config from subscription using JSON patch

Replaced oc calls in loki_storage.go removeObjectStorage:
- oc get sub + jsonpath → getSubscriptionByPackageName
- oc patch sub --type=json → patchSubscriptionRemoveConfig

Added types import for JSONPatchType.

Remaining oc usage in removeObjectStorage is only for cloud provider
SDK operations (Azure, GCS, OpenStack blob container deletion).
Added helper functions in util.go:
- getOAuthHTPasswdSecretName(client) - get htpasswd secret name from OAuth
- createSecretFromFile(client, namespace, name, key, filePath) - create secret
- extractSecretToFile(client, namespace, name, key, outputPath) - extract secret
- updateSecretFromFile(client, namespace, name, key, filePath) - update secret
- patchOAuthAddHTPasswdIdentityProvider(client, secretName) - add HTPasswd IDP

Migrated multitenants.go functions:
- getNewUser: removed oc parameter, now uses client-go for all operations
- userCleanup: removed oc parameter, now uses client-go
- Removed exutil import from multitenants.go

Replaced oc operations:
- oc get oauth/cluster -o jsonpath → getOAuthHTPasswdSecretName
- oc create secret --from-file → createSecretFromFile
- oc extract secret → extractSecretToFile
- oc set data secret --from-file → updateSecretFromFile
- oc patch oauth → patchOAuthAddHTPasswdIdentityProvider

All OAuth and HTPasswd secret operations now use native client-go!
Added functions in azure_utils.go:
- getAzureCloudStorageURISuffix(client) - get blob URI suffix by cloud type
- newAzureContainerClient(client, accountName, accountKey, containerName)
- createAzureStorageBlobContainer(client, accountName, accountKey, containerName)
- deleteAzureStorageBlobContainer(client, accountName, accountKey, containerName)
- emptyAzureBlobContainer(azClient, containerName)

Replaced compat_otp Azure operations in loki_storage.go:
- compat_otp.NewAzureContainerClient → newAzureContainerClient
- compat_otp.CreateAzureStorageBlobContainer → createAzureStorageBlobContainer
- compat_otp.DeleteAzureStorageBlobContainer → deleteAzureStorageBlobContainer

All Azure blob container operations now use native Azure SDK directly
without oc CLI dependency. Cloud name detection now uses client-go to
query infrastructure.status.platformStatus.azure.cloudName.
Created gcp_utils.go with functions:
- listGCSBuckets(client, projectID) - list all buckets in project
- emptyGCSBucket(client, bucketName) - remove all objects from bucket
- createGCSBucket(projectID, bucketName) - create or empty bucket
- deleteGCSBucket(bucketName) - delete bucket and all objects

Replaced compat_otp GCS operations in loki_storage.go:
- compat_otp.CreateGCSBucket → createGCSBucket
- compat_otp.DeleteGCSBucket → deleteGCSBucket

All GCS bucket operations now use native Google Cloud Storage SDK
directly without any oc CLI or origin dependency. Credentials are
loaded from GOOGLE_APPLICATION_CREDENTIALS environment variable.
Created openstack_utils.go with functions:
- openstackCredentials struct (replaces compat_otp.OpenstackCredentials)
- getOpenStackCredentials(client) - get credentials from kube-system secret
- newOpenStackClient(cred, serviceType) - create gophercloud client
- getAuthenticatedUserID(providerClient) - extract user ID from auth
- getOpenStackUserIDAndDomainID(cred) - get user and domain IDs
- createOpenStackContainer(client, name) - create or empty container
- deleteOpenStackContainer(client, name) - delete container
- emptyOpenStackContainer(client, name) - remove all objects

Replaced compat_otp OpenStack operations in loki_storage.go:
- compat_otp.GetOpenStackCredentials → getOpenStackCredentials
- compat_otp.NewOpenStackClient → newOpenStackClient
- compat_otp.GetOpenStackUserIDAndDomainID → getOpenStackUserIDAndDomainID
- compat_otp.CreateOpenStackContainer → createOpenStackContainer
- compat_otp.DeleteOpenStackContainer → deleteOpenStackContainer
- compat_otp.OpenstackCredentials → openstackCredentials

Removed compat_otp import from loki_storage.go!

All OpenStack Swift operations now use native gophercloud SDK with
credentials from client-go secret access instead of oc extract.
Added helper functions in util.go:
- generateTestNamespace(prefix) - creates unique test namespace names
- createNamespace(client, name) - creates namespace
- getServerURL(client) - gets API server URL from client config
- getCurrentContext() - gets current kubeconfig context
- getNodeArchitecture(client) - gets first node's architecture
- getPodNameWithLabel(client, namespace, label) - finds pod by label
- getServiceClusterIP(client, namespace, label) - gets service IP by label
- execInPod(client, namespace, pod, command) - executes command in pod

Migrated test_flowmetrics.go:
- Removed oc variable and compat_otp import
- Replaced oc.NewCLI/KubeConfigPath with generateTestNamespace
- Now uses namespace variable instead of oc.Namespace()

All operations now use client-go. Ready to migrate remaining test files.
- Replace oc variable with namespace string
- Replace all oc.AsAdmin().Run('get') calls with client-go APIs
- Replace oc.AsAdmin().Run('delete') calls with typed/dynamic clients
- Replace oc.AsAdmin().Run('exec') with execInPod helper
- Use getPodNameWithLabel and getServiceClusterIP helpers
- Remove compat_otp import entirely

All oc calls eliminated from test_exporters.go.
- Remove oc variable declaration
- Add patchFlowCollector and patchFlowCollectorMerge helpers
- Add scaleDeployment helper function
- Replace namespace generation with generateTestNamespace
- Replace server URL and context functions
- Replace flowcollector patch operations (JSON and merge patches)
- Replace deployment scale operations
- Replace pod listing with field selectors
- Replace HPA deletion with client-go
- Add metav1 import

Still has ~100+ oc calls remaining that need migration:
- must-gather operations
- taint operations
- namespace creation/deletion
- user/context management
- various kubectl run/exec operations
- Add taintNode and removeTaintFromNode helpers
- Add getPodEnvValue and getPodEnvByIndex helpers
- Replace taint operations with client-go node updates
- Replace namespace creation/deletion with helpers
- Replace must-gather with kubectl exec
- Replace pod env queries with client-go helpers
- Replace pod logs with getPodLogs helper
- Replace service queries with client-go
- Replace kubectl apply/config with exec.Command
- Fix namespace generation to use generateTestNamespace
- Remove oc parameter from loki_storage methods
- Fix syntax errors in namespace defer cleanup

Still has ~30+ oc calls remaining to complete migration.
- Remove oc parameter from loki_storage methods
- Add patchSubscription helper for subscription patches
- Add labelNamespace and removeLabelFromNamespace helpers
- Replace subscription patch with client-go
- Replace namespace label operations with client-go
- Replace oc.Namespace() with generateTestNamespace
- Replace user context operations with kubectl commands
- Fix g.DeferCleanup syntax for deleteNamespace

Remaining oc calls: ~25 (UDN operations, scale, get operations, etc.)
- Fix serverURL scope issue in user context test
- Fix deleteNamespace spacing issues
- Now builds successfully with 30 oc calls remaining

Remaining oc operations are complex:
- UDN namespace creation (CreateSpecificNamespaceUDN)
- Background exec operations
- CRD/ClusterVersion queries with jsonpath
- Pause/resume flowcollector operations
- Various kubectl create/delete/get operations
Replaced all remaining oc calls (30 → 0):
- Background exec operations with exec.Command().Start()
- CRD checks with apiextClient (added to K8sClient)
- Secret checks with client-go APIs
- Pod logs with getPodLogs helper
- FlowCollector queries with dynamic client + unstructured
- Pause/resume operations with patchFlowCollectorMerge
- Component listing with kubectl commands
- UDN namespace creation with kubectl
- Various kubectl create/delete operations
- Namespace labeling with labelNamespace helper

Fixes:
- Added apiextClient to K8sClient struct
- Fixed removeLabelFromNamespace bug
- Removed compat_otp import entirely
- Removed exutil import from loki_storage.go
- Fixed undefined ctx/gvr declarations
- Updated test_flowcollectorslice.go loki method calls

test_flowcollector.go: ✅ 0 oc calls (100% migrated - 3714 lines)
Build: ✅ Success

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
integration-tests/backend/util.go (1)

130-142: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Apply all template documents through the shared YAML path.

yaml.Unmarshal only loads one document, and the local Kind + "s" GVR derivation has the same pluralization issue as applyObject. Use the multi-doc apply path, then wait for r to appear.

Proposed direction
-	data, err := os.ReadFile(file)
-	if err != nil {
-		return fmt.Errorf("failed to read file %s: %w", file, err)
-	}
-
-	var obj unstructured.Unstructured
-	if err := yaml.Unmarshal(data, &obj); err != nil {
-		return fmt.Errorf("failed to unmarshal YAML: %w", err)
-	}
-
-	gvk := obj.GroupVersionKind()
-	gvr := schema.GroupVersionResource{
-		Group:    gvk.Group,
-		Version:  gvk.Version,
-		Resource: strings.ToLower(gvk.Kind) + "s",
-	}
-
-	_, err = client.dynamic.Resource(gvr).Namespace(r.Namespace).Create(ctx, &obj, metav1.CreateOptions{})
-	if err != nil && !apierrors.IsAlreadyExists(err) {
-		return fmt.Errorf("failed to create resource: %w", err)
-	}
+	if err := client.ApplyYAMLWithNamespace(ctx, file, r.Namespace); err != nil {
+		return fmt.Errorf("failed to apply resource template: %w", err)
+	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/util.go` around lines 130 - 142, The
single-document YAML handling in the apply flow is incomplete: `yaml.Unmarshal`
only processes one document, and the ad hoc `GroupVersionResource` construction
from `obj.GroupVersionKind()` duplicates the same pluralization problem as
`applyObject`. Update this path to use the shared multi-document YAML apply
logic already used elsewhere, so every template document is applied through the
same helper and resource mapping, then wait for the target `r` to become present
after all documents are created.
integration-tests/backend/udn.go (1)

327-350: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Wait for layer2 dual-stack UDN readiness.

The layer2 dual-stack branch creates the UDN and exits without waitUDNCRDApplied; only the default branch waits.

Proposed fix
 		case "dualstack":
 			udncrd = udnCRDResource{
 				crdname:   crdName,
 				namespace: namespace,
@@
 				template:  udnCRDLayer2dualStack,
 			}
 			udncrd.createLayer2DualStackUDNCRD(client)
 
 		default:
@@
 			}
 			udncrd.createLayer2SingleStackUDNCRD(client)
-			err := waitUDNCRDApplied(client, namespace, udncrd.crdname)
-			o.Expect(err).NotTo(o.HaveOccurred())
 		}
+		err := waitUDNCRDApplied(client, namespace, udncrd.crdname)
+		o.Expect(err).NotTo(o.HaveOccurred())
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/udn.go` around lines 327 - 350, The layer2
dual-stack path in the UDN creation flow returns immediately after
createLayer2DualStackUDNCRD without waiting for readiness. Update the "layer2"
switch branch in udnCRDResource handling so the dualstack case also calls
waitUDNCRDApplied, matching the single-stack path; use the existing
createLayer2DualStackUDNCRD and waitUDNCRDApplied helpers to keep behavior
consistent.
integration-tests/backend/loki_client.go (1)

33-57: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Compare memory using the same unit.

remainingMemory is converted to Mi, but requiredMemory.MilliValue() is milli-bytes. This makes the Loki resource gate wrong.

Proposed fix
-	var remainingCPU, remainingMemory int64
+	var remainingCPU, remainingMemoryBytes int64
@@
 		cpu := node.Status.Allocatable.Cpu().MilliValue()
 		mem := node.Status.Allocatable.Memory().Value()
 		remainingCPU += cpu
-		remainingMemory += mem / (1024 * 1024) // Convert to Mi
+		remainingMemoryBytes += mem
 	}
@@
-	e2e.Logf("the remaining cpu is: %d, and the remaning memory is: %d", remainingCPU, remainingMemory)
-	return remainingCPU > requiredCPU.MilliValue() && remainingMemory > requiredMemory.MilliValue()
+	e2e.Logf("the remaining cpu is: %d, and the remaning memory is: %d", remainingCPU, remainingMemoryBytes)
+	return remainingCPU > requiredCPU.MilliValue() && remainingMemoryBytes > requiredMemory.Value()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/loki_client.go` around lines 33 - 57, The memory
comparison in compareClusterResources is using mismatched units, since
remainingMemory is accumulated in Mi while requiredMemory is logged and compared
via MilliValue(). Update the Loki resource gate so both sides use the same unit
consistently, using the parsed quantity from k8sresource.ParseQuantity and a
matching conversion for remainingMemory before the final comparison.
🟠 Major comments (29)
integration-tests/backend/test_flowcollectorslice.go-488-510 (1)

488-510: 🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Create the rolebindings before switching to the test user.

Line 490 moves kubectl to userContext, but Lines 495-509 and 644-647 still use kubectl to grant RBAC. Those commands now run as the unprivileged user they are supposed to authorize, so setup will fail if the switch worked, and will silently test the wrong identity if it did not. Keep the rolebinding creation on the admin context, then switch to userContext for the permission checks.

Also applies to: 644-647

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/test_flowcollectorslice.go` around lines 488 - 510,
The RBAC setup is happening after kubectl switches to userContext, so the
rolebinding creation in the test flow is executed as the target unprivileged
user instead of the admin context. Move the rolebinding creation calls in the
test flow around the kubectl config use-context step so `exec.Command("kubectl",
"create", "rolebinding", ...)` runs before switching contexts, then keep the
context switch only for the subsequent permission checks. Apply the same fix
anywhere else in this test that creates RBAC bindings after the context switch,
including the later rolebinding block referenced by the same pattern.
integration-tests/backend/test_exporters.go-139-147 (1)

139-147: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Read the FlowCollector as cluster-scoped.

DeleteFlowcollector in integration-tests/backend/flowcollector.go deletes "cluster" without a namespace, but this check adds .Namespace(namespace). That makes the verification path inconsistent with the resource contract and can 404 before the exporter assertion runs. Also check ResolveResource instead of discarding it.

Suggested fix
 		g.By("Verify flowcollector is deployed with IPFIX exporter")
 		ctx := context.Background()
-		gvr, _ := client.ResolveResource("flowcollector")
-		fc, err := client.dynamic.Resource(gvr).Namespace(namespace).Get(ctx, "cluster", metav1.GetOptions{})
+		gvr, err := client.ResolveResource("flowcollector")
+		o.Expect(err).NotTo(o.HaveOccurred())
+		fc, err := client.dynamic.Resource(gvr).Get(ctx, "cluster", metav1.GetOptions{})
 		o.Expect(err).ToNot(o.HaveOccurred())
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/test_exporters.go` around lines 139 - 147, The
FlowCollector lookup in the exporter test is using a namespace even though the
resource is cluster-scoped, which makes the verification inconsistent with the
delete path and can cause a 404. Update the
`ResolveResource`/`dynamic.Resource(...).Get(...)` flow in `TestExporters` to
fetch the `cluster` FlowCollector without `.Namespace(namespace)`, and handle
the result from `ResolveResource` instead of ignoring it. Keep the existing
exporter assertions after the corrected cluster-scoped fetch.
integration-tests/backend/test_flowmetrics.go-67-78 (1)

67-78: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Register teardown only after the FlowCollector is created.
integration-tests/backend/test_flowmetrics.go:67-78 g.AfterEach still runs on BeforeEach failures and skips, but DeleteFlowcollector always deletes the cluster-scoped "cluster" FlowCollector. If setup exits before flow.CreateFlowcollector(client), this cleanup can remove a collector created by another spec. Move the cleanup registration to after the create call succeeds.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/test_flowmetrics.go` around lines 67 - 78, The
teardown in test_flowmetrics.go is registered too early and can delete the
shared cluster-scoped FlowCollector even when setup fails. Update the
BeforeEach/AfterEach flow around Flowcollector.CreateFlowcollector and
Flowcollector.DeleteFlowcollector so the cleanup registration happens only after
CreateFlowcollector(client) succeeds, ensuring DeleteFlowcollector is not run
for skipped or failed setup.
integration-tests/backend/template_processor.go-143-145 (1)

143-145: 🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Write processed manifests with private temp-file permissions.

Template output can include Secrets or credentials. 0644 makes the processed manifest readable by other local users while the test is running; use os.CreateTemp, which creates files with 0600.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/template_processor.go` around lines 143 - 145, The
temporary manifest file in the template processing flow is being created with
world-readable permissions, which can expose Secrets or credentials. Update the
write logic in the template processor helper to use os.CreateTemp instead of
building a path with filepath.Join/os.TempDir and os.WriteFile, so the file is
created with private 0600 permissions; keep the surrounding output serialization
and cleanup behavior in the same processing path.
integration-tests/backend/util.go-173-175 (1)

173-175: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Strip the leading dot before NestedString.

For jsonpaths like {.status.phase}, strings.Trim(jsonPath, "{}") leaves .status.phase, so strings.Split produces an empty first field and lookups never match.

Proposed fix
-		fields := strings.Split(strings.Trim(jsonPath, "{}"), ".")
+		path := strings.TrimPrefix(strings.Trim(jsonPath, "{}"), ".")
+		fields := strings.Split(path, ".")

Apply the same change to assertResourceStatus.

Also applies to: 261-263

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/util.go` around lines 173 - 175, The jsonpath
parsing in the `NestedString` lookup still leaves a leading dot for expressions
like `{.status.phase}`, causing the first path segment to be empty and the
lookup to fail. Update the parsing in the helper that builds `fields` before
calling `unstructured.NestedString` to strip any leading dot after removing `{}`
so the split path starts with `status`; apply the same correction in
`assertResourceStatus` as well.
integration-tests/backend/template_processor.go-105-109 (1)

105-109: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Do not let ignoreUnknown bypass required parameters.

--ignore-unknown-parameters=true should ignore extra supplied keys, not allow required template parameters to become empty strings. Missing required values should still return an error.

Proposed fix
-		} else if param.Required && !ignoreUnknown {
+		} else if param.Required {
 			return "", fmt.Errorf("parameter %s is required but not provided", param.Name)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/template_processor.go` around lines 105 - 109,
`ignoreUnknown` in `processTemplate` is incorrectly allowing required parameters
to fall through to the empty-string branch. Update the required-parameter
handling so `param.Required` always returns an error when the value is missing,
regardless of `ignoreUnknown`, and keep `ignoreUnknown` limited to skipping
unexpected extra keys; check the `param.Required` / `paramValues` logic in
`template_processor.go` to make sure only optional parameters get default empty
strings.
integration-tests/backend/template_processor.go-134-140 (1)

134-140: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Separate generated YAML documents with a newline.

When rawObj.Raw does not end with \n, the next --- is appended to the previous document body, producing invalid YAML.

Proposed fix
 	for i, objYAML := range processedYAMLs {
 		if i > 0 {
-			output.WriteString("---\n")
+			output.WriteString("\n---\n")
 		}
 		output.Write(objYAML)
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/template_processor.go` around lines 134 - 140, The
multi-document assembly in template_processor.go is missing a newline before
separators when a prior YAML chunk does not end with one, causing the next
document marker to be glued to the body. Update the output-building logic in the
loop that writes processedYAMLs so each document separator is always on its own
line, preserving valid YAML regardless of whether rawObj.Raw ends with a
trailing newline.
integration-tests/backend/k8s_client.go-373-376 (1)

373-376: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Default namespace only after scope resolution

ApplyYAMLWithNamespace and ApplyResourceFromFile set a namespace before resource scope is known, so cluster-scoped manifests end up on namespaced requests and fail. Also replace the Kind + "s" GVR construction in integration-tests/backend/k8s_client.go and integration-tests/backend/util.go with REST-mapper lookup so non-standard plurals resolve correctly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/k8s_client.go` around lines 373 - 376, Update
ApplyYAMLWithNamespace and ApplyResourceFromFile so the default namespace is
applied only after resource scope is resolved, not before, and skip namespace
assignment for cluster-scoped objects. Also replace the manual Kind + "s" GVR
construction in the k8s client and helper code with REST-mapper-based lookup so
resource pluralization is resolved correctly for non-standard kinds.
integration-tests/backend/backend_suite_test.go-122-126 (1)

122-126: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Use JUNIT_REPORT_FILE for the filtered JUnit path. reporterConfig.JUnitReport still routes through the built-in --junit-report path, so Ginkgo can write the unfiltered report over this file. Read only JUNIT_REPORT_FILE here, or clear the flag value before RunSpecs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/backend_suite_test.go` around lines 122 - 126, The
JUnit report path in the backend test setup is still being taken from
g.GinkgoConfiguration().JUnitReport, which can let Ginkgo write the unfiltered
report over the filtered file. Update the setup around g.GinkgoConfiguration()
and RunSpecs so the filtered report path is sourced only from JUNIT_REPORT_FILE,
or explicitly clear the built-in junit-report value before RunSpecs to prevent
the flag from being reused.
integration-tests/backend/util.go-1541-1544 (1)

1541-1544: 🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Reuse the kubeconfig transport for OAuth
This request sends kubeadmin credentials, so it should reuse the existing kubeconfig-backed transport instead of a bare http.Transport with InsecureSkipVerify, and set an HTTP timeout.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/util.go` around lines 1541 - 1544, The OAuth
request setup in the test helper is using a bare http.Transport with
InsecureSkipVerify instead of the existing kubeconfig-backed transport, and it
also lacks an HTTP timeout. Update the transport creation in the helper that
builds the OAuth client to reuse the kubeconfig transport used elsewhere in the
integration tests, then apply a reasonable timeout on the HTTP client so
kubeadmin credential requests do not hang indefinitely.

Source: Linters/SAST tools

integration-tests/backend/k8s_client.go-486-493 (1)

486-493: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Use the mapper API that matches a resource string. RESTMapping(schema.GroupKind{}, resource) won’t resolve the inputs passed to ResolveResource; switch this helper to ResourceFor, and add ShortcutExpander only if short names like cm are meant to work too.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/k8s_client.go` around lines 486 - 493, The
ResolveResource helper is using the wrong mapper call for a raw resource string;
update K8sClient.ResolveResource to use the mapper API that resolves a resource
name directly, rather than RESTMapping with an empty GroupKind. If short names
are expected to work, wrap the mapper with ShortcutExpander before calling
ResourceFor; otherwise keep the lookup limited to direct resource names and
preserve the existing error handling in ResolveResource.
integration-tests/backend/virtualization.go-94-107 (1)

94-107: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Select readiness conditions by type, not array position.

status.conditions[0] is not guaranteed to be the Available/Ready condition. This can pass on the wrong condition or keep polling after the resource is ready.

Proposed fix pattern
-		firstCondition, ok := conditions[0].(map[string]interface{})
-		if !ok {
-			return false, nil
-		}
-
-		status, ok := firstCondition["status"].(string)
-		if !ok || status != "True" {
-			return false, nil
+		for _, condition := range conditions {
+			conditionMap, ok := condition.(map[string]interface{})
+			if !ok {
+				continue
+			}
+			conditionType, _, _ := unstructured.NestedString(conditionMap, "type")
+			status, _, _ := unstructured.NestedString(conditionMap, "status")
+			if conditionType == "Available" && status == "True" {
+				return true, nil
+			}
 		}
-
-		return true, nil
+		return false, nil

Use conditionType == "Ready" in waitUntilVMReady.

Also applies to: 151-166

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/virtualization.go` around lines 94 - 107, The
readiness check in waitUntilVMReady is using the first entry in
status.conditions, which can miss the actual Ready/Available condition. Update
the logic to iterate the conditions slice and select the condition by its type
field (for example, type == "Ready") before checking its status, and apply the
same change to the duplicate condition check elsewhere in the file. Use the
existing waitUntilVMReady helper and the status.conditions handling as the
anchor points for the fix.
integration-tests/backend/udn.go-364-369 (1)

364-369: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Only retry not-found errors during readiness polling.

These loops currently swallow every GET error, including RBAC, mapper, and API failures, turning actionable failures into slow timeouts.

Proposed fix
 		obj, efErr := client.dynamic.Resource(gvr).Namespace(ns).Get(ctx, crdName, metav1.GetOptions{})
 		if efErr != nil {
+			if !apierrors.IsNotFound(efErr) {
+				return false, efErr
+			}
 			e2e.Logf("Failed to get UDN %v, error: %s. Trying again", crdName, efErr)
 			return false, nil
 		}
@@
 		obj, efErr := client.dynamic.Resource(gvr).Get(ctx, crdName, metav1.GetOptions{})
 		if efErr != nil {
+			if !apierrors.IsNotFound(efErr) {
+				return false, efErr
+			}
 			e2e.Logf("Failed to get CUDN %v, error: %s. Trying again", crdName, efErr)
 			return false, nil
 		}

Also applies to: 403-408

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/udn.go` around lines 364 - 369, The readiness
polling in the UDN test helper is currently retrying every GET failure, which
hides real failures. Update the polling logic in the get-and-check block used by
the UDN readiness wait so it only returns false/continues when the error is a
not-found condition, and otherwise surfaces the error immediately. Use the
existing dynamic client GET call and the polling callback in the UDN helper to
detect and special-case not-found while letting RBAC, discovery/mapper, and API
errors fail fast.
integration-tests/backend/sctp.go-20-28 (1)

20-28: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Bound kubectl debug calls with a timeout.

Eventually cannot interrupt a callback stuck inside debugNodeWithCommand; the provided helper uses exec.CommandContext(context.Background()), so a hung debug command can hang the suite. Add a timeout in debugNodeWithCommand.

Proposed fix in `integration-tests/backend/util.go`
-	ctx := context.Background()
+	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
+	defer cancel()

Also applies to: 38-38

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/sctp.go` around lines 20 - 28, The
`debugNodeWithCommand` helper can block forever because it runs `kubectl debug`
with an uncancelable background context, so `Eventually` cannot recover from a
hung command. Update `debugNodeWithCommand` in the util helper to use a real
timeout-capable context for `exec.CommandContext`, and make sure callers like
the SCTP install flow in `sctp.go` can rely on it returning within a bounded
time. Keep the existing call sites unchanged aside from any needed timeout
parameter wiring if you add one.
integration-tests/backend/virtualization.go-44-51 (1)

44-51: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Scan for any metal worker instead of requiring every worker to be metal.

The current loop returns false on the first non-metal worker and returns true when the worker list is empty.

Proposed fix
 	for _, node := range nodes.Items {
 		instanceType, ok := node.Labels["node.kubernetes.io/instance-type"]
-		if !ok || !strings.Contains(instanceType, "metal") {
-			e2e.Logf("Cluster does not have metal worker nodes")
-			return false
+		if ok && strings.Contains(instanceType, "metal") {
+			return true
 		}
 	}
-	return true
+	e2e.Logf("Cluster does not have metal worker nodes")
+	return false
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/virtualization.go` around lines 44 - 51, The
metal-node check in the worker scan is too strict because it returns false on
the first non-metal node and true when there are no nodes; update the logic in
the function that iterates over nodes.Items to return true as soon as any node
has a node.kubernetes.io/instance-type containing "metal", and only return false
after checking all nodes. Keep the existing e2e.Logf message for the no-metal
case, but move it to the final failure path rather than inside the loop.
integration-tests/backend/sctp.go-74-92 (1)

74-92: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Do not ignore node list errors.

If either OS-specific node list fails, SCTP preparation silently continues with an empty slice and the test proceeds without enabling SCTP.

Proposed fix
 	rhelWorkers, err := client.clientset.CoreV1().Nodes().List(ctx, metav1.ListOptions{
 		LabelSelector: "node-role.kubernetes.io/worker,node.openshift.io/os_id=rhel",
 	})
+	o.Expect(err).NotTo(o.HaveOccurred())
 	var rhelWorkerNames []string
-	if err == nil && len(rhelWorkers.Items) > 0 {
+	if len(rhelWorkers.Items) > 0 {
 		for _, node := range rhelWorkers.Items {
 			rhelWorkerNames = append(rhelWorkerNames, node.Name)
 		}
 	}
 
 	rhcosWorkers, err := client.clientset.CoreV1().Nodes().List(ctx, metav1.ListOptions{
 		LabelSelector: "node-role.kubernetes.io/worker,node.openshift.io/os_id=rhcos",
 	})
+	o.Expect(err).NotTo(o.HaveOccurred())
 	var rhcosWorkerNames []string
-	if err == nil && len(rhcosWorkers.Items) > 0 {
+	if len(rhcosWorkers.Items) > 0 {
 		for _, node := range rhcosWorkers.Items {
 			rhcosWorkerNames = append(rhcosWorkerNames, node.Name)
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/sctp.go` around lines 74 - 92, The SCTP node
discovery in the preparation logic is swallowing errors from the
CoreV1().Nodes().List calls and continuing with empty worker lists, which can
hide setup failures. In the SCTP setup path, make the code in the node-listing
block fail fast when either the rhel or rhcos node query returns an error
instead of only checking len(items); surface the error from the
client.clientset.CoreV1().Nodes().List calls so SCTP preparation stops if node
discovery fails.
integration-tests/backend/azure_utils.go-290-298 (1)

290-298: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Keep sovereign cloud suffix handling consistent.

getStorageAccountURISuffixAndEnvForAzure handles Gov, China, and Germany clouds, but getAzureCloudStorageURISuffix only handles Gov. The shared-key helpers will hit the public Azure endpoint on China/Germany clusters.

Proposed fix
 	cloudName, _, _ := unstructured.NestedString(obj.Object, "status", "platformStatus", "azure", "cloudName")
-	if strings.ToLower(cloudName) == "azureusgovernmentcloud" {
+	switch strings.ToLower(cloudName) {
+	case "azureusgovernmentcloud":
 		return ".blob.core.usgovcloudapi.net"
+	case "azurechinacloud":
+		return ".blob.core.chinacloudapi.cn"
+	case "azuregermancloud":
+		return ".blob.core.cloudapi.de"
 	}
-	// Add other clouds as needed:
-	// "azurechinacloud" -> ".blob.core.chinacloudapi.cn"
-	// "azuregermancloud" -> ".blob.core.cloudapi.de"
 
 	return ".blob.core.windows.net" // AzurePublicCloud default
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/azure_utils.go` around lines 290 - 298, The cloud
suffix handling is inconsistent between getStorageAccountURISuffixAndEnvForAzure
and getAzureCloudStorageURISuffix, causing shared-key helpers to fall back to
the public Azure endpoint for sovereign clouds. Update
getAzureCloudStorageURISuffix to recognize the same Azure cloudName values
already handled in getStorageAccountURISuffixAndEnvForAzure, especially China
and Germany in addition to Gov, and return the matching blob URI suffix for each
case with the public suffix as the default.
integration-tests/backend/azure_utils.go-168-175 (1)

168-175: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Fix ResolveResource to use a real kind/GVK lookup. RESTMapping(schema.GroupKind{}, gvr.Resource) passes the resource name into the version slot, so helpers like getAzureResourceGroupFromCluster will fail to resolve infrastructure and other dynamic resources.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/azure_utils.go` around lines 168 - 175,
`ResolveResource` is doing a REST mapping with the resource name in the wrong
place, so `getAzureResourceGroupFromCluster` cannot reliably resolve dynamic
resources like `infrastructure`. Update the `K8sClient.ResolveResource` path to
look up the real kind/GVK (for example via the resource’s GroupKind/metadata
rather than passing `gvr.Resource` as the version), and keep the
`getAzureResourceGroupFromCluster` call site unchanged so it can resolve the
correct mapping.
integration-tests/backend/flowcollector_utils.go-96-98 (1)

96-98: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Bound this in-pod curl.

execCommandInSpecificPod already uses an unbounded context upstream, so this curl -s can hang the suite if the collector endpoint stops responding. Add --fail --max-time ... here and/or pass a timeout into the exec helper.

Suggested fix
-	cmd := "curl -s http://localhost:8080/records?format=json"
+	cmd := "curl --fail --silent --show-error --max-time 10 http://localhost:8080/records?format=json"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/flowcollector_utils.go` around lines 96 - 98, The
in-pod curl in flowcollector_utils.go can hang the test suite because
execCommandInSpecificPod is using an unbounded context. Update the command built
near the collector HTTP query to include curl timeout/failure flags such as
--fail and --max-time, and/or change execCommandInSpecificPod to accept and
enforce a timeout context. Keep the fix localized to the collector polling path
so the helper and its callers remain easy to identify.
integration-tests/backend/gcp_utils.go-64-75 (1)

64-75: 🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Avoid project-wide bucket listing for the existence check.

listGCSBuckets(client, projectID) requires storage.buckets.list, which is broader than needed and can fail for identities that can manage a specific bucket but cannot enumerate the project. Use Bucket(bucketName).Attrs(ctx) to check existence, or create and handle AlreadyExists.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/gcp_utils.go` around lines 64 - 75, The bucket
existence check in the GCS helper is too broad because listGCSBuckets(projectID)
requires project-wide bucket listing permissions. Update the logic in the
bucket-check flow to use the storage client’s Bucket(bucketName).Attrs(ctx) for
existence verification, or switch the creation path to handle AlreadyExists
directly, so the check works with identities that can access a single bucket but
not enumerate all buckets.
integration-tests/backend/flowcollector.go-206-210 (1)

206-210: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

ResolveResource can't map flowcollector as written.
RESTMapping(schema.GroupKind{}, gvr.Resource) passes the resource name where the mapper expects a GroupKind/version, so these delete/readiness paths won't get a valid GVR. Use the FlowCollector GVK or a direct known GVR instead.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/flowcollector.go` around lines 206 - 210, The
delete/readiness flow is resolving the Flowcollector resource with the wrong
identifier, so the REST mapper cannot produce a valid mapping. Update
DeleteFlowcollector and the related ResolveResource/RESTMapping path to use the
FlowCollector GroupVersionKind (or a known GVR directly) instead of passing only
the resource name, and make sure the lookup targets the correct kind for
FlowCollector so the delete/readiness operations resolve successfully.
integration-tests/backend/openstack_utils.go-37-37 (1)

37-37: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Wire clouds.yaml.verify into the OpenStack client.

Verify is parsed but never applied when building AuthenticatedClient, so verify: false still leaves TLS verification on. That breaks OpenStack setups with self-signed or internal certs; set the provider HTTPClient from this flag or load the cloud config via gophercloud’s clouds.yaml helpers.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/openstack_utils.go` at line 37, Wire the parsed
clouds.yaml Verify flag through the OpenStack client setup so it actually
controls TLS verification. Update the logic around AuthenticatedClient in
openstack_utils.go to use the provider HTTPClient configured from this field, or
switch to gophercloud’s clouds.yaml helper path so Verify=false disables
certificate verification as intended.
integration-tests/backend/flowcollector.go-255-267 (1)

255-267: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Don't key readiness off reason substring matches.

strings.Contains(reason, "Ready") returns true for values like NotReady, and only checking conditions[0] can miss the actual Ready condition entirely. Match the Ready condition by type/status across all conditions instead.

Suggested fix
-		condition := conditions[0].(map[string]interface{})
-		reason, _, _ := unstructured.NestedString(condition, "reason")
-		message, _, _ := unstructured.NestedString(condition, "message")
-
-		if strings.Contains(reason, "Ready") {
-			return true, nil
-		}
-
-		e2e.Logf("flowcollector status is %s due to %s", reason, message)
+		for _, raw := range conditions {
+			condition, ok := raw.(map[string]interface{})
+			if !ok {
+				continue
+			}
+			conditionType, _, _ := unstructured.NestedString(condition, "type")
+			status, _, _ := unstructured.NestedString(condition, "status")
+			reason, _, _ := unstructured.NestedString(condition, "reason")
+			message, _, _ := unstructured.NestedString(condition, "message")
+
+			if conditionType == "Ready" && status == "True" {
+				return true, nil
+			}
+
+			e2e.Logf("flowcollector condition %s is %s due to %s", conditionType, status, message)
+			_ = reason
+		}
 		return false, nil
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/flowcollector.go` around lines 255 - 267, The
readiness check in flowcollector.go is incorrectly using a substring match on
the first status condition’s reason, which can treat non-ready states as ready.
Update the logic in the FlowCollector polling code to iterate over all entries
from unstructured.NestedSlice(fc.Object, "status", "conditions") and identify
the condition whose type is Ready, then verify its status rather than inspecting
reason. Keep the existing condition parsing helpers, but replace the
strings.Contains(reason, "Ready") check with an explicit Ready condition/status
match.
integration-tests/backend/kafka.go-187-195 (1)

187-195: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Check Ready=True, not just condition presence.

Strimzi resources can have a Ready condition with status: "False"/"Unknown". This currently reports readiness as soon as the condition exists.

Proposed fix
 			condType, ok := condMap["type"].(string)
-			if ok && condType == "Ready" {
+			condStatus, statusOK := condMap["status"].(string)
+			if ok && statusOK && condType == "Ready" && condStatus == "True" {
 				return true, nil
 			}
 			condType, ok := condMap["type"].(string)
-			if ok && condType == "Ready" {
+			condStatus, statusOK := condMap["status"].(string)
+			if ok && statusOK && condType == "Ready" && condStatus == "True" {
 				e2e.Logf("Waiting for kafka status %s", condType)
 				return true, nil
 			}

Also applies to: 221-229

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/kafka.go` around lines 187 - 195, The readiness
check in `hasReadyCondition` (and the similar condition scan in the other
referenced block) currently returns true as soon as it finds a `Ready`
condition, even if its status is not `True`. Update the condition handling to
inspect the `status` field on the `Ready` condition and only report ready when
it is explicitly `True`; otherwise keep scanning or return false. Use the
existing `condMap`/`condType` logic to locate the `Ready` condition and add the
status check there.
integration-tests/backend/util_cloud.go-23-24 (1)

23-24: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Return errors when cloud values are missing.

These helpers currently return empty region/project/account/key with nil error when fields or Secret keys are absent.

Proposed fix
 import (
 	"context"
+	"fmt"
 
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
 )
@@
-	region, _, err := unstructured.NestedString(obj.Object, "status", "platformStatus", "aws", "region")
-	return region, err
+	region, found, err := unstructured.NestedString(obj.Object, "status", "platformStatus", "aws", "region")
+	if err != nil {
+		return "", err
+	}
+	if !found || region == "" {
+		return "", fmt.Errorf("aws region not found in infrastructure status")
+	}
+	return region, nil
@@
-	projectID, _, err := unstructured.NestedString(obj.Object, "status", "platformStatus", "gcp", "projectID")
-	return projectID, err
+	projectID, found, err := unstructured.NestedString(obj.Object, "status", "platformStatus", "gcp", "projectID")
+	if err != nil {
+		return "", err
+	}
+	if !found || projectID == "" {
+		return "", fmt.Errorf("gcp projectID not found in infrastructure status")
+	}
+	return projectID, nil
@@
 	for _, container := range imageRegistry.Spec.Template.Spec.Containers {
 		for _, env := range container.Env {
 			if env.Name == "REGISTRY_STORAGE_AZURE_ACCOUNTNAME" {
 				accountName = env.Value
 				break
 			}
 		}
 	}
+	if accountName == "" {
+		return "", "", fmt.Errorf("azure storage account name not found in image-registry deployment")
+	}
@@
-	accountKey := secret.Data["REGISTRY_STORAGE_AZURE_ACCOUNTKEY"]
+	accountKey, ok := secret.Data["REGISTRY_STORAGE_AZURE_ACCOUNTKEY"]
+	if !ok || len(accountKey) == 0 {
+		return accountName, "", fmt.Errorf("azure storage account key not found in image registry secret")
+	}
 	return accountName, string(accountKey), nil
 }

Also applies to: 40-41, 54-69

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/util_cloud.go` around lines 23 - 24, The cloud
helper functions currently treat missing nested values or Secret keys as
success, returning empty strings with nil errors; update the helpers in
util_cloud.go to validate the extracted values and return a non-nil error when
region/project/account/key fields are absent. Use the existing nested lookup
helpers around the region/project/account/key accessors (the functions covering
the aws/gcp/azure paths) to detect missing fields after
unstructured.NestedString or Secret key reads, and propagate a descriptive error
instead of returning an empty value.
integration-tests/backend/multitenants.go-202-206 (1)

202-206: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Preserve the NotFound-tolerant cleanup path.

Line 206 re-asserts err, so a NotFound still fails after being intentionally ignored.

Proposed fix
 	err = client.dynamic.Resource(gvr).Delete(ctx, crbName, metav1.DeleteOptions{})
 	if err != nil && !apierrors.IsNotFound(err) {
 		o.Expect(err).NotTo(o.HaveOccurred())
 	}
-	o.Expect(err).NotTo(o.HaveOccurred())
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/multitenants.go` around lines 202 - 206, The
cleanup in the dynamic Delete path re-checks err after intentionally ignoring
NotFound, which causes the tolerant teardown to fail anyway. Update the Delete
handling in the multitenants test cleanup so the NotFound case is treated as
success and only unexpected errors are asserted. Keep the logic centered around
client.dynamic.Resource(gvr).Delete and the existing apierrors.IsNotFound check,
and remove the redundant final expectation on err.
integration-tests/backend/loki.go-75-90 (1)

75-90: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Return on the first workload readiness error.

err is overwritten on each loop iteration, so an earlier failed deployment/statefulset can be masked by a later successful one.

Proposed fix
 func (l lokiStack) waitForLokiStackToBeReady(client *K8sClient) error {
-	var err error
 	for _, deploy := range []string{l.Name + "-distributor", l.Name + "-gateway", l.Name + "-querier", l.Name + "-query-frontend"} {
-		err = waitForDeploymentPodsToBeReady(client, l.Namespace, deploy)
+		if err := waitForDeploymentPodsToBeReady(client, l.Namespace, deploy); err != nil {
+			return err
+		}
 	}
 	for _, ss := range []string{l.Name + "-compactor", l.Name + "-index-gateway", l.Name + "-ingester"} {
-		err = waitForStatefulsetReady(client, l.Namespace, ss)
+		if err := waitForStatefulsetReady(client, l.Namespace, ss); err != nil {
+			return err
+		}
 	}
 	if isWorkloadIdentityCluster(client) {
 		currentPlatform := checkPlatform(client)
@@
 		}
 	}
-	return err
+	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/loki.go` around lines 75 - 90, The readiness check
in lokiStack.waitForLokiStackToBeReady overwrites err on each
deployment/statefulset iteration, so a failure can be hidden by a later success.
Update the loops to stop and return immediately on the first non-nil result from
waitForDeploymentPodsToBeReady or waitForStatefulsetReady, preserving the
earliest readiness error. Keep the workload-identity validation block unchanged,
and ensure the function returns the first encountered error instead of the last
one.
integration-tests/backend/ip_utils.go-67-88 (1)

67-88: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Fail fast when the expected pod IPs are missing.

Returning ("", "") lets callers build curl :8080, so the connectivity check no longer targets the destination pod.

Proposed fix
 	if (ipStack == "ipv6single") || (ipStack == "ipv4single") {
-		if len(pod.Status.PodIPs) > 0 {
-			podIP := pod.Status.PodIPs[0].IP
-			e2e.Logf("The pod %s IP in namespace %s is %q", podName, namespace, podIP)
-			return podIP, ""
-		}
-		return "", ""
+		o.Expect(pod.Status.PodIPs).NotTo(o.BeEmpty(), "pod %s/%s has no assigned IP", namespace, podName)
+		podIP := pod.Status.PodIPs[0].IP
+		e2e.Logf("The pod %s IP in namespace %s is %q", podName, namespace, podIP)
+		return podIP, ""
 	} else if ipStack == "dualstack" {
-		if len(pod.Status.PodIPs) >= 2 {
-			podIP1 := pod.Status.PodIPs[1].IP
-			e2e.Logf("The pod's %s 1st IP in namespace %s is %q", podName, namespace, podIP1)
-			podIP2 := pod.Status.PodIPs[0].IP
-			e2e.Logf("The pod's %s 2nd IP in namespace %s is %q", podName, namespace, podIP2)
-			if netutils.IsIPv6String(podIP1) {
-				e2e.Logf("This is IPv4 primary dual stack cluster with IP %s", podIP1)
-				return podIP1, podIP2
-			}
-			e2e.Logf("This is IPv6 primary dual stack cluster with IP %s", podIP2)
-			return podIP2, podIP1
+		o.Expect(len(pod.Status.PodIPs)).To(o.BeNumerically(">=", 2), "pod %s/%s does not have dual-stack IPs", namespace, podName)
+		podIP1 := pod.Status.PodIPs[1].IP
+		e2e.Logf("The pod's %s 1st IP in namespace %s is %q", podName, namespace, podIP1)
+		podIP2 := pod.Status.PodIPs[0].IP
+		e2e.Logf("The pod's %s 2nd IP in namespace %s is %q", podName, namespace, podIP2)
+		if netutils.IsIPv6String(podIP1) {
+			e2e.Logf("This is IPv4 primary dual stack cluster with IP %s", podIP1)
+			return podIP1, podIP2
 		}
+		e2e.Logf("This is IPv6 primary dual stack cluster with IP %s", podIP2)
+		return podIP2, podIP1
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/ip_utils.go` around lines 67 - 88, The pod IP
lookup in the IP helper returns empty strings when the expected addresses are
missing, which lets callers continue with an invalid destination. Update the pod
IP retrieval logic in the function handling ipStack cases to fail fast instead
of returning ("", "") when PodIPs is empty or dual-stack is incomplete, and
surface a clear error to the caller so connectivity checks in this path can stop
immediately.
integration-tests/backend/kafka.go-128-164 (1)

128-164: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Resolve resources with the correct mapper input. ResolveResource passes the resource string into RESTMapping(schema.GroupKind{}, ...), but that API resolves a kind, not a resource name. These Kafka deletes will fail to resolve before hitting the API.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/kafka.go` around lines 128 - 164, The Kafka delete
helpers are calling ResolveResource with resource names, but RESTMapping in that
path expects a Kind, so resource resolution can fail before the delete request
is sent. Update the delete methods in Kafka, KafkaTopic, KafkaNodePool, and the
related user cleanup to pass the correct kind identifier into ResolveResource
(or otherwise map the kind-to-resource correctly) so the dynamic client can
resolve the GVR before calling client.dynamic.Resource(...).Delete.
🟡 Minor comments (2)
integration-tests/backend/backend_suite_test.go-162-165 (1)

162-165: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Do not count pending specs as failed.

SpecStatePending is not a failed execution state, so this summary can report failures while the suite itself succeeds. Move pending specs to the skipped bucket.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/backend_suite_test.go` around lines 162 - 165, The
suite summary in backend_suite_test.go is treating SpecStatePending as a failure
in the failed-spec accounting. Update the switch in the report aggregation logic
so only true failure states are appended to failedSpecs, and move
SpecStatePending into the skipped branch alongside SpecStateSkipped; use the
spec report handling around the failedSpecs and skipped buckets to locate the
change.
integration-tests/backend/loki.go-116-124 (1)

116-124: 🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Do not ignore PVC cleanup errors.

A failed PVC delete leaves storage behind and can pollute later tests.

Proposed fix
-	_ = client.clientset.CoreV1().PersistentVolumeClaims(l.Namespace).DeleteCollection(ctx, metav1.DeleteOptions{}, metav1.ListOptions{
+	err := client.clientset.CoreV1().PersistentVolumeClaims(l.Namespace).DeleteCollection(ctx, metav1.DeleteOptions{}, metav1.ListOptions{
 		LabelSelector: labelSelector,
 	})
+	o.Expect(err).NotTo(o.HaveOccurred())
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/backend/loki.go` around lines 116 - 124, The PVC cleanup in
removeLokiStack is currently ignoring delete failures, which can leave storage
behind and affect later tests. Update lokiStack.removeLokiStack to handle and
surface the error returned by
client.clientset.CoreV1().PersistentVolumeClaims(...).DeleteCollection instead
of discarding it, and ensure the Resource.clear cleanup path is treated
consistently so cleanup failures are not silently lost.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 461bf04d-180b-4c3d-83be-ef90c41ee880

📥 Commits

Reviewing files that changed from the base of the PR and between c22cb9a and 3a28f28.

⛔ Files ignored due to path filters (1)
  • integration-tests/backend/go.sum is excluded by !**/*.sum
📒 Files selected for processing (33)
  • integration-tests/backend/Dockerfile
  • integration-tests/backend/Makefile
  • integration-tests/backend/alerts.go
  • integration-tests/backend/aws_sts.go
  • integration-tests/backend/azure_utils.go
  • integration-tests/backend/backend_suite_test.go
  • integration-tests/backend/custom_metrics.go
  • integration-tests/backend/flowcollector.go
  • integration-tests/backend/flowcollector_utils.go
  • integration-tests/backend/flowcollectorslice.go
  • integration-tests/backend/gcp_utils.go
  • integration-tests/backend/go.mod
  • integration-tests/backend/ip_utils.go
  • integration-tests/backend/k8s_client.go
  • integration-tests/backend/kafka.go
  • integration-tests/backend/loki.go
  • integration-tests/backend/loki_client.go
  • integration-tests/backend/loki_storage.go
  • integration-tests/backend/metrics.go
  • integration-tests/backend/multitenants.go
  • integration-tests/backend/openstack_utils.go
  • integration-tests/backend/operator.go
  • integration-tests/backend/sctp.go
  • integration-tests/backend/template_processor.go
  • integration-tests/backend/test_exporters.go
  • integration-tests/backend/test_flowcollector.go
  • integration-tests/backend/test_flowcollectorslice.go
  • integration-tests/backend/test_flowmetrics.go
  • integration-tests/backend/udn.go
  • integration-tests/backend/util.go
  • integration-tests/backend/util_cloud.go
  • integration-tests/backend/util_workload_identity.go
  • integration-tests/backend/virtualization.go
🚧 Files skipped from review as they are similar to previous changes (5)
  • integration-tests/backend/flowcollectorslice.go
  • integration-tests/backend/alerts.go
  • integration-tests/backend/metrics.go
  • integration-tests/backend/operator.go
  • integration-tests/backend/loki_storage.go

@openshift-ci

openshift-ci Bot commented Jul 1, 2026

Copy link
Copy Markdown

@Amoghrd: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-operator 0333834 link false /test e2e-operator
ci/prow/e2etest 0333834 link true /test e2etest

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants