WIP: NETOBSERV-2812: Remove origin dependancy#2806
Conversation
|
@Amoghrd: This pull request references NETOBSERV-2812 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the spike to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Important Review skippedIgnore keyword(s) in the title. ⛔ Ignored keywords (1)
Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 15
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
integration-tests/backend/util.go (1)
539-563:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPod readiness check can pass before containers are initialized.
When
ContainerStatusesis empty,readystays true and polling exits early. This can mark pods ready while they are still pending.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/util.go` around lines 539 - 563, The readiness check in the WaitForPodsReadyWithLabel function has a logic flaw where an empty ContainerStatuses list causes the ready variable to remain true because the inner for loop never executes, incorrectly marking pods as ready before containers are initialized. Fix this by explicitly checking if the ContainerStatuses slice is empty and setting ready to false in that case, ensuring the function only reports pods as ready when containers have been initialized and their status shows they are ready.
🧹 Nitpick comments (3)
integration-tests/backend/operator.go (2)
217-217: 💤 Low valueRemove or restore commented code.
WaitForDeploymentPodsToBeReadyis commented out. Either remove it if no longer needed, or uncomment if the wait is still required for reliable operator deployment.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/operator.go` at line 217, The call to WaitForDeploymentPodsToBeReady is currently commented out in the operator initialization code. Determine whether this wait is still necessary for reliable operator deployment: if it is required, uncomment the line to restore the wait functionality; if it is no longer needed, remove the commented line entirely to keep the code clean.
653-663: ⚖️ Poor tradeoffHardcoded JSON patch path and CSV name are fragile.
The patch path
/spec/install/spec/deployments/0/spec/template/spec/containers/0/env/4/valueand CSV namenetobserv-operator.v0.0.0-sha-mainwill break silently if the CSV structure changes. Consider extracting the env var by name or documenting the expected CSV structure.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/operator.go` around lines 653 - 663, The hardcoded JSON patch path with array indices (index 0 for deployments, index 0 for containers, index 4 for env) and the hardcoded CSV name `netobserv-operator.v0.0.0-sha-main` in the Patch call are fragile and will silently break if the CSV structure changes. Instead of using hardcoded indices, fetch the ClusterServiceVersion resource first, locate the specific deployment and container by name rather than position, find the target environment variable by name instead of using index 4, and then dynamically construct the patch path based on the actual resource structure. This will make the test resilient to changes in the CSV structure while maintaining clarity about which specific resources and variables are being modified.integration-tests/backend/k8s_client_test.go (1)
11-40: ⚡ Quick winAdd missing edge-case coverage for this
_test.gofile.Current tests cover valid and not-found paths, but not empty/nil-style edges (for example empty namespace name and constructor behavior when kubeconfig input is invalid/missing). Please add those cases in this file’s test suite.
As per coding guidelines:
**/*_test.go: “Use Ginkgo/Gomega patterns for unit tests - include tests for valid cases, invalid input, and edge cases (empty, nil values)”.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/k8s_client_test.go` around lines 11 - 40, The test suite in the k8s_client_test.go file is missing edge-case coverage for both the NewK8sClient constructor and the GetNamespace method. Add a test case that calls NewK8sClient with invalid or missing kubeconfig input to verify the constructor handles this scenario appropriately. Additionally, add a test case that calls GetNamespace with an empty string as the namespace name to verify it handles this edge case correctly. These tests should follow the Ginkgo/Gomega patterns referenced in the coding guidelines and ensure empty/nil values and invalid inputs are properly tested alongside the existing valid and not-found test cases.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@integration-tests/backend/alerts.go`:
- Around line 71-79: The type assertion on alertStatus[0] at line 79 is
unchecked and will panic if amtool returns an unexpected payload shape. Replace
the unchecked type assertion with a checked one using the ok pattern, and return
a proper error if the assertion fails instead of allowing a panic. The same
issue exists at line 91 and needs the same fix applied there as well.
In `@integration-tests/backend/flowcollectorslice.go`:
- Around line 67-73: The readiness polling in wait.PollUntilContextTimeout
treats all GET errors from
client.dynamic.Resource(gvr).Namespace(flowSlice.Namespace).Get(ctx,
flowSlice.Name, metav1.GetOptions{}) as retryable by returning false, nil, which
causes permanent API failures to wait the full 600 seconds instead of failing
fast. Fix this by checking the error type to distinguish between permanent
errors (like permission denied, not found, invalid request) and transient errors
- for permanent errors return true with the error to fail immediately, and for
transient errors continue returning false, nil to retry the polling.
In `@integration-tests/backend/go.mod`:
- Line 3: The go.mod file declares Go 1.26.3, but the Dockerfile is using an
older version specification (golang:1.25), causing a version mismatch that will
break the build. Update the Docker image version specification in the Dockerfile
from golang:1.25 to golang:1.26 (or a more specific 1.26.x version) to align
with the declared Go version in go.mod.
In `@integration-tests/backend/k8s_client.go`:
- Around line 343-354: The current error handling in the resource creation block
does not distinguish between different types of errors from the Get() call.
Instead of attempting to Create() on any error, you need to explicitly check if
the error is a NotFound error using apierrors.IsNotFound(). If the error is not
a NotFound error (such as RBAC errors, transport errors, or timeouts), return
that error immediately. Only proceed with the Create() attempt when the Get()
call returns a NotFound error. This prevents masking real errors with failed
create attempts.
In `@integration-tests/backend/loki_storage.go`:
- Around line 958-983: The Resource.clear method is incomplete after migration
to use K8sClient. Add the missing resource kinds `secret` and
`objectbucketclaims` to the switch statement in the clear method with their
appropriate schema.GroupVersionResource mappings (secret belongs to the core API
group with Version "v1" and Resource "secrets", while objectbucketclaims belongs
to the "objectbucket.io" group with Version "v1alpha1" and Resource
"objectbucketclaims"). Additionally, update all call sites to the clear method
to pass the correct K8sClient parameter instead of the old oc parameter,
ensuring consistency with the updated method signature.
In `@integration-tests/backend/metrics.go`:
- Around line 190-203: The code treats optional ServiceMonitor endpoint fields
as required and returns errors when they are absent. Update the logic where
`scheme` is retrieved from the endpoint map to default to "http" instead of
returning an error when the field is not found. Similarly, apply the same
approach to `tlsConfig.serverName` extraction by only using it when present
rather than erroring when absent. This ensures both optional fields are handled
gracefully with appropriate defaults.
In `@integration-tests/backend/operator.go`:
- Around line 359-368: The polling callback functions used in
wait.PollUntilContextTimeout (affecting the code at lines 359-368 and 376-384)
are using o.Expect to assert on errors returned from client.GetSubscription,
which causes immediate test failure on any transient error instead of allowing
the poll to retry. Remove the o.Expect assertion and instead return the error
directly from the callback function (similar to how CheckOperatorStatus handles
errors), so the polling mechanism can properly retry on transient errors before
failing the test.
In `@integration-tests/backend/template_processor.go`:
- Around line 107-116: The function applyResourceFromTemplateByAdmin creates a
temporary file through the processTemplate call and applies it via
client.ApplyYAML, but never removes the file afterward, causing repeated runs to
leak temporary files. Add cleanup code after the ApplyYAML call to delete the
temporary configFile using os.Remove. Consider using a defer statement to ensure
the cleanup happens regardless of whether ApplyYAML succeeds or fails, placing
it immediately after the configFile is successfully created.
- Around line 81-86: The parameter replacement loop in template_processor.go
iterates over an unordered map, causing nondeterministic replacement order and
potential corruption when overlapping keys exist (e.g., "NAME" and "NAMESPACE").
Fix this by extracting all keys from the params map, sorting them by length in
descending order (longest first) to prevent shorter keys from matching within
longer keys, then iterate over the sorted keys to perform replacements in a
deterministic and safe order.
In `@integration-tests/backend/test_flowcollectorslice.go`:
- Around line 81-84: The variable err is declared at line 81 in the BeforeEach
block and then redeclared again at line 118-120, which causes a compilation
error. At line 118-120, remove the redeclaration (change the `:=` operator to
`=` for assignment) instead of declaring a new variable, since err is already
declared at line 81 in the same BeforeEach block. This will allow both uses of
err in the function to reference the same variable without conflict.
- Around line 82-84: The K8sClient created at line 82 with NewK8sClient()
retains the original user identity and is reused for CreateFlowcollectorSlice,
DeleteFlowcollectorSlice, and WaitForFlowcollectorSliceReady operations after
oc.ChangeUser() is called at line 480, which bypasses RBAC checks since these
operations execute with the original identity rather than the switched user.
After the oc.ChangeUser() call, create a new K8sClient by calling NewK8sClient()
again and use this newly created client for all flowcollectorslice operations
that follow the user switch to ensure RBAC checks execute with the correct
switched user identity.
In `@integration-tests/backend/util.go`:
- Around line 246-249: The function ApplyResourceFromFile accepts a namespace
parameter ns but does not use it when calling client.ApplyYAML, breaking the
namespace override contract for existing callers. Either pass the ns parameter
to the ApplyYAML method call (if the method signature supports a namespace
argument), or if the namespace parameter is no longer needed, remove it from the
ApplyResourceFromFile function signature and update all call sites accordingly
to ensure namespace handling is applied correctly where needed.
- Around line 137-138: The applyFromTemplate function currently ignores errors
from the WaitForResourceToAppear method call by using the blank identifier, then
unconditionally returns nil. This masks resource readiness failures and falsely
reports success. Capture the error returned by r.WaitForResourceToAppear(client)
instead of discarding it, check if the error is not nil, and return that error.
Only return nil if WaitForResourceToAppear succeeds without error.
- Around line 301-307: Remove the unused `gvr` variable declaration in the
`getSAToken` function. The variable is declared with the GroupVersionResource
for authentication.k8s.io tokenrequests but is never actually used in the
function (the code uses `saGvr` instead), which causes a Go compilation failure.
Delete the entire unused `gvr` variable declaration block.
- Around line 82-97: The getRouteAddress function signature was changed to
accept a *K8sClient parameter instead of its previous type, but this change was
not propagated to all callers. Update all callers of getRouteAddress to pass the
correct *K8sClient type instead of *exutil.CLI. There are three callers that
need to be updated (identified in loki_storage.go, test_flowcollectorslice.go,
and test_flowcollector.go), while the caller in metrics.go is already correct
and requires no changes.
---
Outside diff comments:
In `@integration-tests/backend/util.go`:
- Around line 539-563: The readiness check in the WaitForPodsReadyWithLabel
function has a logic flaw where an empty ContainerStatuses list causes the ready
variable to remain true because the inner for loop never executes, incorrectly
marking pods as ready before containers are initialized. Fix this by explicitly
checking if the ContainerStatuses slice is empty and setting ready to false in
that case, ensuring the function only reports pods as ready when containers have
been initialized and their status shows they are ready.
---
Nitpick comments:
In `@integration-tests/backend/k8s_client_test.go`:
- Around line 11-40: The test suite in the k8s_client_test.go file is missing
edge-case coverage for both the NewK8sClient constructor and the GetNamespace
method. Add a test case that calls NewK8sClient with invalid or missing
kubeconfig input to verify the constructor handles this scenario appropriately.
Additionally, add a test case that calls GetNamespace with an empty string as
the namespace name to verify it handles this edge case correctly. These tests
should follow the Ginkgo/Gomega patterns referenced in the coding guidelines and
ensure empty/nil values and invalid inputs are properly tested alongside the
existing valid and not-found test cases.
In `@integration-tests/backend/operator.go`:
- Line 217: The call to WaitForDeploymentPodsToBeReady is currently commented
out in the operator initialization code. Determine whether this wait is still
necessary for reliable operator deployment: if it is required, uncomment the
line to restore the wait functionality; if it is no longer needed, remove the
commented line entirely to keep the code clean.
- Around line 653-663: The hardcoded JSON patch path with array indices (index 0
for deployments, index 0 for containers, index 4 for env) and the hardcoded CSV
name `netobserv-operator.v0.0.0-sha-main` in the Patch call are fragile and will
silently break if the CSV structure changes. Instead of using hardcoded indices,
fetch the ClusterServiceVersion resource first, locate the specific deployment
and container by name rather than position, find the target environment variable
by name instead of using index 4, and then dynamically construct the patch path
based on the actual resource structure. This will make the test resilient to
changes in the CSV structure while maintaining clarity about which specific
resources and variables are being modified.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 2ef5ef27-20b4-4824-b6ed-94fd26c86200
⛔ Files ignored due to path filters (1)
integration-tests/backend/go.sumis excluded by!**/*.sum
📒 Files selected for processing (14)
integration-tests/backend/alerts.gointegration-tests/backend/flowcollector_utils.gointegration-tests/backend/flowcollectorslice.gointegration-tests/backend/go.modintegration-tests/backend/k8s_client.gointegration-tests/backend/k8s_client_test.gointegration-tests/backend/loki.gointegration-tests/backend/loki_storage.gointegration-tests/backend/metrics.gointegration-tests/backend/operator.gointegration-tests/backend/template_processor.gointegration-tests/backend/test_flowcollectorslice.gointegration-tests/backend/util.gointegration-tests/backend/version_checker.go
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
integration-tests/backend/util.go (2)
107-143:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winSame naive pluralization issue as template_processor.go.
Line 130:
strings.ToLower(gvk.Kind) + "s"breaks for irregular plurals. See comment on template_processor.go.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/util.go` around lines 107 - 143, The Resource.applyFromTemplate function uses naive string concatenation to pluralize the Kubernetes resource kind by simply appending "s" to the lowercased kind name. This approach fails for irregular plurals and should be replaced with proper pluralization logic. Locate the line in applyFromTemplate where the GroupVersionResource is constructed with `strings.ToLower(gvk.Kind) + "s"` and replace this naive pluralization with the same pluralization solution that was applied to the similar issue in template_processor.go, ensuring the resource name is correctly pluralized according to Kubernetes conventions.
127-131:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winNaive Kind-to-Resource pluralization in both files. Both locations use
strings.ToLower(gvk.Kind) + "s"which fails for irregular plurals (e.g.,Ingress→ingresss,NetworkPolicy→networkpolicys).
integration-tests/backend/util.go#L127-L131: Use a proper pluralization helper or lookup map inapplyFromTemplate.integration-tests/backend/template_processor.go#L126-L131: Apply same fix inprocessTemplateFileif this code path is kept.Consider using
k8s.io/apimachinery/pkg/api/meta.UnsafeGuessKindToResourceor a custom mapping for known CRDs.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/util.go` around lines 127 - 131, The Kind-to-Resource conversion at integration-tests/backend/util.go#L127-L131 in the applyFromTemplate function uses naive pluralization with strings.ToLower(gvk.Kind) + "s", which fails for irregular plurals like Ingress and NetworkPolicy. Replace this logic with a proper pluralization approach such as using k8s.io/apimachinery/pkg/api/meta.UnsafeGuessKindToResource or a custom mapping for known CRDs. Apply the identical fix to integration-tests/backend/template_processor.go#L126-L131 in the processTemplateFile function where the same naive pluralization pattern exists.
♻️ Duplicate comments (3)
integration-tests/backend/util.go (1)
250-254:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
nsparameter is unused.The
nsparameter is accepted butclient.ApplyYAMLis called without it. Either useclient.ApplyYAMLWithNamespace(ctx, file, ns)or remove the parameter.-func ApplyResourceFromFile(client *K8sClient, ns, file string) { +func ApplyResourceFromFile(client *K8sClient, file string) { ctx := context.Background() err := client.ApplyYAML(ctx, file)Or if namespace override is needed:
- err := client.ApplyYAML(ctx, file) + err := client.ApplyYAMLWithNamespace(ctx, file, ns)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/util.go` around lines 250 - 254, The `ApplyResourceFromFile` function accepts a `ns` namespace parameter but does not use it when calling the client method. Either pass the namespace to the client by replacing the `client.ApplyYAML(ctx, file)` call with `client.ApplyYAMLWithNamespace(ctx, file, ns)` to apply the YAML within the specified namespace, or remove the unused `ns` parameter from the function signature if namespace handling is not required.integration-tests/backend/loki_storage.go (1)
988-997:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winAdd missing GVR mappings used by current call sites.
Resource.cleardoes not handlesecretandobjectbucketclaims, so current migrated paths fail or silently skip cleanup.Proposed fix
switch r.Kind { case "lokistack": gvr = schema.GroupVersionResource{Group: "loki.grafana.com", Version: "v1", Resource: "lokistacks"} case "catalogsource": gvr = schema.GroupVersionResource{Group: "operators.coreos.com", Version: "v1alpha1", Resource: "catalogsources"} case "subscription": gvr = schema.GroupVersionResource{Group: "operators.coreos.com", Version: "v1alpha1", Resource: "subscriptions"} + case "secret": + gvr = schema.GroupVersionResource{Group: "", Version: "v1", Resource: "secrets"} + case "objectbucketclaims": + gvr = schema.GroupVersionResource{Group: "objectbucket.io", Version: "v1alpha1", Resource: "objectbucketclaims"} default: return fmt.Errorf("unsupported resource kind: %s", r.Kind) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/loki_storage.go` around lines 988 - 997, Add missing case statements to the switch block on r.Kind to handle "secret" and "objectbucketclaims" resource types. Currently the switch statement only handles "lokistack", "catalogsource", and "subscription", but the Resource.clear method needs to support these additional resource kinds to properly clean up resources at call sites. Insert the two new cases before the default case, each with the appropriate GroupVersionResource mapping (you will need to determine the correct API group, version, and resource name for each type based on your Kubernetes cluster configuration).integration-tests/backend/test_flowcollectorslice.go (1)
518-520:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRecreate
K8sClientafteroc.ChangeUser(...)in the multi-tenancy flow.
oc.ChangeUser(testUserName)occurs at Line 480, but client-backed create/wait/delete calls at Lines 518-520 and 594-596 still use the client initialized at Line 82. That can execute API calls with the wrong identity and invalidate RBAC assertions.Also applies to: 594-596
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/test_flowcollectorslice.go` around lines 518 - 520, The K8sClient initialized at line 82 is being reused after oc.ChangeUser(testUserName) is called at line 480, causing API calls to execute with the wrong user identity. Recreate the K8sClient immediately after the oc.ChangeUser(testUserName) call to ensure the client reflects the new user context. This recreation must happen before the subsequent calls to flowSliceA.CreateFlowcollectorSlice(client), flowSliceA.WaitForFlowcollectorSliceReady(client), and flowSliceA.DeleteFlowcollectorSlice(client) at lines 518-520, and before any equivalent calls at lines 594-596. Update all affected locations to use the newly created client instead of the one from line 82.
🧹 Nitpick comments (3)
integration-tests/backend/template_processor.go (1)
28-40: 💤 Low valueUnused
clientparameter inprocessTemplate.The
client *K8sClientparameter is never used in this function or passed toprocessTemplateFile. Either remove it or document why it's reserved for future use.-func processTemplate(client *K8sClient, parameters ...string) (string, error) { +func processTemplate(parameters ...string) (string, error) {This would require updating callers accordingly.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/template_processor.go` around lines 28 - 40, Remove the unused client parameter from the processTemplate function signature since it is never referenced within the function body or passed to processTemplateFile. After removing the client *K8sClient parameter from the function definition, identify and update all call sites of processTemplate to remove the corresponding argument being passed to this function.integration-tests/backend/util.go (2)
908-935: ⚡ Quick winNew K8sClient created inside
getClientServerInfoinstead of accepting it as parameter.This function creates its own
K8sClient(lines 910-911) while other refactored functions accept*K8sClientas a parameter. This is inconsistent and prevents the caller from reusing an existing client.-func getClientServerInfo(oc *exutil.CLI, serverNS, clientNS, ipStackType string) (map[string]map[string]string, error) { - client, err := NewK8sClient() - o.Expect(err).NotTo(o.HaveOccurred()) +func getClientServerInfo(oc *exutil.CLI, client *K8sClient, serverNS, clientNS, ipStackType string) (map[string]map[string]string, error) {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/util.go` around lines 908 - 935, The getClientServerInfo function creates its own K8sClient instance inside the function body (lines 910-911) instead of accepting it as a parameter, which is inconsistent with other refactored functions and prevents callers from reusing an existing client. Modify the function signature to accept a *K8sClient parameter, remove the NewK8sClient call and error check inside the function, and use the passed client parameter instead. Update all callers of getClientServerInfo to pass their existing K8sClient instance.
461-488: 💤 Low valueHard-coded GVR mappings will require maintenance as new resource types are added.
Both
WaitForResourceToAppearandWaitUntilResourceIsGonehave separate switch statements mappingr.Kindto GVRs. Consider extracting a sharedkindToGVRhelper to reduce duplication and ensure consistency.func kindToGVR(kind string) (schema.GroupVersionResource, error) { switch kind { case "lokistack": return schema.GroupVersionResource{Group: "loki.grafana.com", Version: "v1", Resource: "lokistacks"}, nil // ... other cases default: return schema.GroupVersionResource{}, fmt.Errorf("unsupported resource kind: %s", kind) } }Also applies to: 490-521
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/util.go` around lines 461 - 488, Extract the hard-coded GVR mapping logic from both WaitForResourceToAppear and WaitUntilResourceIsGone functions into a shared helper function kindToGVR. Create a new kindToGVR helper function that accepts a kind string parameter and returns a schema.GroupVersionResource and error, containing the switch statement that maps resource kinds like "objectbucketclaims", "objectbuckets", and "catalogsource" to their corresponding GVRs. Then replace the duplicate switch statement in WaitForResourceToAppear (lines 461-488) with a call to this helper function, and do the same for the duplicate switch statement in WaitUntilResourceIsGone (lines 490-521) to eliminate duplication and ensure consistent GVR mappings across both functions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@integration-tests/backend/template_processor.go`:
- Around line 113-132: The current code treats each rawObj as a single YAML
document when it may contain multiple documents. When rawObj.Raw contains
multi-document YAML, only the first document gets processed by the parameter
substitution. Instead of passing the entire objYAML to substituteParameters as a
single block, use a YAML decoder loop (such as
k8s.io/apimachinery/pkg/util/yaml.NewYAMLOrJSONDecoder) to iterate through all
YAML documents within objYAML, apply parameter substitution to each document
individually, and append each processed document to processedYAMLs.
In `@integration-tests/backend/test_flowcollector.go`:
- Around line 386-388: After the migration to use *K8sClient for runtime
operator operations, the deferred cleanup calls for operator uninstallation
still reference the old `oc` parameter instead of the new `client` parameter.
Fix all four locations where this occurs: In
integration-tests/backend/test_flowcollector.go at lines 386-388 (anchor site in
the deferred cleanup call following ensureOperatorDeployed), change the
`g.DeferCleanup(LO.uninstallOperator, oc)` call to pass `client` instead of
`oc`. Apply the same fix in integration-tests/backend/test_flowcollector.go at
lines 379-380 in the !Lokiexisting branch. Similarly, fix
integration-tests/backend/test_flowcollectorslice.go at lines 132-133 in the
!Lokiexisting branch and at lines 139-141 in the reinstall branch after channel
mismatch, replacing `oc` with `client` in all deferred uninstallOperator calls.
In `@integration-tests/backend/test_flowmetrics.go`:
- Around line 74-76: Uncomment the AfterEach block (currently commented on lines
74-76) that contains the flow.DeleteFlowcollector(oc) call to re-enable proper
FlowCollector teardown after each test specification. This will prevent
FlowCollector state from leaking across test specs and causing subsequent tests
to become unstable.
---
Outside diff comments:
In `@integration-tests/backend/util.go`:
- Around line 107-143: The Resource.applyFromTemplate function uses naive string
concatenation to pluralize the Kubernetes resource kind by simply appending "s"
to the lowercased kind name. This approach fails for irregular plurals and
should be replaced with proper pluralization logic. Locate the line in
applyFromTemplate where the GroupVersionResource is constructed with
`strings.ToLower(gvk.Kind) + "s"` and replace this naive pluralization with the
same pluralization solution that was applied to the similar issue in
template_processor.go, ensuring the resource name is correctly pluralized
according to Kubernetes conventions.
- Around line 127-131: The Kind-to-Resource conversion at
integration-tests/backend/util.go#L127-L131 in the applyFromTemplate function
uses naive pluralization with strings.ToLower(gvk.Kind) + "s", which fails for
irregular plurals like Ingress and NetworkPolicy. Replace this logic with a
proper pluralization approach such as using
k8s.io/apimachinery/pkg/api/meta.UnsafeGuessKindToResource or a custom mapping
for known CRDs. Apply the identical fix to
integration-tests/backend/template_processor.go#L126-L131 in the
processTemplateFile function where the same naive pluralization pattern exists.
---
Duplicate comments:
In `@integration-tests/backend/loki_storage.go`:
- Around line 988-997: Add missing case statements to the switch block on r.Kind
to handle "secret" and "objectbucketclaims" resource types. Currently the switch
statement only handles "lokistack", "catalogsource", and "subscription", but the
Resource.clear method needs to support these additional resource kinds to
properly clean up resources at call sites. Insert the two new cases before the
default case, each with the appropriate GroupVersionResource mapping (you will
need to determine the correct API group, version, and resource name for each
type based on your Kubernetes cluster configuration).
In `@integration-tests/backend/test_flowcollectorslice.go`:
- Around line 518-520: The K8sClient initialized at line 82 is being reused
after oc.ChangeUser(testUserName) is called at line 480, causing API calls to
execute with the wrong user identity. Recreate the K8sClient immediately after
the oc.ChangeUser(testUserName) call to ensure the client reflects the new user
context. This recreation must happen before the subsequent calls to
flowSliceA.CreateFlowcollectorSlice(client),
flowSliceA.WaitForFlowcollectorSliceReady(client), and
flowSliceA.DeleteFlowcollectorSlice(client) at lines 518-520, and before any
equivalent calls at lines 594-596. Update all affected locations to use the
newly created client instead of the one from line 82.
In `@integration-tests/backend/util.go`:
- Around line 250-254: The `ApplyResourceFromFile` function accepts a `ns`
namespace parameter but does not use it when calling the client method. Either
pass the namespace to the client by replacing the `client.ApplyYAML(ctx, file)`
call with `client.ApplyYAMLWithNamespace(ctx, file, ns)` to apply the YAML
within the specified namespace, or remove the unused `ns` parameter from the
function signature if namespace handling is not required.
---
Nitpick comments:
In `@integration-tests/backend/template_processor.go`:
- Around line 28-40: Remove the unused client parameter from the processTemplate
function signature since it is never referenced within the function body or
passed to processTemplateFile. After removing the client *K8sClient parameter
from the function definition, identify and update all call sites of
processTemplate to remove the corresponding argument being passed to this
function.
In `@integration-tests/backend/util.go`:
- Around line 908-935: The getClientServerInfo function creates its own
K8sClient instance inside the function body (lines 910-911) instead of accepting
it as a parameter, which is inconsistent with other refactored functions and
prevents callers from reusing an existing client. Modify the function signature
to accept a *K8sClient parameter, remove the NewK8sClient call and error check
inside the function, and use the passed client parameter instead. Update all
callers of getClientServerInfo to pass their existing K8sClient instance.
- Around line 461-488: Extract the hard-coded GVR mapping logic from both
WaitForResourceToAppear and WaitUntilResourceIsGone functions into a shared
helper function kindToGVR. Create a new kindToGVR helper function that accepts a
kind string parameter and returns a schema.GroupVersionResource and error,
containing the switch statement that maps resource kinds like
"objectbucketclaims", "objectbuckets", and "catalogsource" to their
corresponding GVRs. Then replace the duplicate switch statement in
WaitForResourceToAppear (lines 461-488) with a call to this helper function, and
do the same for the duplicate switch statement in WaitUntilResourceIsGone (lines
490-521) to eliminate duplication and ensure consistent GVR mappings across both
functions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c1af67c2-dea7-4d3b-8a7e-8defe30cbed1
📒 Files selected for processing (18)
integration-tests/backend/alerts.gointegration-tests/backend/azure_utils.gointegration-tests/backend/backend_suite_test.gointegration-tests/backend/custom_metrics.gointegration-tests/backend/flowcollectorslice.gointegration-tests/backend/ip_utils.gointegration-tests/backend/k8s_client.gointegration-tests/backend/loki.gointegration-tests/backend/loki_storage.gointegration-tests/backend/metrics.gointegration-tests/backend/operator.gointegration-tests/backend/template_processor.gointegration-tests/backend/test_exporters.gointegration-tests/backend/test_flowcollector.gointegration-tests/backend/test_flowcollectorslice.gointegration-tests/backend/test_flowmetrics.gointegration-tests/backend/udn.gointegration-tests/backend/util.go
🚧 Files skipped from review as they are similar to previous changes (5)
- integration-tests/backend/flowcollectorslice.go
- integration-tests/backend/k8s_client.go
- integration-tests/backend/alerts.go
- integration-tests/backend/metrics.go
- integration-tests/backend/operator.go
The previous implementation tried to use username/password authentication directly in kubeconfig, which doesn't work in OpenShift. The correct approach is to use the OAuth password grant flow. This implementation replicates what 'oc login -u kubeadmin -p <password>' does: 1. Make HTTP request to /oauth/authorize with basic auth credentials 2. Extract access_token from the 302 redirect Location header 3. Return the token for use in API calls The OAuth endpoint uses the 'openshift-challenging-client' client_id with response_type=token, which is the standard OpenShift OAuth flow for CLI clients. Fixes: All 41 tests were failing in BeforeEach at util.go:1008 because the previous implementation couldn't authenticate to list secrets. Related: netobserv#2806
Migrated 17 compat_otp helper function calls to use native Ginkgo and our existing client-go implementations: Template Operations (10 calls): - Replace 5x compat_otp.ProcessTemplate with processTemplate(client, ...) - Replace 3x compat_otp.ApplyClusterResourceFromTemplateWithError with applyResourceFromTemplateByAdmin(client, ...) - Replace 2x compat_otp.ApplyNsResourceFromTemplate with applyResourceFromTemplateByAdmin(client, ...) Test Helpers (12 calls): - Replace 12x compat_otp.By with g.By (native Ginkgo) Files modified: - loki.go: Removed compat_otp import, added g import for g.By - test_exporters.go: 1 ApplyNsResourceFromTemplate replacement - test_flowcollector.go: 5 ProcessTemplate + 3 ApplyCluster + 1 ApplyNs + 11 By All template operations now use our processTemplate and applyResourceFromTemplateByAdmin functions with client-go instead of oc. Remaining compat_otp usage: ~100 calls (mostly pod operations)
Added helper functions in util.go: - getAllPods(client, namespace) - lists all pod names - getAllPodsWithLabel(client, namespace, label) - lists pods matching label - getPodNodeName(client, namespace, podName) - gets node name for pod Replaced 24 compat_otp helper calls in test_flowcollector.go: - 5x GetAllPods → getAllPods - 11x GetAllPodsWithLabel → getAllPodsWithLabel - 3x GetPodNodeName → getPodNodeName - 4x DebugNodeWithChroot → debugNodeWithCommand (converted variadic args to single command string) All pod operations now use client.clientset.CoreV1().Pods() instead of oc. Remaining compat_otp usage: ~76 calls
Added assertAllPodsToBeReady function in util.go: - Polls with 10s interval, 4min timeout - Checks all pods have Ready condition = True - Skips completed pods (PodSucceeded phase) - Uses client.clientset.CoreV1().Pods().List() Replaced 33 compat_otp.AssertAllPodsToBeReady calls: - 25 in test_flowcollector.go - 8 in test_flowcollectorslice.go All pod readiness checks now use native Kubernetes client. Remaining compat_otp usage: ~43 calls
Added helper functions in util.go: - waitAndGetSpecificPodLogs(client, namespace, podName, filter) Polls for logs containing filter string (20s interval, 10min timeout) - getFirstWorkerNode(client) - returns first worker node - addLabelToNode(client, node, label, value) - adds label to node - deleteLabelFromNode(client, node, label) - removes label from node Replaced 8 compat_otp helper calls: - 2x WaitAndGetSpecificPodLogs → waitAndGetSpecificPodLogs - 3x GetFirstWorkerNode → getFirstWorkerNode - 1x AddLabelToNode → addLabelToNode - 1x DeleteLabelFromNode → deleteLabelFromNode - Fixed label name in defer (was 'test', should be 'netobserv-agent') All node operations now use client.clientset.CoreV1().Nodes(). Remaining compat_otp usage: ~35 calls
Added helper functions in util.go: - setNamespacePrivileged(client, namespace) - sets pod security labels - getNodeLabelValue(client, node, label) - gets label value from node - isHypershiftHostedCluster(client) - checks if cluster is hypershift - isTechPreviewNoUpgrade(client) - checks TechPreviewNoUpgrade feature set Replaced 13 compat_otp helper calls: - 2x SetNamespacePrivileged → setNamespacePrivileged - 2x GetResourceSpecificLabelValue → getNodeLabelValue - 2x IsHypershiftHostedCluster → isHypershiftHostedCluster - 1x CheckNetworkType → checkNetworkType (existing function) - 1x IsTechPreviewNoUpgrade → isTechPreviewNoUpgrade - 4x AssertWaitPollNoErr → o.Expect().NotTo(o.HaveOccurred()) - 1x GetRandomString → getRandomString (already migrated earlier) All cluster detection and assertions now use client-go. Remaining compat_otp usage: ~22 calls (mostly cloud provider operations)
Replaced remaining compat_otp.GetRandomString() with getRandomString(). All test helper operations now use client-go. Remaining compat_otp usage (22 total): - 4x NewCLI - initialization of oc variable in test files - 4x KubeConfigPath - used with NewCLI - 14x cloud provider operations (OpenStack, Azure, GCS) for Loki storage backend testing - these wrap third-party SDKs
Migrated performManagedIdentityAndSecretSetupForAzureWIF to use client-go: - Replaced oc call to get serviceAccountIssuer with getOIDC(client) - Removed oc parameter from function signature - Removed exutil import from azure_utils.go This function now uses only client-go for Kubernetes operations. Remaining oc usage in loki_storage.go and multitenants.go is for: - Cloud provider SDK operations (OpenStack, Azure, GCS) - OAuth/HTPasswd secret operations with --from-file - Subscription patching operations
Added helper functions in util.go: - getSubscriptionByPackageName(client, namespace, packageName) Finds subscription metadata.name by spec.name (package name) - patchSubscriptionRemoveConfig(client, namespace, subscriptionName) Removes spec.config from subscription using JSON patch Replaced oc calls in loki_storage.go removeObjectStorage: - oc get sub + jsonpath → getSubscriptionByPackageName - oc patch sub --type=json → patchSubscriptionRemoveConfig Added types import for JSONPatchType. Remaining oc usage in removeObjectStorage is only for cloud provider SDK operations (Azure, GCS, OpenStack blob container deletion).
Added helper functions in util.go: - getOAuthHTPasswdSecretName(client) - get htpasswd secret name from OAuth - createSecretFromFile(client, namespace, name, key, filePath) - create secret - extractSecretToFile(client, namespace, name, key, outputPath) - extract secret - updateSecretFromFile(client, namespace, name, key, filePath) - update secret - patchOAuthAddHTPasswdIdentityProvider(client, secretName) - add HTPasswd IDP Migrated multitenants.go functions: - getNewUser: removed oc parameter, now uses client-go for all operations - userCleanup: removed oc parameter, now uses client-go - Removed exutil import from multitenants.go Replaced oc operations: - oc get oauth/cluster -o jsonpath → getOAuthHTPasswdSecretName - oc create secret --from-file → createSecretFromFile - oc extract secret → extractSecretToFile - oc set data secret --from-file → updateSecretFromFile - oc patch oauth → patchOAuthAddHTPasswdIdentityProvider All OAuth and HTPasswd secret operations now use native client-go!
Added functions in azure_utils.go: - getAzureCloudStorageURISuffix(client) - get blob URI suffix by cloud type - newAzureContainerClient(client, accountName, accountKey, containerName) - createAzureStorageBlobContainer(client, accountName, accountKey, containerName) - deleteAzureStorageBlobContainer(client, accountName, accountKey, containerName) - emptyAzureBlobContainer(azClient, containerName) Replaced compat_otp Azure operations in loki_storage.go: - compat_otp.NewAzureContainerClient → newAzureContainerClient - compat_otp.CreateAzureStorageBlobContainer → createAzureStorageBlobContainer - compat_otp.DeleteAzureStorageBlobContainer → deleteAzureStorageBlobContainer All Azure blob container operations now use native Azure SDK directly without oc CLI dependency. Cloud name detection now uses client-go to query infrastructure.status.platformStatus.azure.cloudName.
Created gcp_utils.go with functions: - listGCSBuckets(client, projectID) - list all buckets in project - emptyGCSBucket(client, bucketName) - remove all objects from bucket - createGCSBucket(projectID, bucketName) - create or empty bucket - deleteGCSBucket(bucketName) - delete bucket and all objects Replaced compat_otp GCS operations in loki_storage.go: - compat_otp.CreateGCSBucket → createGCSBucket - compat_otp.DeleteGCSBucket → deleteGCSBucket All GCS bucket operations now use native Google Cloud Storage SDK directly without any oc CLI or origin dependency. Credentials are loaded from GOOGLE_APPLICATION_CREDENTIALS environment variable.
Created openstack_utils.go with functions: - openstackCredentials struct (replaces compat_otp.OpenstackCredentials) - getOpenStackCredentials(client) - get credentials from kube-system secret - newOpenStackClient(cred, serviceType) - create gophercloud client - getAuthenticatedUserID(providerClient) - extract user ID from auth - getOpenStackUserIDAndDomainID(cred) - get user and domain IDs - createOpenStackContainer(client, name) - create or empty container - deleteOpenStackContainer(client, name) - delete container - emptyOpenStackContainer(client, name) - remove all objects Replaced compat_otp OpenStack operations in loki_storage.go: - compat_otp.GetOpenStackCredentials → getOpenStackCredentials - compat_otp.NewOpenStackClient → newOpenStackClient - compat_otp.GetOpenStackUserIDAndDomainID → getOpenStackUserIDAndDomainID - compat_otp.CreateOpenStackContainer → createOpenStackContainer - compat_otp.DeleteOpenStackContainer → deleteOpenStackContainer - compat_otp.OpenstackCredentials → openstackCredentials Removed compat_otp import from loki_storage.go! All OpenStack Swift operations now use native gophercloud SDK with credentials from client-go secret access instead of oc extract.
Added helper functions in util.go: - generateTestNamespace(prefix) - creates unique test namespace names - createNamespace(client, name) - creates namespace - getServerURL(client) - gets API server URL from client config - getCurrentContext() - gets current kubeconfig context - getNodeArchitecture(client) - gets first node's architecture - getPodNameWithLabel(client, namespace, label) - finds pod by label - getServiceClusterIP(client, namespace, label) - gets service IP by label - execInPod(client, namespace, pod, command) - executes command in pod Migrated test_flowmetrics.go: - Removed oc variable and compat_otp import - Replaced oc.NewCLI/KubeConfigPath with generateTestNamespace - Now uses namespace variable instead of oc.Namespace() All operations now use client-go. Ready to migrate remaining test files.
- Replace oc variable with namespace string
- Replace all oc.AsAdmin().Run('get') calls with client-go APIs
- Replace oc.AsAdmin().Run('delete') calls with typed/dynamic clients
- Replace oc.AsAdmin().Run('exec') with execInPod helper
- Use getPodNameWithLabel and getServiceClusterIP helpers
- Remove compat_otp import entirely
All oc calls eliminated from test_exporters.go.
- Remove oc variable declaration - Add patchFlowCollector and patchFlowCollectorMerge helpers - Add scaleDeployment helper function - Replace namespace generation with generateTestNamespace - Replace server URL and context functions - Replace flowcollector patch operations (JSON and merge patches) - Replace deployment scale operations - Replace pod listing with field selectors - Replace HPA deletion with client-go - Add metav1 import Still has ~100+ oc calls remaining that need migration: - must-gather operations - taint operations - namespace creation/deletion - user/context management - various kubectl run/exec operations
- Add taintNode and removeTaintFromNode helpers - Add getPodEnvValue and getPodEnvByIndex helpers - Replace taint operations with client-go node updates - Replace namespace creation/deletion with helpers - Replace must-gather with kubectl exec - Replace pod env queries with client-go helpers - Replace pod logs with getPodLogs helper - Replace service queries with client-go - Replace kubectl apply/config with exec.Command - Fix namespace generation to use generateTestNamespace - Remove oc parameter from loki_storage methods - Fix syntax errors in namespace defer cleanup Still has ~30+ oc calls remaining to complete migration.
- Remove oc parameter from loki_storage methods - Add patchSubscription helper for subscription patches - Add labelNamespace and removeLabelFromNamespace helpers - Replace subscription patch with client-go - Replace namespace label operations with client-go - Replace oc.Namespace() with generateTestNamespace - Replace user context operations with kubectl commands - Fix g.DeferCleanup syntax for deleteNamespace Remaining oc calls: ~25 (UDN operations, scale, get operations, etc.)
- Fix serverURL scope issue in user context test - Fix deleteNamespace spacing issues - Now builds successfully with 30 oc calls remaining Remaining oc operations are complex: - UDN namespace creation (CreateSpecificNamespaceUDN) - Background exec operations - CRD/ClusterVersion queries with jsonpath - Pause/resume flowcollector operations - Various kubectl create/delete/get operations
Replaced all remaining oc calls (30 → 0): - Background exec operations with exec.Command().Start() - CRD checks with apiextClient (added to K8sClient) - Secret checks with client-go APIs - Pod logs with getPodLogs helper - FlowCollector queries with dynamic client + unstructured - Pause/resume operations with patchFlowCollectorMerge - Component listing with kubectl commands - UDN namespace creation with kubectl - Various kubectl create/delete operations - Namespace labeling with labelNamespace helper Fixes: - Added apiextClient to K8sClient struct - Fixed removeLabelFromNamespace bug - Removed compat_otp import entirely - Removed exutil import from loki_storage.go - Fixed undefined ctx/gvr declarations - Updated test_flowcollectorslice.go loki method calls test_flowcollector.go: ✅ 0 oc calls (100% migrated - 3714 lines) Build: ✅ Success
There was a problem hiding this comment.
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
integration-tests/backend/util.go (1)
130-142: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winApply all template documents through the shared YAML path.
yaml.Unmarshalonly loads one document, and the localKind + "s"GVR derivation has the same pluralization issue asapplyObject. Use the multi-doc apply path, then wait forrto appear.Proposed direction
- data, err := os.ReadFile(file) - if err != nil { - return fmt.Errorf("failed to read file %s: %w", file, err) - } - - var obj unstructured.Unstructured - if err := yaml.Unmarshal(data, &obj); err != nil { - return fmt.Errorf("failed to unmarshal YAML: %w", err) - } - - gvk := obj.GroupVersionKind() - gvr := schema.GroupVersionResource{ - Group: gvk.Group, - Version: gvk.Version, - Resource: strings.ToLower(gvk.Kind) + "s", - } - - _, err = client.dynamic.Resource(gvr).Namespace(r.Namespace).Create(ctx, &obj, metav1.CreateOptions{}) - if err != nil && !apierrors.IsAlreadyExists(err) { - return fmt.Errorf("failed to create resource: %w", err) - } + if err := client.ApplyYAMLWithNamespace(ctx, file, r.Namespace); err != nil { + return fmt.Errorf("failed to apply resource template: %w", err) + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/util.go` around lines 130 - 142, The single-document YAML handling in the apply flow is incomplete: `yaml.Unmarshal` only processes one document, and the ad hoc `GroupVersionResource` construction from `obj.GroupVersionKind()` duplicates the same pluralization problem as `applyObject`. Update this path to use the shared multi-document YAML apply logic already used elsewhere, so every template document is applied through the same helper and resource mapping, then wait for the target `r` to become present after all documents are created.integration-tests/backend/udn.go (1)
327-350: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winWait for layer2 dual-stack UDN readiness.
The layer2 dual-stack branch creates the UDN and exits without
waitUDNCRDApplied; only the default branch waits.Proposed fix
case "dualstack": udncrd = udnCRDResource{ crdname: crdName, namespace: namespace, @@ template: udnCRDLayer2dualStack, } udncrd.createLayer2DualStackUDNCRD(client) default: @@ } udncrd.createLayer2SingleStackUDNCRD(client) - err := waitUDNCRDApplied(client, namespace, udncrd.crdname) - o.Expect(err).NotTo(o.HaveOccurred()) } + err := waitUDNCRDApplied(client, namespace, udncrd.crdname) + o.Expect(err).NotTo(o.HaveOccurred())🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/udn.go` around lines 327 - 350, The layer2 dual-stack path in the UDN creation flow returns immediately after createLayer2DualStackUDNCRD without waiting for readiness. Update the "layer2" switch branch in udnCRDResource handling so the dualstack case also calls waitUDNCRDApplied, matching the single-stack path; use the existing createLayer2DualStackUDNCRD and waitUDNCRDApplied helpers to keep behavior consistent.integration-tests/backend/loki_client.go (1)
33-57: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winCompare memory using the same unit.
remainingMemoryis converted to Mi, butrequiredMemory.MilliValue()is milli-bytes. This makes the Loki resource gate wrong.Proposed fix
- var remainingCPU, remainingMemory int64 + var remainingCPU, remainingMemoryBytes int64 @@ cpu := node.Status.Allocatable.Cpu().MilliValue() mem := node.Status.Allocatable.Memory().Value() remainingCPU += cpu - remainingMemory += mem / (1024 * 1024) // Convert to Mi + remainingMemoryBytes += mem } @@ - e2e.Logf("the remaining cpu is: %d, and the remaning memory is: %d", remainingCPU, remainingMemory) - return remainingCPU > requiredCPU.MilliValue() && remainingMemory > requiredMemory.MilliValue() + e2e.Logf("the remaining cpu is: %d, and the remaning memory is: %d", remainingCPU, remainingMemoryBytes) + return remainingCPU > requiredCPU.MilliValue() && remainingMemoryBytes > requiredMemory.Value()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/loki_client.go` around lines 33 - 57, The memory comparison in compareClusterResources is using mismatched units, since remainingMemory is accumulated in Mi while requiredMemory is logged and compared via MilliValue(). Update the Loki resource gate so both sides use the same unit consistently, using the parsed quantity from k8sresource.ParseQuantity and a matching conversion for remainingMemory before the final comparison.
🟠 Major comments (29)
integration-tests/backend/test_flowcollectorslice.go-488-510 (1)
488-510: 🔒 Security & Privacy | 🟠 Major | ⚡ Quick winCreate the rolebindings before switching to the test user.
Line 490 moves kubectl to
userContext, but Lines 495-509 and 644-647 still use kubectl to grant RBAC. Those commands now run as the unprivileged user they are supposed to authorize, so setup will fail if the switch worked, and will silently test the wrong identity if it did not. Keep the rolebinding creation on the admin context, then switch touserContextfor the permission checks.Also applies to: 644-647
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/test_flowcollectorslice.go` around lines 488 - 510, The RBAC setup is happening after kubectl switches to userContext, so the rolebinding creation in the test flow is executed as the target unprivileged user instead of the admin context. Move the rolebinding creation calls in the test flow around the kubectl config use-context step so `exec.Command("kubectl", "create", "rolebinding", ...)` runs before switching contexts, then keep the context switch only for the subsequent permission checks. Apply the same fix anywhere else in this test that creates RBAC bindings after the context switch, including the later rolebinding block referenced by the same pattern.integration-tests/backend/test_exporters.go-139-147 (1)
139-147: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winRead the FlowCollector as cluster-scoped.
DeleteFlowcollectorinintegration-tests/backend/flowcollector.godeletes"cluster"without a namespace, but this check adds.Namespace(namespace). That makes the verification path inconsistent with the resource contract and can 404 before the exporter assertion runs. Also checkResolveResourceinstead of discarding it.Suggested fix
g.By("Verify flowcollector is deployed with IPFIX exporter") ctx := context.Background() - gvr, _ := client.ResolveResource("flowcollector") - fc, err := client.dynamic.Resource(gvr).Namespace(namespace).Get(ctx, "cluster", metav1.GetOptions{}) + gvr, err := client.ResolveResource("flowcollector") + o.Expect(err).NotTo(o.HaveOccurred()) + fc, err := client.dynamic.Resource(gvr).Get(ctx, "cluster", metav1.GetOptions{}) o.Expect(err).ToNot(o.HaveOccurred())🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/test_exporters.go` around lines 139 - 147, The FlowCollector lookup in the exporter test is using a namespace even though the resource is cluster-scoped, which makes the verification inconsistent with the delete path and can cause a 404. Update the `ResolveResource`/`dynamic.Resource(...).Get(...)` flow in `TestExporters` to fetch the `cluster` FlowCollector without `.Namespace(namespace)`, and handle the result from `ResolveResource` instead of ignoring it. Keep the existing exporter assertions after the corrected cluster-scoped fetch.integration-tests/backend/test_flowmetrics.go-67-78 (1)
67-78: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick winRegister teardown only after the FlowCollector is created.
integration-tests/backend/test_flowmetrics.go:67-78g.AfterEachstill runs onBeforeEachfailures and skips, butDeleteFlowcollectoralways deletes the cluster-scoped"cluster"FlowCollector. If setup exits beforeflow.CreateFlowcollector(client), this cleanup can remove a collector created by another spec. Move the cleanup registration to after the create call succeeds.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/test_flowmetrics.go` around lines 67 - 78, The teardown in test_flowmetrics.go is registered too early and can delete the shared cluster-scoped FlowCollector even when setup fails. Update the BeforeEach/AfterEach flow around Flowcollector.CreateFlowcollector and Flowcollector.DeleteFlowcollector so the cleanup registration happens only after CreateFlowcollector(client) succeeds, ensuring DeleteFlowcollector is not run for skipped or failed setup.integration-tests/backend/template_processor.go-143-145 (1)
143-145: 🔒 Security & Privacy | 🟠 Major | ⚡ Quick winWrite processed manifests with private temp-file permissions.
Template output can include Secrets or credentials.
0644makes the processed manifest readable by other local users while the test is running; useos.CreateTemp, which creates files with0600.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/template_processor.go` around lines 143 - 145, The temporary manifest file in the template processing flow is being created with world-readable permissions, which can expose Secrets or credentials. Update the write logic in the template processor helper to use os.CreateTemp instead of building a path with filepath.Join/os.TempDir and os.WriteFile, so the file is created with private 0600 permissions; keep the surrounding output serialization and cleanup behavior in the same processing path.integration-tests/backend/util.go-173-175 (1)
173-175: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winStrip the leading dot before
NestedString.For jsonpaths like
{.status.phase},strings.Trim(jsonPath, "{}")leaves.status.phase, sostrings.Splitproduces an empty first field and lookups never match.Proposed fix
- fields := strings.Split(strings.Trim(jsonPath, "{}"), ".") + path := strings.TrimPrefix(strings.Trim(jsonPath, "{}"), ".") + fields := strings.Split(path, ".")Apply the same change to
assertResourceStatus.Also applies to: 261-263
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/util.go` around lines 173 - 175, The jsonpath parsing in the `NestedString` lookup still leaves a leading dot for expressions like `{.status.phase}`, causing the first path segment to be empty and the lookup to fail. Update the parsing in the helper that builds `fields` before calling `unstructured.NestedString` to strip any leading dot after removing `{}` so the split path starts with `status`; apply the same correction in `assertResourceStatus` as well.integration-tests/backend/template_processor.go-105-109 (1)
105-109: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winDo not let
ignoreUnknownbypass required parameters.
--ignore-unknown-parameters=trueshould ignore extra supplied keys, not allow required template parameters to become empty strings. Missing required values should still return an error.Proposed fix
- } else if param.Required && !ignoreUnknown { + } else if param.Required { return "", fmt.Errorf("parameter %s is required but not provided", param.Name)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/template_processor.go` around lines 105 - 109, `ignoreUnknown` in `processTemplate` is incorrectly allowing required parameters to fall through to the empty-string branch. Update the required-parameter handling so `param.Required` always returns an error when the value is missing, regardless of `ignoreUnknown`, and keep `ignoreUnknown` limited to skipping unexpected extra keys; check the `param.Required` / `paramValues` logic in `template_processor.go` to make sure only optional parameters get default empty strings.integration-tests/backend/template_processor.go-134-140 (1)
134-140: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winSeparate generated YAML documents with a newline.
When
rawObj.Rawdoes not end with\n, the next---is appended to the previous document body, producing invalid YAML.Proposed fix
for i, objYAML := range processedYAMLs { if i > 0 { - output.WriteString("---\n") + output.WriteString("\n---\n") } output.Write(objYAML) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/template_processor.go` around lines 134 - 140, The multi-document assembly in template_processor.go is missing a newline before separators when a prior YAML chunk does not end with one, causing the next document marker to be glued to the body. Update the output-building logic in the loop that writes processedYAMLs so each document separator is always on its own line, preserving valid YAML regardless of whether rawObj.Raw ends with a trailing newline.integration-tests/backend/k8s_client.go-373-376 (1)
373-376: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winDefault namespace only after scope resolution
ApplyYAMLWithNamespaceandApplyResourceFromFileset a namespace before resource scope is known, so cluster-scoped manifests end up on namespaced requests and fail. Also replace theKind + "s"GVR construction inintegration-tests/backend/k8s_client.goandintegration-tests/backend/util.gowith REST-mapper lookup so non-standard plurals resolve correctly.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/k8s_client.go` around lines 373 - 376, Update ApplyYAMLWithNamespace and ApplyResourceFromFile so the default namespace is applied only after resource scope is resolved, not before, and skip namespace assignment for cluster-scoped objects. Also replace the manual Kind + "s" GVR construction in the k8s client and helper code with REST-mapper-based lookup so resource pluralization is resolved correctly for non-standard kinds.integration-tests/backend/backend_suite_test.go-122-126 (1)
122-126: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick winUse
JUNIT_REPORT_FILEfor the filtered JUnit path.reporterConfig.JUnitReportstill routes through the built-in--junit-reportpath, so Ginkgo can write the unfiltered report over this file. Read onlyJUNIT_REPORT_FILEhere, or clear the flag value beforeRunSpecs.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/backend_suite_test.go` around lines 122 - 126, The JUnit report path in the backend test setup is still being taken from g.GinkgoConfiguration().JUnitReport, which can let Ginkgo write the unfiltered report over the filtered file. Update the setup around g.GinkgoConfiguration() and RunSpecs so the filtered report path is sourced only from JUNIT_REPORT_FILE, or explicitly clear the built-in junit-report value before RunSpecs to prevent the flag from being reused.integration-tests/backend/util.go-1541-1544 (1)
1541-1544: 🔒 Security & Privacy | 🟠 Major | ⚡ Quick winReuse the kubeconfig transport for OAuth
This request sends kubeadmin credentials, so it should reuse the existing kubeconfig-backed transport instead of a barehttp.TransportwithInsecureSkipVerify, and set an HTTP timeout.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/util.go` around lines 1541 - 1544, The OAuth request setup in the test helper is using a bare http.Transport with InsecureSkipVerify instead of the existing kubeconfig-backed transport, and it also lacks an HTTP timeout. Update the transport creation in the helper that builds the OAuth client to reuse the kubeconfig transport used elsewhere in the integration tests, then apply a reasonable timeout on the HTTP client so kubeadmin credential requests do not hang indefinitely.Source: Linters/SAST tools
integration-tests/backend/k8s_client.go-486-493 (1)
486-493: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winUse the mapper API that matches a resource string.
RESTMapping(schema.GroupKind{}, resource)won’t resolve the inputs passed toResolveResource; switch this helper toResourceFor, and addShortcutExpanderonly if short names likecmare meant to work too.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/k8s_client.go` around lines 486 - 493, The ResolveResource helper is using the wrong mapper call for a raw resource string; update K8sClient.ResolveResource to use the mapper API that resolves a resource name directly, rather than RESTMapping with an empty GroupKind. If short names are expected to work, wrap the mapper with ShortcutExpander before calling ResourceFor; otherwise keep the lookup limited to direct resource names and preserve the existing error handling in ResolveResource.integration-tests/backend/virtualization.go-94-107 (1)
94-107: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winSelect readiness conditions by
type, not array position.
status.conditions[0]is not guaranteed to be the Available/Ready condition. This can pass on the wrong condition or keep polling after the resource is ready.Proposed fix pattern
- firstCondition, ok := conditions[0].(map[string]interface{}) - if !ok { - return false, nil - } - - status, ok := firstCondition["status"].(string) - if !ok || status != "True" { - return false, nil + for _, condition := range conditions { + conditionMap, ok := condition.(map[string]interface{}) + if !ok { + continue + } + conditionType, _, _ := unstructured.NestedString(conditionMap, "type") + status, _, _ := unstructured.NestedString(conditionMap, "status") + if conditionType == "Available" && status == "True" { + return true, nil + } } - - return true, nil + return false, nilUse
conditionType == "Ready"inwaitUntilVMReady.Also applies to: 151-166
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/virtualization.go` around lines 94 - 107, The readiness check in waitUntilVMReady is using the first entry in status.conditions, which can miss the actual Ready/Available condition. Update the logic to iterate the conditions slice and select the condition by its type field (for example, type == "Ready") before checking its status, and apply the same change to the duplicate condition check elsewhere in the file. Use the existing waitUntilVMReady helper and the status.conditions handling as the anchor points for the fix.integration-tests/backend/udn.go-364-369 (1)
364-369: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick winOnly retry not-found errors during readiness polling.
These loops currently swallow every GET error, including RBAC, mapper, and API failures, turning actionable failures into slow timeouts.
Proposed fix
obj, efErr := client.dynamic.Resource(gvr).Namespace(ns).Get(ctx, crdName, metav1.GetOptions{}) if efErr != nil { + if !apierrors.IsNotFound(efErr) { + return false, efErr + } e2e.Logf("Failed to get UDN %v, error: %s. Trying again", crdName, efErr) return false, nil } @@ obj, efErr := client.dynamic.Resource(gvr).Get(ctx, crdName, metav1.GetOptions{}) if efErr != nil { + if !apierrors.IsNotFound(efErr) { + return false, efErr + } e2e.Logf("Failed to get CUDN %v, error: %s. Trying again", crdName, efErr) return false, nil }Also applies to: 403-408
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/udn.go` around lines 364 - 369, The readiness polling in the UDN test helper is currently retrying every GET failure, which hides real failures. Update the polling logic in the get-and-check block used by the UDN readiness wait so it only returns false/continues when the error is a not-found condition, and otherwise surfaces the error immediately. Use the existing dynamic client GET call and the polling callback in the UDN helper to detect and special-case not-found while letting RBAC, discovery/mapper, and API errors fail fast.integration-tests/backend/sctp.go-20-28 (1)
20-28: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick winBound
kubectl debugcalls with a timeout.
Eventuallycannot interrupt a callback stuck insidedebugNodeWithCommand; the provided helper usesexec.CommandContext(context.Background()), so a hung debug command can hang the suite. Add a timeout indebugNodeWithCommand.Proposed fix in `integration-tests/backend/util.go`
- ctx := context.Background() + ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute) + defer cancel()Also applies to: 38-38
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/sctp.go` around lines 20 - 28, The `debugNodeWithCommand` helper can block forever because it runs `kubectl debug` with an uncancelable background context, so `Eventually` cannot recover from a hung command. Update `debugNodeWithCommand` in the util helper to use a real timeout-capable context for `exec.CommandContext`, and make sure callers like the SCTP install flow in `sctp.go` can rely on it returning within a bounded time. Keep the existing call sites unchanged aside from any needed timeout parameter wiring if you add one.integration-tests/backend/virtualization.go-44-51 (1)
44-51: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winScan for any metal worker instead of requiring every worker to be metal.
The current loop returns
falseon the first non-metal worker and returnstruewhen the worker list is empty.Proposed fix
for _, node := range nodes.Items { instanceType, ok := node.Labels["node.kubernetes.io/instance-type"] - if !ok || !strings.Contains(instanceType, "metal") { - e2e.Logf("Cluster does not have metal worker nodes") - return false + if ok && strings.Contains(instanceType, "metal") { + return true } } - return true + e2e.Logf("Cluster does not have metal worker nodes") + return false🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/virtualization.go` around lines 44 - 51, The metal-node check in the worker scan is too strict because it returns false on the first non-metal node and true when there are no nodes; update the logic in the function that iterates over nodes.Items to return true as soon as any node has a node.kubernetes.io/instance-type containing "metal", and only return false after checking all nodes. Keep the existing e2e.Logf message for the no-metal case, but move it to the final failure path rather than inside the loop.integration-tests/backend/sctp.go-74-92 (1)
74-92: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winDo not ignore node list errors.
If either OS-specific node list fails, SCTP preparation silently continues with an empty slice and the test proceeds without enabling SCTP.
Proposed fix
rhelWorkers, err := client.clientset.CoreV1().Nodes().List(ctx, metav1.ListOptions{ LabelSelector: "node-role.kubernetes.io/worker,node.openshift.io/os_id=rhel", }) + o.Expect(err).NotTo(o.HaveOccurred()) var rhelWorkerNames []string - if err == nil && len(rhelWorkers.Items) > 0 { + if len(rhelWorkers.Items) > 0 { for _, node := range rhelWorkers.Items { rhelWorkerNames = append(rhelWorkerNames, node.Name) } } rhcosWorkers, err := client.clientset.CoreV1().Nodes().List(ctx, metav1.ListOptions{ LabelSelector: "node-role.kubernetes.io/worker,node.openshift.io/os_id=rhcos", }) + o.Expect(err).NotTo(o.HaveOccurred()) var rhcosWorkerNames []string - if err == nil && len(rhcosWorkers.Items) > 0 { + if len(rhcosWorkers.Items) > 0 { for _, node := range rhcosWorkers.Items { rhcosWorkerNames = append(rhcosWorkerNames, node.Name) } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/sctp.go` around lines 74 - 92, The SCTP node discovery in the preparation logic is swallowing errors from the CoreV1().Nodes().List calls and continuing with empty worker lists, which can hide setup failures. In the SCTP setup path, make the code in the node-listing block fail fast when either the rhel or rhcos node query returns an error instead of only checking len(items); surface the error from the client.clientset.CoreV1().Nodes().List calls so SCTP preparation stops if node discovery fails.integration-tests/backend/azure_utils.go-290-298 (1)
290-298: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winKeep sovereign cloud suffix handling consistent.
getStorageAccountURISuffixAndEnvForAzurehandles Gov, China, and Germany clouds, butgetAzureCloudStorageURISuffixonly handles Gov. The shared-key helpers will hit the public Azure endpoint on China/Germany clusters.Proposed fix
cloudName, _, _ := unstructured.NestedString(obj.Object, "status", "platformStatus", "azure", "cloudName") - if strings.ToLower(cloudName) == "azureusgovernmentcloud" { + switch strings.ToLower(cloudName) { + case "azureusgovernmentcloud": return ".blob.core.usgovcloudapi.net" + case "azurechinacloud": + return ".blob.core.chinacloudapi.cn" + case "azuregermancloud": + return ".blob.core.cloudapi.de" } - // Add other clouds as needed: - // "azurechinacloud" -> ".blob.core.chinacloudapi.cn" - // "azuregermancloud" -> ".blob.core.cloudapi.de" return ".blob.core.windows.net" // AzurePublicCloud default🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/azure_utils.go` around lines 290 - 298, The cloud suffix handling is inconsistent between getStorageAccountURISuffixAndEnvForAzure and getAzureCloudStorageURISuffix, causing shared-key helpers to fall back to the public Azure endpoint for sovereign clouds. Update getAzureCloudStorageURISuffix to recognize the same Azure cloudName values already handled in getStorageAccountURISuffixAndEnvForAzure, especially China and Germany in addition to Gov, and return the matching blob URI suffix for each case with the public suffix as the default.integration-tests/backend/azure_utils.go-168-175 (1)
168-175: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winFix
ResolveResourceto use a real kind/GVK lookup.RESTMapping(schema.GroupKind{}, gvr.Resource)passes the resource name into the version slot, so helpers likegetAzureResourceGroupFromClusterwill fail to resolveinfrastructureand other dynamic resources.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/azure_utils.go` around lines 168 - 175, `ResolveResource` is doing a REST mapping with the resource name in the wrong place, so `getAzureResourceGroupFromCluster` cannot reliably resolve dynamic resources like `infrastructure`. Update the `K8sClient.ResolveResource` path to look up the real kind/GVK (for example via the resource’s GroupKind/metadata rather than passing `gvr.Resource` as the version), and keep the `getAzureResourceGroupFromCluster` call site unchanged so it can resolve the correct mapping.integration-tests/backend/flowcollector_utils.go-96-98 (1)
96-98: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick winBound this in-pod curl.
execCommandInSpecificPodalready uses an unbounded context upstream, so thiscurl -scan hang the suite if the collector endpoint stops responding. Add--fail --max-time ...here and/or pass a timeout into the exec helper.Suggested fix
- cmd := "curl -s http://localhost:8080/records?format=json" + cmd := "curl --fail --silent --show-error --max-time 10 http://localhost:8080/records?format=json"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/flowcollector_utils.go` around lines 96 - 98, The in-pod curl in flowcollector_utils.go can hang the test suite because execCommandInSpecificPod is using an unbounded context. Update the command built near the collector HTTP query to include curl timeout/failure flags such as --fail and --max-time, and/or change execCommandInSpecificPod to accept and enforce a timeout context. Keep the fix localized to the collector polling path so the helper and its callers remain easy to identify.integration-tests/backend/gcp_utils.go-64-75 (1)
64-75: 🔒 Security & Privacy | 🟠 Major | ⚡ Quick winAvoid project-wide bucket listing for the existence check.
listGCSBuckets(client, projectID)requiresstorage.buckets.list, which is broader than needed and can fail for identities that can manage a specific bucket but cannot enumerate the project. UseBucket(bucketName).Attrs(ctx)to check existence, or create and handleAlreadyExists.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/gcp_utils.go` around lines 64 - 75, The bucket existence check in the GCS helper is too broad because listGCSBuckets(projectID) requires project-wide bucket listing permissions. Update the logic in the bucket-check flow to use the storage client’s Bucket(bucketName).Attrs(ctx) for existence verification, or switch the creation path to handle AlreadyExists directly, so the check works with identities that can access a single bucket but not enumerate all buckets.integration-tests/backend/flowcollector.go-206-210 (1)
206-210: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
ResolveResourcecan't mapflowcollectoras written.
RESTMapping(schema.GroupKind{}, gvr.Resource)passes the resource name where the mapper expects aGroupKind/version, so these delete/readiness paths won't get a valid GVR. Use the FlowCollector GVK or a direct known GVR instead.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/flowcollector.go` around lines 206 - 210, The delete/readiness flow is resolving the Flowcollector resource with the wrong identifier, so the REST mapper cannot produce a valid mapping. Update DeleteFlowcollector and the related ResolveResource/RESTMapping path to use the FlowCollector GroupVersionKind (or a known GVR directly) instead of passing only the resource name, and make sure the lookup targets the correct kind for FlowCollector so the delete/readiness operations resolve successfully.integration-tests/backend/openstack_utils.go-37-37 (1)
37-37: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winWire
clouds.yaml.verifyinto the OpenStack client.
Verifyis parsed but never applied when buildingAuthenticatedClient, soverify: falsestill leaves TLS verification on. That breaks OpenStack setups with self-signed or internal certs; set the providerHTTPClientfrom this flag or load the cloud config via gophercloud’sclouds.yamlhelpers.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/openstack_utils.go` at line 37, Wire the parsed clouds.yaml Verify flag through the OpenStack client setup so it actually controls TLS verification. Update the logic around AuthenticatedClient in openstack_utils.go to use the provider HTTPClient configured from this field, or switch to gophercloud’s clouds.yaml helper path so Verify=false disables certificate verification as intended.integration-tests/backend/flowcollector.go-255-267 (1)
255-267: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winDon't key readiness off
reasonsubstring matches.
strings.Contains(reason, "Ready")returns true for values likeNotReady, and only checkingconditions[0]can miss the actual Ready condition entirely. Match theReadycondition bytype/statusacross all conditions instead.Suggested fix
- condition := conditions[0].(map[string]interface{}) - reason, _, _ := unstructured.NestedString(condition, "reason") - message, _, _ := unstructured.NestedString(condition, "message") - - if strings.Contains(reason, "Ready") { - return true, nil - } - - e2e.Logf("flowcollector status is %s due to %s", reason, message) + for _, raw := range conditions { + condition, ok := raw.(map[string]interface{}) + if !ok { + continue + } + conditionType, _, _ := unstructured.NestedString(condition, "type") + status, _, _ := unstructured.NestedString(condition, "status") + reason, _, _ := unstructured.NestedString(condition, "reason") + message, _, _ := unstructured.NestedString(condition, "message") + + if conditionType == "Ready" && status == "True" { + return true, nil + } + + e2e.Logf("flowcollector condition %s is %s due to %s", conditionType, status, message) + _ = reason + } return false, nil🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/flowcollector.go` around lines 255 - 267, The readiness check in flowcollector.go is incorrectly using a substring match on the first status condition’s reason, which can treat non-ready states as ready. Update the logic in the FlowCollector polling code to iterate over all entries from unstructured.NestedSlice(fc.Object, "status", "conditions") and identify the condition whose type is Ready, then verify its status rather than inspecting reason. Keep the existing condition parsing helpers, but replace the strings.Contains(reason, "Ready") check with an explicit Ready condition/status match.integration-tests/backend/kafka.go-187-195 (1)
187-195: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winCheck
Ready=True, not just condition presence.Strimzi resources can have a
Readycondition withstatus: "False"/"Unknown". This currently reports readiness as soon as the condition exists.Proposed fix
condType, ok := condMap["type"].(string) - if ok && condType == "Ready" { + condStatus, statusOK := condMap["status"].(string) + if ok && statusOK && condType == "Ready" && condStatus == "True" { return true, nil }condType, ok := condMap["type"].(string) - if ok && condType == "Ready" { + condStatus, statusOK := condMap["status"].(string) + if ok && statusOK && condType == "Ready" && condStatus == "True" { e2e.Logf("Waiting for kafka status %s", condType) return true, nil }Also applies to: 221-229
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/kafka.go` around lines 187 - 195, The readiness check in `hasReadyCondition` (and the similar condition scan in the other referenced block) currently returns true as soon as it finds a `Ready` condition, even if its status is not `True`. Update the condition handling to inspect the `status` field on the `Ready` condition and only report ready when it is explicitly `True`; otherwise keep scanning or return false. Use the existing `condMap`/`condType` logic to locate the `Ready` condition and add the status check there.integration-tests/backend/util_cloud.go-23-24 (1)
23-24: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winReturn errors when cloud values are missing.
These helpers currently return empty region/project/account/key with
nilerror when fields or Secret keys are absent.Proposed fix
import ( "context" + "fmt" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured" ) @@ - region, _, err := unstructured.NestedString(obj.Object, "status", "platformStatus", "aws", "region") - return region, err + region, found, err := unstructured.NestedString(obj.Object, "status", "platformStatus", "aws", "region") + if err != nil { + return "", err + } + if !found || region == "" { + return "", fmt.Errorf("aws region not found in infrastructure status") + } + return region, nil @@ - projectID, _, err := unstructured.NestedString(obj.Object, "status", "platformStatus", "gcp", "projectID") - return projectID, err + projectID, found, err := unstructured.NestedString(obj.Object, "status", "platformStatus", "gcp", "projectID") + if err != nil { + return "", err + } + if !found || projectID == "" { + return "", fmt.Errorf("gcp projectID not found in infrastructure status") + } + return projectID, nil @@ for _, container := range imageRegistry.Spec.Template.Spec.Containers { for _, env := range container.Env { if env.Name == "REGISTRY_STORAGE_AZURE_ACCOUNTNAME" { accountName = env.Value break } } } + if accountName == "" { + return "", "", fmt.Errorf("azure storage account name not found in image-registry deployment") + } @@ - accountKey := secret.Data["REGISTRY_STORAGE_AZURE_ACCOUNTKEY"] + accountKey, ok := secret.Data["REGISTRY_STORAGE_AZURE_ACCOUNTKEY"] + if !ok || len(accountKey) == 0 { + return accountName, "", fmt.Errorf("azure storage account key not found in image registry secret") + } return accountName, string(accountKey), nil }Also applies to: 40-41, 54-69
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/util_cloud.go` around lines 23 - 24, The cloud helper functions currently treat missing nested values or Secret keys as success, returning empty strings with nil errors; update the helpers in util_cloud.go to validate the extracted values and return a non-nil error when region/project/account/key fields are absent. Use the existing nested lookup helpers around the region/project/account/key accessors (the functions covering the aws/gcp/azure paths) to detect missing fields after unstructured.NestedString or Secret key reads, and propagate a descriptive error instead of returning an empty value.integration-tests/backend/multitenants.go-202-206 (1)
202-206: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winPreserve the NotFound-tolerant cleanup path.
Line 206 re-asserts
err, so a NotFound still fails after being intentionally ignored.Proposed fix
err = client.dynamic.Resource(gvr).Delete(ctx, crbName, metav1.DeleteOptions{}) if err != nil && !apierrors.IsNotFound(err) { o.Expect(err).NotTo(o.HaveOccurred()) } - o.Expect(err).NotTo(o.HaveOccurred()) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/multitenants.go` around lines 202 - 206, The cleanup in the dynamic Delete path re-checks err after intentionally ignoring NotFound, which causes the tolerant teardown to fail anyway. Update the Delete handling in the multitenants test cleanup so the NotFound case is treated as success and only unexpected errors are asserted. Keep the logic centered around client.dynamic.Resource(gvr).Delete and the existing apierrors.IsNotFound check, and remove the redundant final expectation on err.integration-tests/backend/loki.go-75-90 (1)
75-90: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winReturn on the first workload readiness error.
erris overwritten on each loop iteration, so an earlier failed deployment/statefulset can be masked by a later successful one.Proposed fix
func (l lokiStack) waitForLokiStackToBeReady(client *K8sClient) error { - var err error for _, deploy := range []string{l.Name + "-distributor", l.Name + "-gateway", l.Name + "-querier", l.Name + "-query-frontend"} { - err = waitForDeploymentPodsToBeReady(client, l.Namespace, deploy) + if err := waitForDeploymentPodsToBeReady(client, l.Namespace, deploy); err != nil { + return err + } } for _, ss := range []string{l.Name + "-compactor", l.Name + "-index-gateway", l.Name + "-ingester"} { - err = waitForStatefulsetReady(client, l.Namespace, ss) + if err := waitForStatefulsetReady(client, l.Namespace, ss); err != nil { + return err + } } if isWorkloadIdentityCluster(client) { currentPlatform := checkPlatform(client) @@ } } - return err + return nil }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/loki.go` around lines 75 - 90, The readiness check in lokiStack.waitForLokiStackToBeReady overwrites err on each deployment/statefulset iteration, so a failure can be hidden by a later success. Update the loops to stop and return immediately on the first non-nil result from waitForDeploymentPodsToBeReady or waitForStatefulsetReady, preserving the earliest readiness error. Keep the workload-identity validation block unchanged, and ensure the function returns the first encountered error instead of the last one.integration-tests/backend/ip_utils.go-67-88 (1)
67-88: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winFail fast when the expected pod IPs are missing.
Returning
("", "")lets callers buildcurl :8080, so the connectivity check no longer targets the destination pod.Proposed fix
if (ipStack == "ipv6single") || (ipStack == "ipv4single") { - if len(pod.Status.PodIPs) > 0 { - podIP := pod.Status.PodIPs[0].IP - e2e.Logf("The pod %s IP in namespace %s is %q", podName, namespace, podIP) - return podIP, "" - } - return "", "" + o.Expect(pod.Status.PodIPs).NotTo(o.BeEmpty(), "pod %s/%s has no assigned IP", namespace, podName) + podIP := pod.Status.PodIPs[0].IP + e2e.Logf("The pod %s IP in namespace %s is %q", podName, namespace, podIP) + return podIP, "" } else if ipStack == "dualstack" { - if len(pod.Status.PodIPs) >= 2 { - podIP1 := pod.Status.PodIPs[1].IP - e2e.Logf("The pod's %s 1st IP in namespace %s is %q", podName, namespace, podIP1) - podIP2 := pod.Status.PodIPs[0].IP - e2e.Logf("The pod's %s 2nd IP in namespace %s is %q", podName, namespace, podIP2) - if netutils.IsIPv6String(podIP1) { - e2e.Logf("This is IPv4 primary dual stack cluster with IP %s", podIP1) - return podIP1, podIP2 - } - e2e.Logf("This is IPv6 primary dual stack cluster with IP %s", podIP2) - return podIP2, podIP1 + o.Expect(len(pod.Status.PodIPs)).To(o.BeNumerically(">=", 2), "pod %s/%s does not have dual-stack IPs", namespace, podName) + podIP1 := pod.Status.PodIPs[1].IP + e2e.Logf("The pod's %s 1st IP in namespace %s is %q", podName, namespace, podIP1) + podIP2 := pod.Status.PodIPs[0].IP + e2e.Logf("The pod's %s 2nd IP in namespace %s is %q", podName, namespace, podIP2) + if netutils.IsIPv6String(podIP1) { + e2e.Logf("This is IPv4 primary dual stack cluster with IP %s", podIP1) + return podIP1, podIP2 } + e2e.Logf("This is IPv6 primary dual stack cluster with IP %s", podIP2) + return podIP2, podIP1 }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/ip_utils.go` around lines 67 - 88, The pod IP lookup in the IP helper returns empty strings when the expected addresses are missing, which lets callers continue with an invalid destination. Update the pod IP retrieval logic in the function handling ipStack cases to fail fast instead of returning ("", "") when PodIPs is empty or dual-stack is incomplete, and surface a clear error to the caller so connectivity checks in this path can stop immediately.integration-tests/backend/kafka.go-128-164 (1)
128-164: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winResolve resources with the correct mapper input.
ResolveResourcepasses the resource string intoRESTMapping(schema.GroupKind{}, ...), but that API resolves a kind, not a resource name. These Kafka deletes will fail to resolve before hitting the API.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/kafka.go` around lines 128 - 164, The Kafka delete helpers are calling ResolveResource with resource names, but RESTMapping in that path expects a Kind, so resource resolution can fail before the delete request is sent. Update the delete methods in Kafka, KafkaTopic, KafkaNodePool, and the related user cleanup to pass the correct kind identifier into ResolveResource (or otherwise map the kind-to-resource correctly) so the dynamic client can resolve the GVR before calling client.dynamic.Resource(...).Delete.
🟡 Minor comments (2)
integration-tests/backend/backend_suite_test.go-162-165 (1)
162-165: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick winDo not count pending specs as failed.
SpecStatePendingis not a failed execution state, so this summary can report failures while the suite itself succeeds. Move pending specs to the skipped bucket.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/backend_suite_test.go` around lines 162 - 165, The suite summary in backend_suite_test.go is treating SpecStatePending as a failure in the failed-spec accounting. Update the switch in the report aggregation logic so only true failure states are appended to failedSpecs, and move SpecStatePending into the skipped branch alongside SpecStateSkipped; use the spec report handling around the failedSpecs and skipped buckets to locate the change.integration-tests/backend/loki.go-116-124 (1)
116-124: 🩺 Stability & Availability | 🟡 Minor | ⚡ Quick winDo not ignore PVC cleanup errors.
A failed PVC delete leaves storage behind and can pollute later tests.
Proposed fix
- _ = client.clientset.CoreV1().PersistentVolumeClaims(l.Namespace).DeleteCollection(ctx, metav1.DeleteOptions{}, metav1.ListOptions{ + err := client.clientset.CoreV1().PersistentVolumeClaims(l.Namespace).DeleteCollection(ctx, metav1.DeleteOptions{}, metav1.ListOptions{ LabelSelector: labelSelector, }) + o.Expect(err).NotTo(o.HaveOccurred())🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@integration-tests/backend/loki.go` around lines 116 - 124, The PVC cleanup in removeLokiStack is currently ignoring delete failures, which can leave storage behind and affect later tests. Update lokiStack.removeLokiStack to handle and surface the error returned by client.clientset.CoreV1().PersistentVolumeClaims(...).DeleteCollection instead of discarding it, and ensure the Resource.clear cleanup path is treated consistently so cleanup failures are not silently lost.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 461bf04d-180b-4c3d-83be-ef90c41ee880
⛔ Files ignored due to path filters (1)
integration-tests/backend/go.sumis excluded by!**/*.sum
📒 Files selected for processing (33)
integration-tests/backend/Dockerfileintegration-tests/backend/Makefileintegration-tests/backend/alerts.gointegration-tests/backend/aws_sts.gointegration-tests/backend/azure_utils.gointegration-tests/backend/backend_suite_test.gointegration-tests/backend/custom_metrics.gointegration-tests/backend/flowcollector.gointegration-tests/backend/flowcollector_utils.gointegration-tests/backend/flowcollectorslice.gointegration-tests/backend/gcp_utils.gointegration-tests/backend/go.modintegration-tests/backend/ip_utils.gointegration-tests/backend/k8s_client.gointegration-tests/backend/kafka.gointegration-tests/backend/loki.gointegration-tests/backend/loki_client.gointegration-tests/backend/loki_storage.gointegration-tests/backend/metrics.gointegration-tests/backend/multitenants.gointegration-tests/backend/openstack_utils.gointegration-tests/backend/operator.gointegration-tests/backend/sctp.gointegration-tests/backend/template_processor.gointegration-tests/backend/test_exporters.gointegration-tests/backend/test_flowcollector.gointegration-tests/backend/test_flowcollectorslice.gointegration-tests/backend/test_flowmetrics.gointegration-tests/backend/udn.gointegration-tests/backend/util.gointegration-tests/backend/util_cloud.gointegration-tests/backend/util_workload_identity.gointegration-tests/backend/virtualization.go
🚧 Files skipped from review as they are similar to previous changes (5)
- integration-tests/backend/flowcollectorslice.go
- integration-tests/backend/alerts.go
- integration-tests/backend/metrics.go
- integration-tests/backend/operator.go
- integration-tests/backend/loki_storage.go
|
@Amoghrd: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Description
Move away from openshift/origin dependency in integration-tests
Dependencies
n/a
Checklist
Summary by CodeRabbit
New Features
Bug Fixes