Skip to content

CORENET-6714: Enable Network Observability on Day 0#2925

Open
OlivierCazade wants to merge 1 commit into
openshift:masterfrom
OlivierCazade:day0
Open

CORENET-6714: Enable Network Observability on Day 0#2925
OlivierCazade wants to merge 1 commit into
openshift:masterfrom
OlivierCazade:day0

Conversation

@OlivierCazade

@OlivierCazade OlivierCazade commented Mar 10, 2026

Copy link
Copy Markdown

Add Network Observability Day-0 Installation Support

The replace instruction in the go.mod file need to be removed before merging this branch, this can be removed once the PR on the API repo is merged

This PR introduces automatic Day-0 installation of the Network Observability operator, controlled by a new installNetworkObservability field in the Network CR.

Summary

The Cluster Network Operator (CNO) now handles initial deployment of the Network Observability operator and FlowCollector, enabling network flow monitoring out-of-the-box for OpenShift clusters while respecting user preferences and cluster topology.

Key Features

  1. Opt-out Model with SNO Detection

    • Network Observability is installed by default on multi-node clusters
    • Single Node OpenShift (SNO) clusters are excluded by default to conserve resources
    • Users can explicitly enable/disable via spec.installNetworkObservability
  2. Deployment Tracking

    • After successful deployment, the NetworkObservabilityDeployed condition is set
    • All subsequent reconciliations immediately return when this condition is present
    • This prevents reinstallation if users delete the operator or FlowCollector
    • Users maintain full control after initial deployment
  3. Status Reporting

    • New ObservabilityConfig status level for degraded state reporting
    • Comprehensive error handling with clear degraded status messages
    • Status cleared when Network Observability is running or disabled

Configuration

The spec.installNetworkObservability field accepts three values:

  • "" (empty/nil): Default behavior - install on multi-node clusters, skip on SNO
  • "Enable": Explicitly enable installation, even on SNO clusters
  • "Disable": Explicitly disable installation

RBAC Changes

Added permissions for the CNO to manage:

  • Subscriptions and ClusterServiceVersions (operators.coreos.com)
  • FlowCollectors (flows.netobserv.io)
  • Namespaces for operator deployment

Related

/cc @stleerh

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 10, 2026
@openshift-ci openshift-ci Bot requested a review from stleerh March 10, 2026 14:08
@openshift-ci-robot

openshift-ci-robot commented Mar 10, 2026

Copy link
Copy Markdown
Contributor

@OlivierCazade: This pull request references CORENET-6714 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Add Network Observability Day-0 Installation Support

The replace instruction in the go.mod file need to be removed before merging this branch, this can be removed once the PR on the API repo is merged

This PR introduces automatic Day-0 installation of the Network Observability operator, controlled by a new installNetworkObservability field in the Network CR.

Summary

The Cluster Network Operator (CNO) now handles initial deployment of the Network Observability operator and FlowCollector, enabling network flow monitoring out-of-the-box for OpenShift clusters while respecting user preferences and cluster topology.

Key Features

  1. Opt-out Model with SNO Detection
  • Network Observability is installed by default on multi-node clusters
  • Single Node OpenShift (SNO) clusters are excluded by default to conserve resources
  • Users can explicitly enable/disable via spec.installNetworkObservability
  1. Deployment Tracking
  • After successful deployment, the NetworkObservabilityDeployed condition is set
  • All subsequent reconciliations immediately return when this condition is present
  • This prevents reinstallation if users delete the operator or FlowCollector
  • Users maintain full control after initial deployment
  1. Status Reporting
  • New ObservabilityConfig status level for degraded state reporting
  • Comprehensive error handling with clear degraded status messages
  • Status cleared when Network Observability is running or disabled

Configuration

The spec.installNetworkObservability field accepts three values:

  • "" (empty/nil): Default behavior - install on multi-node clusters, skip on SNO
  • "Enable": Explicitly enable installation, even on SNO clusters
  • "Disable": Explicitly disable installation

RBAC Changes

Added permissions for the CNO to manage:

  • Subscriptions and ClusterServiceVersions (operators.coreos.com)
  • FlowCollectors (flows.netobserv.io)
  • Namespaces for operator deployment

Related

/cc @stleerh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Mar 10, 2026

Copy link
Copy Markdown

Summary by CodeRabbit

  • New Features

    • Added automated Network Observability deployment during cluster setup.
    • Included a new sample configuration option to enable Network Observability installation.
  • Bug Fixes

    • Improved handling for existing, missing, or not-yet-ready observability components.
    • Added retry and status reporting so deployment progress and failures are reflected more clearly.

Walkthrough

Adds a new observability-controller reconciler that installs the Network Observability Operator via OLM v1/v0 and creates a FlowCollector CR, gated by the NetworkObservabilityInstall feature flag and spec.networkObservability.installationPolicy. Includes RBAC/OLM manifests, controller registration, and comprehensive tests.

Changes

Network Observability Controller

Layer / File(s) Summary
Operator and FlowCollector manifests
bindata/observability/07-observability-operator.yaml, bindata/observability/08-flowcollector.yaml
Adds the Namespace, ServiceAccount, ClusterRole/Binding, namespace-scoped Role/Binding, OLM ClusterExtension, and FlowCollector YAML resources that the controller server-side-applies during installation.
Controller registration and scaffolding
pkg/controller/add_networkconfig.go, pkg/controller/observability/observability_controller.go (L1–79)
Imports the observability package, appends observability.Add to AddToManagerFuncs, and defines exported constants plus the Add() entrypoint that wires the reconciler to watch the singleton configv1.Network.
Reconcile loop and installation decision logic
pkg/controller/observability/observability_controller.go (L80–321)
Implements Reconcile, isFeatureGateEnabled, shouldInstallNetworkObservability, isSingleNodeCluster, wasNetworkObservabilityDeployed, and setNetworkObservabilityCondition; handles feature-gate gating, SNO detection, policy branching, and NetworkObservabilityDeployed condition upserts.
OLM detection and apply helpers
pkg/controller/observability/observability_controller.go (L323–611)
Implements isNetObservOperatorInstalled, checkOLMv1Installation, checkOLMv0Installation, applyManifest, installNetObservOperator, waitForNetObservOperator, isFlowCollectorExists, and createFlowCollector.
Controller tests
pkg/controller/observability/observability_controller_test.go
Covers the full reconciler surface: policy/SNO branching, OLM v1/v0 detection, apply helpers, idempotency, policy transitions, concurrent reconciles, status-condition assertions, and feature-gate behavior.
Sample config
sample-cluster-network-config.yaml
Adds a networkObservability.installationPolicy: InstallAndEnable example with inline comments for valid policy values.

Sequence Diagram(s)

sequenceDiagram
  participant CNO as cluster-network-operator
  participant OLMv1 as OLM v1 (ClusterExtension)
  participant OLMv0 as OLM v0 (CSV)
  participant Kube as Kubernetes API

  CNO->>CNO: isFeatureGateEnabled?
  CNO->>CNO: shouldInstallNetworkObservability?
  CNO->>Kube: Get flowcollectors CRD
  CNO->>OLMv1: Get ClusterExtension netobserv-operator
  alt OLMv1 not found
    CNO->>OLMv0: List CSVs in openshift-netobserv-operator
  end
  alt operator not installed
    CNO->>Kube: SSA apply 07-observability-operator.yaml
    CNO->>OLMv1: Poll Installed condition == True
  end
  CNO->>Kube: Get FlowCollector cluster
  alt FlowCollector missing
    CNO->>Kube: Ensure openshift-network-observability namespace
    CNO->>Kube: SSA apply 08-flowcollector.yaml
  end
  CNO->>Kube: Update operatorv1.Network condition NetworkObservabilityDeployed
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 11

❌ Failed checks (1 warning, 10 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 42.19% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Stable And Deterministic Test Names ❓ Inconclusive Repository clone failed, so this custom check could not run with code access. Retry the review run. If this persists, inspect pre-merge custom-check logs for infrastructure or agent runtime failures.
Test Structure And Quality ❓ Inconclusive Repository clone failed, so this custom check could not run with code access. Retry the review run. If this persists, inspect pre-merge custom-check logs for infrastructure or agent runtime failures.
Microshift Test Compatibility ❓ Inconclusive Repository clone failed, so this custom check could not run with code access. Retry the review run. If this persists, inspect pre-merge custom-check logs for infrastructure or agent runtime failures.
Single Node Openshift (Sno) Test Compatibility ❓ Inconclusive Repository clone failed, so this custom check could not run with code access. Retry the review run. If this persists, inspect pre-merge custom-check logs for infrastructure or agent runtime failures.
Topology-Aware Scheduling Compatibility ❓ Inconclusive Repository clone failed, so this custom check could not run with code access. Retry the review run. If this persists, inspect pre-merge custom-check logs for infrastructure or agent runtime failures.
Ote Binary Stdout Contract ❓ Inconclusive Repository clone failed, so this custom check could not run with code access. Retry the review run. If this persists, inspect pre-merge custom-check logs for infrastructure or agent runtime failures.
Ipv6 And Disconnected Network Test Compatibility ❓ Inconclusive Repository clone failed, so this custom check could not run with code access. Retry the review run. If this persists, inspect pre-merge custom-check logs for infrastructure or agent runtime failures.
No-Weak-Crypto ❓ Inconclusive Repository clone failed, so this custom check could not run with code access. Retry the review run. If this persists, inspect pre-merge custom-check logs for infrastructure or agent runtime failures.
Container-Privileges ❓ Inconclusive Repository clone failed, so this custom check could not run with code access. Retry the review run. If this persists, inspect pre-merge custom-check logs for infrastructure or agent runtime failures.
No-Sensitive-Data-In-Logs ❓ Inconclusive Repository clone failed, so this custom check could not run with code access. Retry the review run. If this persists, inspect pre-merge custom-check logs for infrastructure or agent runtime failures.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: enabling Network Observability on Day 0.
Description check ✅ Passed The description matches the implemented Day-0 installation, deployment tracking, and RBAC changes.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

level=error msg="Running error: context loading failed: failed to load packages: failed to load packages: failed to load with go/packages: err: exit status 1: stderr: go: inconsistent vendoring in :\n\tgithub.com/Masterminds/semver@v1.5.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/Masterminds/sprig/v3@v3.2.3: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/containernetworking/cni@v0.8.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/ghodss/yaml@v1.0.1-0.20190212211648-25d852aebe32: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/go-bindata/go-bindata@v3.1.2+incompatible: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/onsi/gomega@v1.39.1: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/ope

... [truncated 17357 characters] ...

red in go.mod, but not marked as explicit in vendor/modules.txt\n\tk8s.io/gengo/v2@v2.0.0-20251215205346-5ee0d033ba5b: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tk8s.io/kms@v0.35.2: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tk8s.io/kube-aggregator@v0.35.1: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tsigs.k8s.io/randfill@v1.0.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tsigs.k8s.io/structured-merge-diff/v6@v6.3.2: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\n\tTo ignore the vendor directory, use -mod=readonly or -mod=mod.\n\tTo sync the vendor directory, run:\n\t\tgo mod vendor\n"


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@openshift-ci-robot

openshift-ci-robot commented Mar 10, 2026

Copy link
Copy Markdown
Contributor

@OlivierCazade: This pull request references CORENET-6714 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Add Network Observability Day-0 Installation Support

The replace instruction in the go.mod file need to be removed before merging this branch, this can be removed once the PR on the API repo is merged

This PR introduces automatic Day-0 installation of the Network Observability operator, controlled by a new installNetworkObservability field in the Network CR.

Summary

The Cluster Network Operator (CNO) now handles initial deployment of the Network Observability operator and FlowCollector, enabling network flow monitoring out-of-the-box for OpenShift clusters while respecting user preferences and cluster topology.

Key Features

  1. Opt-out Model with SNO Detection
  • Network Observability is installed by default on multi-node clusters
  • Single Node OpenShift (SNO) clusters are excluded by default to conserve resources
  • Users can explicitly enable/disable via spec.installNetworkObservability
  1. Deployment Tracking
  • After successful deployment, the NetworkObservabilityDeployed condition is set
  • All subsequent reconciliations immediately return when this condition is present
  • This prevents reinstallation if users delete the operator or FlowCollector
  • Users maintain full control after initial deployment
  1. Status Reporting
  • New ObservabilityConfig status level for degraded state reporting
  • Comprehensive error handling with clear degraded status messages
  • Status cleared when Network Observability is running or disabled

Configuration

The spec.installNetworkObservability field accepts three values:

  • "" (empty/nil): Default behavior - install on multi-node clusters, skip on SNO
  • "Enable": Explicitly enable installation, even on SNO clusters
  • "Disable": Explicitly disable installation

RBAC Changes

Added permissions for the CNO to manage:

  • Subscriptions and ClusterServiceVersions (operators.coreos.com)
  • FlowCollectors (flows.netobserv.io)
  • Namespaces for operator deployment

Related

/cc @stleerh

Summary by CodeRabbit

  • New Features
  • Network Observability is now installable and manageable through cluster configuration—users can enable or disable network observability monitoring for their cluster.
  • Automatic deployment and lifecycle management of the observability operator and flow collection resources.
  • Network configuration status now reflects observability deployment state and health.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (4)
manifests/0000_70_cluster-network-operator_02_rbac_observability.yaml (1)

11-14: Consider adding watch and delete verbs for OLM resources.

The controller may need watch to properly track subscription and CSV state changes via informers. Additionally, delete permission on subscriptions might be needed if you ever want to support uninstallation or cleanup scenarios.

🔧 Suggested addition
   # Manage OLM resources for operator installation
   - apiGroups: ["operators.coreos.com"]
     resources: ["subscriptions", "clusterserviceversions", "operatorgroups"]
-    verbs: ["get", "list", "create", "update", "patch"]
+    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@manifests/0000_70_cluster-network-operator_02_rbac_observability.yaml` around
lines 11 - 14, The RBAC rule for apiGroups "operators.coreos.com" covering
resources ["subscriptions", "clusterserviceversions", "operatorgroups"] is
missing the "watch" and "delete" verbs; update the verbs array for that rule
(for the resources "subscriptions", "clusterserviceversions", and
"operatorgroups") to include "watch" (so informers can track state changes) and
"delete" (to allow cleanup/uninstallation) in addition to the existing verbs.
pkg/controller/observability/observability_controller_test.go (1)

1512-1518: Test creates files in working directory - may cause side effects.

This test creates a manifests/ directory and writes FlowCollectorYAML in the working directory. While defer os.Remove(FlowCollectorYAML) removes the file, it doesn't remove the manifests directory, which could persist across test runs.

🔧 Suggested fix for proper cleanup
 	err := os.MkdirAll("manifests", 0755)
 	g.Expect(err).NotTo(HaveOccurred())
+	defer os.RemoveAll("manifests")  // Clean up directory

 	// Create the FlowCollector manifest at the expected path
 	err = os.WriteFile(FlowCollectorYAML, []byte(flowCollectorManifest), 0644)
 	g.Expect(err).NotTo(HaveOccurred())
-	defer os.Remove(FlowCollectorYAML)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/observability/observability_controller_test.go` around lines
1512 - 1518, The test creates a "manifests" directory and a file at
FlowCollectorYAML using os.MkdirAll and os.WriteFile but only defers removal of
the file, leaving the manifests directory behind; update the test to always
clean up the directory by deferring a call to remove the directory (e.g. defer
os.RemoveAll("manifests")) immediately after os.MkdirAll succeeds (and keep or
replace the existing defer os.Remove(FlowCollectorYAML)), so that both the file
created from flowCollectorManifest and the manifests directory are removed after
the test.
pkg/controller/observability/observability_controller.go (2)

356-392: Potential race condition in markNetworkObservabilityDeployed.

The function has a TOCTOU (time-of-check-time-of-use) pattern: it reads the latest Network CR, modifies conditions, then updates. If another controller or user modifies the Network CR between Get and Update, the update will fail with a conflict error (which would be retried by the caller).

This is acceptable since Kubernetes retries on conflicts, but consider using a retry loop here for better resilience.

🔧 Suggested: Add retry for conflict resilience
 func (r *ReconcileObservability) markNetworkObservabilityDeployed(ctx context.Context, network *configv1.Network) error {
+	return retry.RetryOnConflict(retry.DefaultBackoff, func() error {
+		return r.doMarkNetworkObservabilityDeployed(ctx)
+	})
+}
+
+func (r *ReconcileObservability) doMarkNetworkObservabilityDeployed(ctx context.Context) error {
 	// Check if condition already exists and is true
-	for _, condition := range network.Status.Conditions {
+	latest := &configv1.Network{}
+	if err := r.client.Get(ctx, types.NamespacedName{Name: "cluster"}, latest); err != nil {
+		return err
+	}
+
+	for _, condition := range latest.Status.Conditions {
 		if condition.Type == NetworkObservabilityDeployed && condition.Status == metav1.ConditionTrue {
 			return nil // Already marked as deployed
 		}
 	}
-
-	// Get the latest version of the Network CR to avoid conflicts
-	latest := &configv1.Network{}
-	if err := r.client.Get(ctx, types.NamespacedName{Name: "cluster"}, latest); err != nil {
-		return err
-	}
 	// ... rest of the function
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/observability/observability_controller.go` around lines 356 -
392, markNetworkObservabilityDeployed currently does a Get -> modify ->
Status().Update which can fail on a concurrent update; wrap the
Get/modify/Status().Update sequence in a retry loop using
k8s.io/apimachinery/pkg/util/wait.RetryOnConflict (or retry.RetryOnConflict) to
retry on conflicts, re-fetching the latest Network (using r.client.Get) inside
the loop before applying the condition and calling r.client.Status().Update so
transient conflicts are retried safely; keep the same condition logic and return
the final error from the retry.

288-319: Polling uses parent context which may already have timeout pressure.

waitForNetObservOperator uses the passed context for polling, but also defines its own checkTimeout (10 minutes). If the caller's context has a shorter deadline, the poll will exit early with the caller's context error rather than context.DeadlineExceeded from wait.PollUntilContextTimeout.

The current implementation may work correctly in practice since the controller-runtime context typically doesn't have a deadline, but this could cause unexpected behavior in tests or if the calling context changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/observability/observability_controller.go` around lines 288 -
319, The polling uses the caller's ctx which can have a shorter deadline;
instead create a new timeout-bound context for the poll so it always runs up to
checkTimeout: inside waitForNetObservOperator, call
context.WithTimeout(context.Background(), checkTimeout) (defer cancel) and pass
that new ctx to wait.PollUntilContextTimeout (keep checkInterval, checkTimeout,
condition as before); reference function waitForNetObservOperator and variables
checkInterval/checkTimeout to locate where to replace the ctx usage.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@go.mod`:
- Line 166: The go.mod contains a temporary replace directive pointing to a
personal fork for openshift/api#2752; do not merge until the upstream PR is
merged. Add a CI/merge-block gate that fails or flags the PR when the replace
directive "replace github.com/openshift/api" is present, add a clear TODO
comment adjacent to the replace line referencing openshift/api#2752 and stating
it must be removed when that PR merges, and add repository-level tracking (e.g.,
in the PR description or a maintainer checklist) to actively monitor
openshift/api#2752 and remove the replace directive immediately once the
upstream PR is approved and merged.

In `@pkg/controller/observability/observability_controller_test.go`:
- Around line 1250-1265: The test spawns goroutines that call r.Reconcile and
uses g.Expect inside those goroutines, which can panic because Gomega's fail
handler (t.FailNow) must run in the main test goroutine; fix by removing
g.Expect from the goroutines and collecting errors via a channel (e.g., make
errCh chan error), have each goroutine send its err to errCh after calling
r.Reconcile(req), then in the main goroutine close/iterate the channel and
assert with g.Expect that all received errors are nil; alternatively create a
per-goroutine Gomega tied to a testing.T (NewWithT) if you need per-goroutine
assertions—reference r.Reconcile, req, and the done/err channel approach to
locate where to change code.

In `@pkg/controller/observability/observability_controller.go`:
- Around line 169-173: The call to markNetworkObservabilityDeployed(err) is only
logged on failure which prevents the controller from retrying and can lead to
repeated FlowCollector creation; modify the reconciler to return the error
instead of just logging it so the reconciliation is requeued—locate the call to
r.markNetworkObservabilityDeployed(ctx, &network) in the reconciler function
(the surrounding code that currently does klog.Warningf on error) and replace
the log-without-return with returning a wrapped/annotated error (or simply
return err) so the controller will retry the reconcile loop; keep any logging
but ensure the function exits with an error when
markNetworkObservabilityDeployed fails.
- Around line 144-152: When waitForNetObservOperator(ctx) times out you
currently return ctrl.Result{RequeueAfter: 0}, nil which prevents any automatic
retry; update the timeout branch in the reconciler to either return a non-zero
RequeueAfter (e.g., RequeueAfter: time.Minute*5) so the controller will re-check
later, or return a wrapped error instead of nil to trigger error-based
requeueing; adjust the block around waitForNetObservOperator, the
r.status.SetDegraded call, and the return statement so the controller will retry
(refer to waitForNetObservOperator,
r.status.SetDegraded(statusmanager.ObservabilityConfig, "OperatorNotReady",
...), and the existing return ctrl.Result{RequeueAfter: 0}, nil).

---

Nitpick comments:
In `@manifests/0000_70_cluster-network-operator_02_rbac_observability.yaml`:
- Around line 11-14: The RBAC rule for apiGroups "operators.coreos.com" covering
resources ["subscriptions", "clusterserviceversions", "operatorgroups"] is
missing the "watch" and "delete" verbs; update the verbs array for that rule
(for the resources "subscriptions", "clusterserviceversions", and
"operatorgroups") to include "watch" (so informers can track state changes) and
"delete" (to allow cleanup/uninstallation) in addition to the existing verbs.

In `@pkg/controller/observability/observability_controller_test.go`:
- Around line 1512-1518: The test creates a "manifests" directory and a file at
FlowCollectorYAML using os.MkdirAll and os.WriteFile but only defers removal of
the file, leaving the manifests directory behind; update the test to always
clean up the directory by deferring a call to remove the directory (e.g. defer
os.RemoveAll("manifests")) immediately after os.MkdirAll succeeds (and keep or
replace the existing defer os.Remove(FlowCollectorYAML)), so that both the file
created from flowCollectorManifest and the manifests directory are removed after
the test.

In `@pkg/controller/observability/observability_controller.go`:
- Around line 356-392: markNetworkObservabilityDeployed currently does a Get ->
modify -> Status().Update which can fail on a concurrent update; wrap the
Get/modify/Status().Update sequence in a retry loop using
k8s.io/apimachinery/pkg/util/wait.RetryOnConflict (or retry.RetryOnConflict) to
retry on conflicts, re-fetching the latest Network (using r.client.Get) inside
the loop before applying the condition and calling r.client.Status().Update so
transient conflicts are retried safely; keep the same condition logic and return
the final error from the retry.
- Around line 288-319: The polling uses the caller's ctx which can have a
shorter deadline; instead create a new timeout-bound context for the poll so it
always runs up to checkTimeout: inside waitForNetObservOperator, call
context.WithTimeout(context.Background(), checkTimeout) (defer cancel) and pass
that new ctx to wait.PollUntilContextTimeout (keep checkInterval, checkTimeout,
condition as before); reference function waitForNetObservOperator and variables
checkInterval/checkTimeout to locate where to replace the ctx usage.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 826708e4-9aa4-4a6f-941f-be6e5a7c5d63

📥 Commits

Reviewing files that changed from the base of the PR and between 88891a9 and b1eeed3.

⛔ Files ignored due to path filters (36)
  • go.sum is excluded by !**/*.sum
  • vendor/github.com/openshift/api/config/v1/types_network.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.deepcopy.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/config/v1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/modules.txt is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/.gitignore is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/.golangci.yml is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/.gomodcheck.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/CONTRIBUTING.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/FAQ.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/Makefile is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/OWNERS is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/OWNERS_ALIASES is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/README.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/RELEASE.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/SECURITY_CONTACTS is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/TMP-LOGGING.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/VERSIONING.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/alias.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/code-of-conduct.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/controller.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/options.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/webhook.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/client/config/config.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/client/config/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/conversion/conversion.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal_posix.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal_windows.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/scheme/scheme.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/webhook/conversion/conversion.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/webhook/conversion/decoder.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/webhook/conversion/metrics/metrics.go is excluded by !**/vendor/**, !vendor/**
📒 Files selected for processing (9)
  • go.mod
  • manifests/0000_70_cluster-network-operator_02_rbac_observability.yaml
  • manifests/07-observability-operator.yaml
  • manifests/08-flowcollector.yaml
  • pkg/controller/add_networkconfig.go
  • pkg/controller/observability/observability_controller.go
  • pkg/controller/observability/observability_controller_test.go
  • pkg/controller/statusmanager/status_manager.go
  • sample-config.yaml

Comment thread go.mod Outdated
Comment thread pkg/controller/observability/observability_controller_test.go
Comment thread pkg/controller/observability/observability_controller.go Outdated
Comment thread pkg/controller/observability/observability_controller.go Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
pkg/controller/observability/observability_controller_test.go (1)

1268-1283: ⚠️ Potential issue | 🟡 Minor

Race condition risk: g.Expect called inside goroutines.

Gomega's fail handler calls t.FailNow() which must only be called from the test goroutine. If an assertion fails inside a spawned goroutine, this can cause a test panic.

Suggested fix using error channel
 	// Run 5 concurrent reconciliations
 	done := make(chan bool, 5)
+	errChan := make(chan error, 5)
 	for i := 0; i < 5; i++ {
 		go func() {
 			_, err := r.Reconcile(context.TODO(), req)
-			// All should complete without error (idempotent)
-			g.Expect(err).NotTo(HaveOccurred())
+			errChan <- err
 			done <- true
 		}()
 	}

 	// Wait for all to complete
 	for i := 0; i < 5; i++ {
 		<-done
+		err := <-errChan
+		g.Expect(err).NotTo(HaveOccurred())
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/observability/observability_controller_test.go` around lines
1268 - 1283, The test currently calls g.Expect inside spawned goroutines (see
r.Reconcile, req, and done), which risks calling Gomega's FailNow from non-test
goroutines; instead capture errors in a channel from each goroutine and perform
the assertion in the main test goroutine. Spawn the 5 goroutines to call
r.Reconcile and send any returned error (or nil) into an errs channel, then
after reading all results from errs in the main goroutine use
g.Expect(err).NotTo(HaveOccurred()) for each entry, removing g.Expect from
inside the goroutines.
pkg/controller/observability/observability_controller.go (1)

144-152: ⚠️ Potential issue | 🟠 Major

Timeout stops reconciliation permanently without requeue.

When waitForNetObservOperator times out, returning RequeueAfter: 0 with no error means the controller stops trying. If the operator eventually succeeds, this controller won't notice and won't create the FlowCollector.

Consider returning a non-zero RequeueAfter to periodically check if the operator eventually becomes ready:

Suggested fix
 	if err := r.waitForNetObservOperator(ctx); err != nil {
 		if err == context.DeadlineExceeded {
 			klog.Errorf("Timed out waiting for Network Observability Operator to be ready after %v. Stopping reconciliation.", checkTimeout)
 			r.status.SetDegraded(statusmanager.ObservabilityConfig, "OperatorNotReady", fmt.Sprintf("Timed out waiting for Network Observability Operator to be ready after %v", checkTimeout))
-			return ctrl.Result{RequeueAfter: 0}, nil // Don't requeue
+			return ctrl.Result{RequeueAfter: 5 * time.Minute}, nil // Retry later
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/observability/observability_controller.go` around lines 144 -
152, When waitForNetObservOperator(ctx) returns context.DeadlineExceeded the
current code returns ctrl.Result{RequeueAfter: 0} which stops reconciliation
permanently; change that return to schedule a future requeue (e.g.,
ctrl.Result{RequeueAfter: someNonZeroDuration}) so the controller will
periodically re-check for the operator and eventually create the FlowCollector.
Update the branch handling DeadlineExceeded in the Reconcile logic (the block
around waitForNetObservOperator and r.status.SetDegraded) to return a non-zero
RequeueAfter (choose a sensible poll interval constant or reuse checkTimeout)
instead of 0 and keep returning nil error.
🧹 Nitpick comments (1)
pkg/controller/observability/observability_controller_test.go (1)

1507-1572: Test creates manifest file in working directory.

This test writes to manifests/08-flowcollector.yaml in the working directory. While the cleanup with defer os.Remove() handles the file, consider:

  1. If the test panics before defer is registered (lines 1534-1536), the file persists
  2. Could conflict with parallel tests using the same path

Using a temp directory and temporarily overriding the FlowCollectorYAML constant (if possible) would be more robust, but this is acceptable for an integration-style test.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/observability/observability_controller_test.go` around lines
1507 - 1572, TestReconcile_FirstTimeDeploymentSetsCondition writes to a fixed
manifests path (FlowCollectorYAML) which can leak or conflict; fix by creating a
temporary directory with os.MkdirTemp at test start, write the FlowCollector
manifest into that temp dir (use os.Create or os.WriteFile), set or inject the
FlowCollectorYAML path used by the reconciler to point to the temp file (or
refactor the code to accept a manifestPath parameter) before calling Reconcile,
and register defer os.RemoveAll(tempDir) immediately after creating the temp
directory to guarantee cleanup even on panic; update references to
FlowCollectorYAML in the test to use the temp file path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@pkg/controller/observability/observability_controller_test.go`:
- Around line 1268-1283: The test currently calls g.Expect inside spawned
goroutines (see r.Reconcile, req, and done), which risks calling Gomega's
FailNow from non-test goroutines; instead capture errors in a channel from each
goroutine and perform the assertion in the main test goroutine. Spawn the 5
goroutines to call r.Reconcile and send any returned error (or nil) into an errs
channel, then after reading all results from errs in the main goroutine use
g.Expect(err).NotTo(HaveOccurred()) for each entry, removing g.Expect from
inside the goroutines.

In `@pkg/controller/observability/observability_controller.go`:
- Around line 144-152: When waitForNetObservOperator(ctx) returns
context.DeadlineExceeded the current code returns ctrl.Result{RequeueAfter: 0}
which stops reconciliation permanently; change that return to schedule a future
requeue (e.g., ctrl.Result{RequeueAfter: someNonZeroDuration}) so the controller
will periodically re-check for the operator and eventually create the
FlowCollector. Update the branch handling DeadlineExceeded in the Reconcile
logic (the block around waitForNetObservOperator and r.status.SetDegraded) to
return a non-zero RequeueAfter (choose a sensible poll interval constant or
reuse checkTimeout) instead of 0 and keep returning nil error.

---

Nitpick comments:
In `@pkg/controller/observability/observability_controller_test.go`:
- Around line 1507-1572: TestReconcile_FirstTimeDeploymentSetsCondition writes
to a fixed manifests path (FlowCollectorYAML) which can leak or conflict; fix by
creating a temporary directory with os.MkdirTemp at test start, write the
FlowCollector manifest into that temp dir (use os.Create or os.WriteFile), set
or inject the FlowCollectorYAML path used by the reconciler to point to the temp
file (or refactor the code to accept a manifestPath parameter) before calling
Reconcile, and register defer os.RemoveAll(tempDir) immediately after creating
the temp directory to guarantee cleanup even on panic; update references to
FlowCollectorYAML in the test to use the temp file path.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b46e56ed-85bd-48c6-8a11-d0576b268eb9

📥 Commits

Reviewing files that changed from the base of the PR and between b1eeed3 and 9cc8e57.

⛔ Files ignored due to path filters (8)
  • go.sum is excluded by !**/*.sum
  • vendor/github.com/openshift/api/config/v1/types_network.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.deepcopy.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.swagger_doc_generated.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/api/features.md is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/api/features/features.go is excluded by !vendor/**, !**/vendor/**
  • vendor/modules.txt is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (4)
  • go.mod
  • pkg/controller/observability/observability_controller.go
  • pkg/controller/observability/observability_controller_test.go
  • sample-config.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • go.mod
  • sample-config.yaml

@openshift-ci-robot

openshift-ci-robot commented Mar 17, 2026

Copy link
Copy Markdown
Contributor

@OlivierCazade: This pull request references CORENET-6714 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Add Network Observability Day-0 Installation Support

The replace instruction in the go.mod file need to be removed before merging this branch, this can be removed once the PR on the API repo is merged

This PR introduces automatic Day-0 installation of the Network Observability operator, controlled by a new installNetworkObservability field in the Network CR.

Summary

The Cluster Network Operator (CNO) now handles initial deployment of the Network Observability operator and FlowCollector, enabling network flow monitoring out-of-the-box for OpenShift clusters while respecting user preferences and cluster topology.

Key Features

  1. Opt-out Model with SNO Detection
  • Network Observability is installed by default on multi-node clusters
  • Single Node OpenShift (SNO) clusters are excluded by default to conserve resources
  • Users can explicitly enable/disable via spec.installNetworkObservability
  1. Deployment Tracking
  • After successful deployment, the NetworkObservabilityDeployed condition is set
  • All subsequent reconciliations immediately return when this condition is present
  • This prevents reinstallation if users delete the operator or FlowCollector
  • Users maintain full control after initial deployment
  1. Status Reporting
  • New ObservabilityConfig status level for degraded state reporting
  • Comprehensive error handling with clear degraded status messages
  • Status cleared when Network Observability is running or disabled

Configuration

The spec.installNetworkObservability field accepts three values:

  • "" (empty/nil): Default behavior - install on multi-node clusters, skip on SNO
  • "Enable": Explicitly enable installation, even on SNO clusters
  • "Disable": Explicitly disable installation

RBAC Changes

Added permissions for the CNO to manage:

  • Subscriptions and ClusterServiceVersions (operators.coreos.com)
  • FlowCollectors (flows.netobserv.io)
  • Namespaces for operator deployment

Related

/cc @stleerh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
pkg/controller/observability/observability_controller.go (2)

157-162: ⚠️ Potential issue | 🟠 Major

Timeout path disables automatic recovery

Line 161 returns ctrl.Result{RequeueAfter: 0}, nil on timeout, so reconciliation stops and may never create the FlowCollector even if the operator becomes ready later.

Suggested fix
 	if err := r.waitForNetObservOperator(ctx); err != nil {
 		if err == context.DeadlineExceeded {
 			klog.Errorf("Timed out waiting for Network Observability Operator to be ready after %v. Stopping reconciliation.", checkTimeout)
 			r.status.SetDegraded(statusmanager.ObservabilityConfig, "OperatorNotReady", fmt.Sprintf("Timed out waiting for Network Observability Operator to be ready after %v", checkTimeout))
-			return ctrl.Result{RequeueAfter: 0}, nil // Don't requeue
+			return ctrl.Result{RequeueAfter: 5 * time.Minute}, nil
 		}

183-185: ⚠️ Potential issue | 🟠 Major

Deployment-condition update errors are swallowed

At Line 183-Line 185, markNetworkObservabilityDeployed failure is only logged. That can cause repeated deploy-path execution on later reconciles instead of retrying the status write immediately.

Suggested fix
 	// Mark as deployed in Network CR status
 	if err := r.markNetworkObservabilityDeployed(ctx, &network); err != nil {
-		klog.Warningf("Failed to update Network Observability deployment status: %v", err)
+		r.status.SetDegraded(statusmanager.ObservabilityConfig, "MarkDeployedError", fmt.Sprintf("Failed to update Network Observability deployment status: %v", err))
+		return ctrl.Result{}, err
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/observability/observability_controller.go` around lines 183 -
185, The call to markNetworkObservabilityDeployed is currently logging failures
and swallowing the error, which prevents the reconcile loop from retrying status
writes; change the code in observability_controller.go so that if
r.markNetworkObservabilityDeployed(ctx, &network) returns an error you propagate
that error (e.g., return fmt.Errorf("markNetworkObservabilityDeployed: %w", err)
or wrap with kerrors) instead of only calling klog.Warningf so the reconciler
will requeue and retry the status update; locate the
markNetworkObservabilityDeployed call in the reconcile flow and replace the
swallowed-log branch with an error return (or requeue result) so failures are
retried immediately.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/controller/observability/observability_controller.go`:
- Around line 326-338: The readiness check in waitForNetObservOperator currently
returns after inspecting the first matching csv in csvs.Items which can be
non-deterministic; instead collect all CSVs whose name has the
"netobserv-operator" prefix, determine the intended CSV by selecting the one
with the highest semantic version parsed from the name (e.g., parse the version
suffix from csv.GetName() like "netobserv-operator.vX.Y.Z" and compare using
semver semantics), and then call unstructured.NestedString on that selected
CSV's Object to check status.phase; update the logic around csvs.Items,
csv.GetName(), and unstructured.NestedString to implement this deterministic
selection before returning phase == "Succeeded".

---

Duplicate comments:
In `@pkg/controller/observability/observability_controller.go`:
- Around line 183-185: The call to markNetworkObservabilityDeployed is currently
logging failures and swallowing the error, which prevents the reconcile loop
from retrying status writes; change the code in observability_controller.go so
that if r.markNetworkObservabilityDeployed(ctx, &network) returns an error you
propagate that error (e.g., return fmt.Errorf("markNetworkObservabilityDeployed:
%w", err) or wrap with kerrors) instead of only calling klog.Warningf so the
reconciler will requeue and retry the status update; locate the
markNetworkObservabilityDeployed call in the reconcile flow and replace the
swallowed-log branch with an error return (or requeue result) so failures are
retried immediately.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 63c37a51-4f01-487a-9005-0b35a6728d9f

📥 Commits

Reviewing files that changed from the base of the PR and between 9cc8e57 and 5bb3220.

📒 Files selected for processing (1)
  • pkg/controller/observability/observability_controller.go

Comment thread pkg/controller/observability/observability_controller.go Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
pkg/controller/observability/observability_controller.go (1)

133-146: Namespace creation handles not-found but not already-exists race.

The Get-then-Create pattern at lines 136-146 could encounter an AlreadyExists error if another process creates the namespace between the Get and Create calls. This would fail the reconciliation, though it will succeed on retry.

♻️ Optional: handle AlreadyExists gracefully
 		if errors.IsNotFound(err) {
 			if err := r.client.Create(ctx, ns); err != nil {
+				if errors.IsAlreadyExists(err) {
+					klog.V(4).Infof("Namespace %s was created by another process", OperatorNamespace)
+				} else {
 				r.status.SetDegraded(statusmanager.ObservabilityConfig, "CreateNamespaceError", fmt.Sprintf("Failed to create namespace %s: %v", OperatorNamespace, err))
 				return ctrl.Result{}, err
+				}
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/observability/observability_controller.go` around lines 133 -
146, The Get-then-Create namespace flow in Reconcile
(observability_controller.go) can race: after r.client.Get succeeds with
NotFound, r.client.Create may return AlreadyExists if another actor created the
namespace; update the Create error handling in the block that constructs ns
(using OperatorNamespace) to treat errors.IsAlreadyExists(err) as non-fatal (do
not call r.status.SetDegraded or return error) and only set degraded/return for
other errors; keep using r.status.SetDegraded(statusmanager.ObservabilityConfig,
...) for real create failures so reconciliation continues cleanly on races.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pkg/controller/observability/observability_controller.go`:
- Around line 133-146: The Get-then-Create namespace flow in Reconcile
(observability_controller.go) can race: after r.client.Get succeeds with
NotFound, r.client.Create may return AlreadyExists if another actor created the
namespace; update the Create error handling in the block that constructs ns
(using OperatorNamespace) to treat errors.IsAlreadyExists(err) as non-fatal (do
not call r.status.SetDegraded or return error) and only set degraded/return for
other errors; keep using r.status.SetDegraded(statusmanager.ObservabilityConfig,
...) for real create failures so reconciliation continues cleanly on races.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c15118e0-7f89-43e5-a75e-3ba878565507

📥 Commits

Reviewing files that changed from the base of the PR and between 5bb3220 and 2a76763.

📒 Files selected for processing (1)
  • pkg/controller/observability/observability_controller.go

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
pkg/controller/observability/observability_controller_test.go (1)

1268-1283: ⚠️ Potential issue | 🟡 Minor

Race condition in concurrent test remains unaddressed.

Calling g.Expect() inside goroutines can cause test panics because Gomega's fail handler invokes t.FailNow(), which must only be called from the test's main goroutine.

🔧 Suggested fix using error channel
 	// Run 5 concurrent reconciliations
 	done := make(chan bool, 5)
+	errChan := make(chan error, 5)
 	for i := 0; i < 5; i++ {
 		go func() {
 			_, err := r.Reconcile(context.TODO(), req)
-			// All should complete without error (idempotent)
-			g.Expect(err).NotTo(HaveOccurred())
+			errChan <- err
 			done <- true
 		}()
 	}

 	// Wait for all to complete
 	for i := 0; i < 5; i++ {
 		<-done
+		err := <-errChan
+		g.Expect(err).NotTo(HaveOccurred())
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/observability/observability_controller_test.go` around lines
1268 - 1283, The test invokes g.Expect inside goroutines which can call
t.FailNow from non-test goroutines; instead collect errors from r.Reconcile in a
buffered channel and perform Gomega assertions in the main test goroutine.
Replace the done channel with an errChan (buffered to 5), have each goroutine
call r.Reconcile(context.TODO(), req) and send the returned error into errChan,
then loop 5 times in the main goroutine to receive err := <-errChan and call
g.Expect(err).NotTo(HaveOccurred()). This keeps the Reconcile calls concurrent
but ensures assertions (using g.Expect) run only on the main test goroutine.
🧹 Nitpick comments (2)
pkg/controller/observability/observability_controller_test.go (2)

936-938: Misleading test name and comment.

The test is named TestReconcile_FlowCollectorDeleted with comment "tests that reconciliation recreates FlowCollector if it gets deleted," but the actual assertion verifies that reconciliation skips reinstallation when the deployed condition is set. The test logic is correct per the PR's design (avoid reinstallation once deployed), but the name/comment should reflect this behavior.

♻️ Suggested rename for clarity
-// TestReconcile_FlowCollectorDeleted tests that reconciliation recreates
-// FlowCollector if it gets deleted
-func TestReconcile_FlowCollectorDeleted(t *testing.T) {
+// TestReconcile_SkipsReinstallWhenFlowCollectorDeleted tests that reconciliation
+// does NOT recreate FlowCollector after deletion if the deployed condition is set
+func TestReconcile_SkipsReinstallWhenFlowCollectorDeleted(t *testing.T) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/observability/observability_controller_test.go` around lines
936 - 938, Rename the test function TestReconcile_FlowCollectorDeleted and
update its comment to reflect the actual behavior being asserted: that
reconciliation skips reinstallation when the FlowCollector has the "deployed"
condition set. Specifically, change the function name (e.g., to
TestReconcile_SkipsReinstallWhenFlowCollectorDeployed) and update the preceding
comment to state "tests that reconciliation skips reinstallation of
FlowCollector when the deployed condition is present"; also update any
references to the old test name in the file (including test registration or
helper calls) so they remain consistent.

1530-1536: Test creates files in working directory without full cleanup.

The test creates manifests/ directory and writes a file but only removes the file on cleanup, not the directory. While manifests/ likely already exists in the repository, consider using t.TempDir() for full isolation, especially if tests run in parallel.

♻️ Suggested fix using temp directory
-	err := os.MkdirAll("manifests", 0755)
-	g.Expect(err).NotTo(HaveOccurred())
-
-	// Create the FlowCollector manifest at the expected path
-	err = os.WriteFile(FlowCollectorYAML, []byte(flowCollectorManifest), 0644)
-	g.Expect(err).NotTo(HaveOccurred())
-	defer os.Remove(FlowCollectorYAML)
+	// Use temp directory and override the manifest path for this test
+	tmpDir := t.TempDir()
+	manifestPath := filepath.Join(tmpDir, "flowcollector.yaml")
+	err := os.WriteFile(manifestPath, []byte(flowCollectorManifest), 0644)
+	g.Expect(err).NotTo(HaveOccurred())
+	
+	// Note: This requires the controller to accept a configurable manifest path
+	// or use a test-specific approach to override FlowCollectorYAML

Alternatively, if modifying the controller isn't feasible, ensure the manifests directory cleanup:

+	defer os.RemoveAll("manifests")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/observability/observability_controller_test.go` around lines
1530 - 1536, The test creates a real manifests/ directory and only removes the
file, leaving the directory behind; update the test to use an isolated temp
directory so artifacts are fully cleaned up: create a temp dir via t.TempDir()
(or os.MkdirTemp if t is not available), write the FlowCollector manifest into
filepath.Join(tempDir, filepath.Base(FlowCollectorYAML)) using the existing
flowCollectorManifest, and set the test to use that path (or defer
os.RemoveAll(tempDir)) instead of writing to the repository-level "manifests"
directory; alternatively ensure the test defers os.RemoveAll("manifests") after
writing—reference FlowCollectorYAML and flowCollectorManifest to locate the
write site in observability_controller_test.go.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@pkg/controller/observability/observability_controller_test.go`:
- Around line 1268-1283: The test invokes g.Expect inside goroutines which can
call t.FailNow from non-test goroutines; instead collect errors from r.Reconcile
in a buffered channel and perform Gomega assertions in the main test goroutine.
Replace the done channel with an errChan (buffered to 5), have each goroutine
call r.Reconcile(context.TODO(), req) and send the returned error into errChan,
then loop 5 times in the main goroutine to receive err := <-errChan and call
g.Expect(err).NotTo(HaveOccurred()). This keeps the Reconcile calls concurrent
but ensures assertions (using g.Expect) run only on the main test goroutine.

---

Nitpick comments:
In `@pkg/controller/observability/observability_controller_test.go`:
- Around line 936-938: Rename the test function
TestReconcile_FlowCollectorDeleted and update its comment to reflect the actual
behavior being asserted: that reconciliation skips reinstallation when the
FlowCollector has the "deployed" condition set. Specifically, change the
function name (e.g., to TestReconcile_SkipsReinstallWhenFlowCollectorDeployed)
and update the preceding comment to state "tests that reconciliation skips
reinstallation of FlowCollector when the deployed condition is present"; also
update any references to the old test name in the file (including test
registration or helper calls) so they remain consistent.
- Around line 1530-1536: The test creates a real manifests/ directory and only
removes the file, leaving the directory behind; update the test to use an
isolated temp directory so artifacts are fully cleaned up: create a temp dir via
t.TempDir() (or os.MkdirTemp if t is not available), write the FlowCollector
manifest into filepath.Join(tempDir, filepath.Base(FlowCollectorYAML)) using the
existing flowCollectorManifest, and set the test to use that path (or defer
os.RemoveAll(tempDir)) instead of writing to the repository-level "manifests"
directory; alternatively ensure the test defers os.RemoveAll("manifests") after
writing—reference FlowCollectorYAML and flowCollectorManifest to locate the
write site in observability_controller_test.go.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 394e355a-3990-4ada-aaac-e4a5ee489446

📥 Commits

Reviewing files that changed from the base of the PR and between 2a76763 and 9548585.

📒 Files selected for processing (1)
  • pkg/controller/observability/observability_controller_test.go

Comment thread manifests/08-flowcollector.yaml Outdated
Comment thread manifests/08-flowcollector.yaml Outdated
Comment thread pkg/controller/observability/observability_controller.go Outdated
Comment thread pkg/controller/observability/observability_controller.go Outdated
// Check if Network Observability was previously deployed
// If so, we're done - no need to check or manage anything
if r.wasNetworkObservabilityDeployed(&network) {
klog.Info("Network Observability was previously deployed, skipping reconciliation.")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way to clear this flag so it will install Network Observability again? If not, we just need to make it clear in our documentation.

Comment thread pkg/controller/observability/observability_controller.go Outdated
err := r.client.Get(ctx, types.NamespacedName{
Name: "netobserv-operator",
Namespace: OperatorNamespace,
}, subscription)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PoC determined that Network Observability was installed if there is a Deployment with the "app=netobserv-operator" label. With the Service deployment model added, it uses a DaemonSet.

It seems checking for a Deployment or DaemonSet is more reliable than a Subscription, which is OLM-specific, because unofficially, you could install it with Helm. Should we not worry about that? Thoughts?

Group: "operators.coreos.com",
Version: "v1alpha1",
Kind: "ClusterServiceVersion",
})

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment in that CSV is specific to OLM. The alternative is to check for the Deployment or DaemonSet.

@stleerh stleerh requested review from dave-tucker and jotak March 24, 2026 18:42
@arkadeepsen

Copy link
Copy Markdown
Member

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Mar 25, 2026

Copy link
Copy Markdown
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
go.mod (1)

165-165: ⚠️ Potential issue | 🟠 Major

Keep the temporary openshift/api fork out of the merge path.

This still makes release builds resolve API types from a personal fork. Please keep a hard merge blocker on this line and remove it before merging.

Verify that the replace is still present and that the upstream API PR is not merged yet:

#!/bin/bash
set -euo pipefail

echo "replace directive in go.mod:"
rg -n '^replace github\.com/openshift/api ' go.mod

echo
echo "openshift/api#2752 state:"
curl -fsSL https://api.github.com/repos/openshift/api/pulls/2752 | jq '{state, merged_at, title, html_url}'

Expected result: if the replace line is present and merged_at is null, this remains a pre-merge blocker.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go.mod` at line 165, The go.mod contains a temporary replace directive
referencing a personal fork ("replace github.com/openshift/api
v0.0.0-20260302174620-dcac36b908db => github.com/OlivierCazade/api
v0.0.0-20260324161534-5d972e0f328a") which must not make it into mainline;
remove that replace before merging OR add a strict CI merge-blocker that fails
the build if that exact replace line exists (e.g., a simple grep check
equivalent to the reviewer’s snippet), and ensure the replace is only present in
a local/temporary branch and not in the merged branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/controller/observability/observability_controller.go`:
- Around line 144-148: When r.waitForNetObservOperator(ctx) returns
context.DeadlineExceeded, in addition to requeuing keep the current behavior but
call r.status.SetDegraded with the ObservabilityConfig key (use the existing
"OperatorNotReady" reason) and a clear message including checkTimeout so the CNO
status reflects the timeout; ensure you still return ctrl.Result{RequeueAfter:
requeueAfter}, nil. Reference the r.waitForNetObservOperator function,
r.status.SetDegraded method, the ObservabilityConfig constant/key, the
"OperatorNotReady" reason, and the checkTimeout/requeueAfter variables when
adding the status update.
- Around line 129-140: The current logic short-circuits once any Subscription
exists (isNetObservOperatorInstalled) which stops reapplying the desired OLM
manifest and can leave a drifted/partial install stuck; change reconcile to
always call installNetObservOperator until the operator is fully ready (e.g.,
check CSV/ClusterServiceVersion or subscription status for installed/ready
state) instead of relying on Subscription existence: remove or stop using
isNetObservOperatorInstalled, have installNetObservOperator be idempotent and
reapply the operator manifest each reconcile when readiness checks
(CSV/Subscription conditions) are not satisfied, and keep setting degraded via
r.status.SetDegraded(...) on failures so errors still surface.

---

Duplicate comments:
In `@go.mod`:
- Line 165: The go.mod contains a temporary replace directive referencing a
personal fork ("replace github.com/openshift/api
v0.0.0-20260302174620-dcac36b908db => github.com/OlivierCazade/api
v0.0.0-20260324161534-5d972e0f328a") which must not make it into mainline;
remove that replace before merging OR add a strict CI merge-blocker that fails
the build if that exact replace line exists (e.g., a simple grep check
equivalent to the reviewer’s snippet), and ensure the replace is only present in
a local/temporary branch and not in the merged branch.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d4dbd2bf-3cd4-41dd-a545-948c1fba572d

📥 Commits

Reviewing files that changed from the base of the PR and between 2a76763 and 844d745.

⛔ Files ignored due to path filters (7)
  • go.sum is excluded by !**/*.sum
  • vendor/github.com/openshift/api/config/v1/types_network.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.deepcopy.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.swagger_doc_generated.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/api/features.md is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/api/features/features.go is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (9)
  • go.mod
  • manifests/0000_70_cluster-network-operator_02_rbac_observability.yaml
  • manifests/07-observability-operator.yaml
  • manifests/08-flowcollector.yaml
  • pkg/controller/add_networkconfig.go
  • pkg/controller/observability/observability_controller.go
  • pkg/controller/observability/observability_controller_test.go
  • pkg/controller/statusmanager/status_manager.go
  • sample-config.yaml
✅ Files skipped from review due to trivial changes (4)
  • pkg/controller/add_networkconfig.go
  • manifests/07-observability-operator.yaml
  • manifests/08-flowcollector.yaml
  • manifests/0000_70_cluster-network-operator_02_rbac_observability.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • sample-config.yaml
  • pkg/controller/observability/observability_controller_test.go

Comment on lines +129 to +140
installed, err := r.isNetObservOperatorInstalled(ctx)
if err != nil {
r.status.SetDegraded(statusmanager.ObservabilityConfig, "CheckOperatorError", fmt.Sprintf("Failed to check if Network Observability Operator is installed: %v", err))
return ctrl.Result{}, err
}
if !installed {
// Install Network Observability Operator
if err := r.installNetObservOperator(ctx); err != nil {
r.status.SetDegraded(statusmanager.ObservabilityConfig, "InstallOperatorError", fmt.Sprintf("Failed to install Network Observability Operator: %v", err))
return ctrl.Result{}, err
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reapply the operator manifest until readiness instead of keying on Subscription existence.

As soon as any Subscription exists, this branch stops asserting the desired OLM spec. A drifted or half-created install can then sit in the wait/requeue path forever because installNetObservOperator is never called again.

Suggested fix
-	// First time installation - proceed with operator installation
-	installed, err := r.isNetObservOperatorInstalled(ctx)
-	if err != nil {
-		r.status.SetDegraded(statusmanager.ObservabilityConfig, "CheckOperatorError", fmt.Sprintf("Failed to check if Network Observability Operator is installed: %v", err))
-		return ctrl.Result{}, err
-	}
-	if !installed {
-		// Install Network Observability Operator
-		if err := r.installNetObservOperator(ctx); err != nil {
-			r.status.SetDegraded(statusmanager.ObservabilityConfig, "InstallOperatorError", fmt.Sprintf("Failed to install Network Observability Operator: %v", err))
-			return ctrl.Result{}, err
-		}
-	}
+	// Re-apply the desired OLM resources until the operator becomes ready.
+	if err := r.installNetObservOperator(ctx); err != nil {
+		r.status.SetDegraded(statusmanager.ObservabilityConfig, "InstallOperatorError", fmt.Sprintf("Failed to apply Network Observability Operator manifest: %v", err))
+		return ctrl.Result{}, err
+	}

That also makes isNetObservOperatorInstalled removable afterward.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
installed, err := r.isNetObservOperatorInstalled(ctx)
if err != nil {
r.status.SetDegraded(statusmanager.ObservabilityConfig, "CheckOperatorError", fmt.Sprintf("Failed to check if Network Observability Operator is installed: %v", err))
return ctrl.Result{}, err
}
if !installed {
// Install Network Observability Operator
if err := r.installNetObservOperator(ctx); err != nil {
r.status.SetDegraded(statusmanager.ObservabilityConfig, "InstallOperatorError", fmt.Sprintf("Failed to install Network Observability Operator: %v", err))
return ctrl.Result{}, err
}
}
// Re-apply the desired OLM resources until the operator becomes ready.
if err := r.installNetObservOperator(ctx); err != nil {
r.status.SetDegraded(statusmanager.ObservabilityConfig, "InstallOperatorError", fmt.Sprintf("Failed to apply Network Observability Operator manifest: %v", err))
return ctrl.Result{}, err
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/observability/observability_controller.go` around lines 129 -
140, The current logic short-circuits once any Subscription exists
(isNetObservOperatorInstalled) which stops reapplying the desired OLM manifest
and can leave a drifted/partial install stuck; change reconcile to always call
installNetObservOperator until the operator is fully ready (e.g., check
CSV/ClusterServiceVersion or subscription status for installed/ready state)
instead of relying on Subscription existence: remove or stop using
isNetObservOperatorInstalled, have installNetObservOperator be idempotent and
reapply the operator manifest each reconcile when readiness checks
(CSV/Subscription conditions) are not satisfied, and keep setting degraded via
r.status.SetDegraded(...) on failures so errors still surface.

Comment thread pkg/controller/observability/observability_controller.go Outdated
@arkadeepsen

Copy link
Copy Markdown
Member

@OlivierCazade PLMK when the PR is ready to be reviewed.

@OlivierCazade OlivierCazade force-pushed the day0 branch 2 times, most recently from 67701bb to 66bf264 Compare March 25, 2026 11:26
@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 25, 2026
@openshift-ci openshift-ci Bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 25, 2026
@OlivierCazade OlivierCazade force-pushed the day0 branch 5 times, most recently from 4b236f1 to 477a479 Compare March 26, 2026 12:08
@OlivierCazade

Copy link
Copy Markdown
Author

About the e2e tests, I did not find any e2e tests in this repository, how are you usually implementing them ?

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
pkg/controller/observability/observability_controller.go (1)

112-117: 💤 Low value

Consider logging if status condition update fails.

Discarding the error from setNetworkObservabilityCondition at lines 115, 125, 129, and 147 is acceptable since the primary error is already being handled with a requeue. However, adding a warning log when the condition update fails would improve observability without changing the control flow.

 		if err := r.installNetObservOperator(ctx); err != nil {
 			klog.Warningf("Failed to install Network Observability Operator: %v. Will retry in %v.", err, requeueAfterOLM)
 			// Mark deployment as failed
-			_ = r.setNetworkObservabilityCondition(ctx, operatorv1.ConditionFalse, "DeploymentFailed", fmt.Sprintf("Failed to install Network Observability Operator: %v", err))
+			if condErr := r.setNetworkObservabilityCondition(ctx, operatorv1.ConditionFalse, "DeploymentFailed", fmt.Sprintf("Failed to install Network Observability Operator: %v", err)); condErr != nil {
+				klog.V(4).Infof("Failed to update deployment status condition: %v", condErr)
+			}
 			return ctrl.Result{RequeueAfter: requeueAfterOLM}, nil
 		}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/controller/observability/observability_controller.go` around lines 112 -
117, The call to setNetworkObservabilityCondition in the error paths (used
alongside installNetObservOperator) currently discards its returned error;
update those sites to capture the error from setNetworkObservabilityCondition
and emit a klog.Warningf (or similar) noting that the condition update failed
and include the error, but do not change the existing control flow or requeue
behavior — just log the failure to improve observability.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@pkg/controller/observability/observability_controller.go`:
- Around line 112-117: The call to setNetworkObservabilityCondition in the error
paths (used alongside installNetObservOperator) currently discards its returned
error; update those sites to capture the error from
setNetworkObservabilityCondition and emit a klog.Warningf (or similar) noting
that the condition update failed and include the error, but do not change the
existing control flow or requeue behavior — just log the failure to improve
observability.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 17a029ad-6c04-4afc-8147-c652211b825f

📥 Commits

Reviewing files that changed from the base of the PR and between ba9e013 and c0f4f25.

⛔ Files ignored due to path filters (100)
  • go.sum is excluded by !**/*.sum
  • vendor/github.com/openshift/api/.golangci.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_apiserver.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_authentication.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_cluster_operator.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_dns.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_image.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_infrastructure.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_kmsencryption.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_network.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.deepcopy.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/config/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/config/v1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/config/v1alpha1/types_cluster_monitoring.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1alpha1/zz_generated.deepcopy.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/config/v1alpha1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/console/v1/types_console_plugin.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/console/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/envtest-releases.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/etcd/install.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/etcd/v1/Makefile is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/etcd/v1/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/etcd/v1/register.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/etcd/v1/types_pacemakercluster.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/etcd/v1/zz_generated.deepcopy.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/etcd/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/etcd/v1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/etcd/v1alpha1/types_pacemakercluster.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/etcd/v1alpha1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/features.md is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/features/features.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/features/legacyfeaturegates.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/install.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/types_machineset.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/machineconfiguration/v1/types.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machineconfiguration/v1/types_machineconfignode.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machineconfiguration/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/machineconfiguration/v1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/operator/v1/types_csi_cluster_driver.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/types_ingress.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_20_kube-apiserver_01_kubeapiservers-Default.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_20_kube-apiserver_01_kubeapiservers-DevPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_20_kube-apiserver_01_kubeapiservers-OKD.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_20_kube-apiserver_01_kubeapiservers-TechPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_20_kube-apiserver_01_kubeapiservers.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_csi-driver_01_clustercsidrivers-CustomNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_csi-driver_01_clustercsidrivers-Default.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_csi-driver_01_clustercsidrivers-DevPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_csi-driver_01_clustercsidrivers-OKD.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_csi-driver_01_clustercsidrivers-TechPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_ingress_00_ingresscontrollers-CustomNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_ingress_00_ingresscontrollers-Default.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_ingress_00_ingresscontrollers-DevPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_ingress_00_ingresscontrollers-OKD.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_ingress_00_ingresscontrollers-TechPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/operator/v1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/operator/v1alpha1/types_clusterapi.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1alpha1/zz_generated.deepcopy.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/operator/v1alpha1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/quota/v1/generated.proto is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/quota/v1/types.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/quota/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/route/v1/generated.proto is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/route/v1/types.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/route/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/security/v1/generated.proto is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/security/v1/types.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/security/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/security/v1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/modules.txt is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/.gitignore is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/.golangci.yml is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/.gomodcheck.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/CONTRIBUTING.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/FAQ.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/Makefile is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/OWNERS is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/OWNERS_ALIASES is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/README.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/RELEASE.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/SECURITY_CONTACTS is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/TMP-LOGGING.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/VERSIONING.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/alias.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/code-of-conduct.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/controller.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/options.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/webhook.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/client/config/config.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/client/config/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal_posix.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal_windows.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/scheme/scheme.go is excluded by !**/vendor/**, !vendor/**
📒 Files selected for processing (7)
  • bindata/observability/07-observability-operator.yaml
  • bindata/observability/08-flowcollector.yaml
  • go.mod
  • pkg/controller/add_networkconfig.go
  • pkg/controller/observability/observability_controller.go
  • pkg/controller/observability/observability_controller_test.go
  • sample-config.yaml
🚧 Files skipped from review as they are similar to previous changes (4)
  • sample-config.yaml
  • bindata/observability/08-flowcollector.yaml
  • pkg/controller/add_networkconfig.go
  • go.mod

@arkadeepsen

Copy link
Copy Markdown
Member

About the e2e tests, I did not find any e2e tests in this repository, how are you usually implementing them ?

The tests are added to https://github.com/openshift/origin repo. Currently, this is a TP feature. You can refer to https://docs.google.com/document/d/1zOL38_KDKwAvsx-LMHfyDdX4-jq3HQU1YWBjfuHaM0Q/edit?usp=drivesdk for details regarding feature gates. I think it'll also have details about test coverage. It also has details about lifting the feature gates and making features available by default.

@OlivierCazade

Copy link
Copy Markdown
Author

@arkadeepsen, outside of the e2e, does this PR is fine for you now ?

@arkadeepsen

Copy link
Copy Markdown
Member

@arkadeepsen, outside of the e2e, does this PR is fine for you now ?

I think this comment is still not resolved: #2925 (comment)

The description of InstallAndEnable in the API states the following:

	// NetworkObservabilityInstallAndEnable means that network observability should be installed and enabled during cluster deployment
	// Since this was explicitly set to install, if the user remove NetworkObservability, it will be installed again unless the value of InstallationPolicy is changed
	NetworkObservabilityInstallAndEnable NetworkObservabilityInstallationPolicy = "InstallAndEnable"

I am not sure all the scenarios are handled for this mode. The above comment is related to how CNO will detect that the user has removed the resources managed by it.

@OlivierCazade

Copy link
Copy Markdown
Author

Sorry, answering in the comment thread.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/controller/observability/observability_controller_test.go`:
- Around line 426-447: The ReconcileObservability struct in these tests is
created with featureGate set to nil, causing Reconcile() to exit early at the
feature-gate check before reaching the behavior the tests are designed to
validate. To fix this, initialize the featureGate field in the
ReconcileObservability struct with a properly configured feature gate that
enables the observability feature (rather than leaving it nil). This pattern
should be applied to the ReconcileObservability initialization across all the
affected test functions: TestReconcile_SkipsWhenDisabled at lines 426-447, the
tests at lines 492-532, 970-1001, 1045-1084, 1124-1235, and 1556-1585 so that
Reconcile() progresses past the feature-gate guard and actually exercises the
behavior being asserted.

In `@pkg/controller/observability/observability_controller.go`:
- Around line 105-109: When isNetObservOperatorInstalled returns an error, the
controller currently only logs a warning and requeues without updating the
Network CR status. To surface this error to users, set the
NetworkObservabilityDeployed condition to False with the error message included
whenever isNetObservOperatorInstalled fails. This will make persistent
installation issues (like ClusterExtension failures or inability to identify the
installation source) visible in the Network CR status, allowing users to
understand why deployment is stalled rather than only seeing a requeue loop.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1e6771cd-b873-4c39-89bf-5237154cf733

📥 Commits

Reviewing files that changed from the base of the PR and between c0f4f25 and bc5e876.

⛔ Files ignored due to path filters (100)
  • go.sum is excluded by !**/*.sum
  • vendor/github.com/openshift/api/.golangci.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_apiserver.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_authentication.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_cluster_operator.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_dns.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_image.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_infrastructure.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_kmsencryption.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/types_network.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1/zz_generated.deepcopy.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/config/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/config/v1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/config/v1alpha1/types_cluster_monitoring.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/config/v1alpha1/zz_generated.deepcopy.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/config/v1alpha1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/console/v1/types_console_plugin.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/console/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/envtest-releases.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/etcd/install.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/etcd/v1/Makefile is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/etcd/v1/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/etcd/v1/register.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/etcd/v1/types_pacemakercluster.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/etcd/v1/zz_generated.deepcopy.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/etcd/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/etcd/v1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/etcd/v1alpha1/types_pacemakercluster.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/etcd/v1alpha1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/features.md is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/features/features.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/features/legacyfeaturegates.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/install.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/types_machineset.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machine/v1beta1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/machineconfiguration/v1/types.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machineconfiguration/v1/types_machineconfignode.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/machineconfiguration/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/machineconfiguration/v1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/operator/v1/types_csi_cluster_driver.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/types_ingress.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_20_kube-apiserver_01_kubeapiservers-Default.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_20_kube-apiserver_01_kubeapiservers-DevPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_20_kube-apiserver_01_kubeapiservers-OKD.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_20_kube-apiserver_01_kubeapiservers-TechPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_20_kube-apiserver_01_kubeapiservers.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_csi-driver_01_clustercsidrivers-CustomNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_csi-driver_01_clustercsidrivers-Default.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_csi-driver_01_clustercsidrivers-DevPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_csi-driver_01_clustercsidrivers-OKD.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_csi-driver_01_clustercsidrivers-TechPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_ingress_00_ingresscontrollers-CustomNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_ingress_00_ingresscontrollers-Default.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_ingress_00_ingresscontrollers-DevPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_ingress_00_ingresscontrollers-OKD.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.crd-manifests/0000_50_ingress_00_ingresscontrollers-TechPreviewNoUpgrade.crd.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/operator/v1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/operator/v1alpha1/types_clusterapi.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/operator/v1alpha1/zz_generated.deepcopy.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/operator/v1alpha1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/quota/v1/generated.proto is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/quota/v1/types.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/quota/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/route/v1/generated.proto is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/route/v1/types.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/route/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/security/v1/generated.proto is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/security/v1/types.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/openshift/api/security/v1/zz_generated.featuregated-crd-manifests.yaml is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/github.com/openshift/api/security/v1/zz_generated.swagger_doc_generated.go is excluded by !**/vendor/**, !vendor/**, !**/zz_generated*
  • vendor/modules.txt is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/.gitignore is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/.golangci.yml is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/.gomodcheck.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/CONTRIBUTING.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/FAQ.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/Makefile is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/OWNERS is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/OWNERS_ALIASES is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/README.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/RELEASE.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/SECURITY_CONTACTS is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/TMP-LOGGING.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/VERSIONING.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/alias.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/code-of-conduct.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/controller.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/options.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/webhook.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/client/config/config.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/client/config/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal_posix.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal_windows.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/scheme/scheme.go is excluded by !**/vendor/**, !vendor/**
📒 Files selected for processing (7)
  • bindata/observability/07-observability-operator.yaml
  • bindata/observability/08-flowcollector.yaml
  • go.mod
  • pkg/controller/add_networkconfig.go
  • pkg/controller/observability/observability_controller.go
  • pkg/controller/observability/observability_controller_test.go
  • sample-config.yaml
✅ Files skipped from review due to trivial changes (1)
  • sample-config.yaml
🚧 Files skipped from review as they are similar to previous changes (3)
  • go.mod
  • pkg/controller/add_networkconfig.go
  • bindata/observability/08-flowcollector.yaml

Comment thread pkg/controller/observability/observability_controller_test.go
Comment thread pkg/controller/observability/observability_controller.go
@stleerh

stleerh commented Jun 23, 2026

Copy link
Copy Markdown

/lgtm

@arkadeepsen Are there any outstanding issues?

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jun 23, 2026
@openshift-ci

openshift-ci Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: OlivierCazade, stleerh
Once this PR has been reviewed and has the lgtm label, please assign kyrtapz for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 24, 2026
@openshift-ci

openshift-ci Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

New changes are detected. LGTM label has been removed.

@OlivierCazade

Copy link
Copy Markdown
Author

Rebased the branch due to conflicts (no need to update openshift/api anymore it was updated on the main branch)

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
bindata/observability/07-observability-operator.yaml (1)

38-44: 🔒 Security & Privacy | 🔵 Trivial

Drop the webhook RBAC after install create/list/watch on validatingwebhookconfigurations is cluster-wide; if this SA is only needed for OLM v1 pre-authorization, remove or narrow the binding once installation finishes.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bindata/observability/07-observability-operator.yaml` around lines 38 - 44,
The RBAC on validatingwebhookconfigurations is broader than needed after
installation, so narrow or remove the create/list/watch permissions once the OLM
v1 pre-authorization flow is done. Update the relevant binding in
observability/07-observability-operator.yaml so the service account used for
install-time webhook setup no longer retains cluster-wide webhook management
access, while keeping only the specific permissions needed by the install path.

Sources: Path instructions, Linters/SAST tools

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@bindata/observability/07-observability-operator.yaml`:
- Around line 1-4: The openshift-netobserv-operator Namespace is missing the
usual network isolation resource. Add a NetworkPolicy for this namespace,
following the pattern used by the other operator namespace manifests in this
repo, so the operator namespace is explicitly isolated for ingress and egress;
place it alongside the existing Namespace manifest and ensure it targets
openshift-netobserv-operator.

---

Nitpick comments:
In `@bindata/observability/07-observability-operator.yaml`:
- Around line 38-44: The RBAC on validatingwebhookconfigurations is broader than
needed after installation, so narrow or remove the create/list/watch permissions
once the OLM v1 pre-authorization flow is done. Update the relevant binding in
observability/07-observability-operator.yaml so the service account used for
install-time webhook setup no longer retains cluster-wide webhook management
access, while keeping only the specific permissions needed by the install path.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1e3de99a-274e-4c6c-8024-88ca299bd4a0

📥 Commits

Reviewing files that changed from the base of the PR and between bc5e876 and d9d4cad.

⛔ Files ignored due to path filters (28)
  • vendor/modules.txt is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/.gitignore is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/.golangci.yml is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/.gomodcheck.yaml is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/CONTRIBUTING.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/FAQ.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/Makefile is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/OWNERS is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/OWNERS_ALIASES is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/README.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/RELEASE.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/SECURITY_CONTACTS is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/TMP-LOGGING.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/VERSIONING.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/alias.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/code-of-conduct.md is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/controller.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/options.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/webhook.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/client/config/config.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/client/config/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/doc.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal_posix.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal_windows.go is excluded by !**/vendor/**, !vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/scheme/scheme.go is excluded by !**/vendor/**, !vendor/**
📒 Files selected for processing (6)
  • bindata/observability/07-observability-operator.yaml
  • bindata/observability/08-flowcollector.yaml
  • pkg/controller/add_networkconfig.go
  • pkg/controller/observability/observability_controller.go
  • pkg/controller/observability/observability_controller_test.go
  • sample-config.yaml
✅ Files skipped from review due to trivial changes (2)
  • bindata/observability/08-flowcollector.yaml
  • pkg/controller/add_networkconfig.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • sample-config.yaml
  • pkg/controller/observability/observability_controller_test.go
  • pkg/controller/observability/observability_controller.go

Comment on lines +1 to +4
apiVersion: v1
kind: Namespace
metadata:
name: openshift-netobserv-operator

@coderabbitai coderabbitai Bot Jun 24, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether a NetworkPolicy exists for the new namespace.
# Expected: at least one NetworkPolicy targeting openshift-netobserv-operator, or an intentional documented exception.
rg -n -C3 'kind:\s*NetworkPolicy|namespace:\s*openshift-netobserv-operator' --glob '*.yaml' --glob '*.yml'

Repository: openshift/cluster-network-operator

Length of output: 172


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== files mentioning openshift-netobserv-operator =="
rg -n --hidden --glob '*.yaml' --glob '*.yml' --glob '!**/.git/**' 'openshift-netobserv-operator' .

echo
echo "== NetworkPolicy manifests =="
rg -n --hidden --glob '*.yaml' --glob '*.yml' --glob '!**/.git/**' 'kind:\s*NetworkPolicy' .

echo
echo "== target file excerpt =="
sed -n '1,80p' bindata/observability/07-observability-operator.yaml

Repository: openshift/cluster-network-operator

Length of output: 5247


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== files mentioning openshift-netobserv-operator =="
rg -n --hidden --glob '*.yaml' --glob '*.yml' --glob '!**/.git/**' 'openshift-netobserv-operator' .

echo
echo "== NetworkPolicy manifests =="
rg -n --hidden --glob '*.yaml' --glob '*.yml' --glob '!**/.git/**' 'kind:\s*NetworkPolicy' .

echo
echo "== target file excerpt =="
sed -n '1,80p' bindata/observability/07-observability-operator.yaml

Repository: openshift/cluster-network-operator

Length of output: 5247


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== manifests/0000_70_cluster-network-operator_00_namespace.yaml =="
sed -n '1,120p' manifests/0000_70_cluster-network-operator_00_namespace.yaml

echo
echo "== NetworkPolicy objects in bindata/observability/07-observability-operator.yaml =="
rg -n -C4 'kind:\s*NetworkPolicy|apiVersion:\s*networking.k8s.io/v1|metadata:|namespace:' bindata/observability/07-observability-operator.yaml

echo
echo "== Namespace-scoped policy refs across repo =="
rg -n --hidden --glob '*.yaml' --glob '*.yml' 'namespace:\s*openshift-netobserv-operator|name:\s*openshift-netobserv-operator' .

Repository: openshift/cluster-network-operator

Length of output: 3995


Define a NetworkPolicy for openshift-netobserv-operator. The namespace is created without any ingress/egress isolation, unlike other operator namespaces in this repo.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bindata/observability/07-observability-operator.yaml` around lines 1 - 4, The
openshift-netobserv-operator Namespace is missing the usual network isolation
resource. Add a NetworkPolicy for this namespace, following the pattern used by
the other operator namespace manifests in this repo, so the operator namespace
is explicitly isolated for ingress and egress; place it alongside the existing
Namespace manifest and ensure it targets openshift-netobserv-operator.

Source: Path instructions

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually correct suggestion and we need netpols added for the netobserv namespace.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arkadeepsen Agreed — this one still stands. Adding NetworkPolicy resources for openshift-netobserv-operator alongside bindata/observability/07-observability-operator.yaml would be the right fix to keep the new operator namespace explicitly isolated for ingress and egress.


✏️ Learnings added
Learnt from: arkadeepsen
Repo: openshift/cluster-network-operator PR: 2925
File: bindata/observability/07-observability-operator.yaml:1-4
Timestamp: 2026-06-26T14:57:46.466Z
Learning: In the OpenShift Cluster Network Operator repository, when a new operator namespace is introduced in Kubernetes manifests (such as `bindata/observability/07-observability-operator.yaml` for `openshift-netobserv-operator`), the namespace is expected to include NetworkPolicy resources to explicitly isolate ingress and egress traffic, following the existing operator namespace pattern in the repo.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@OlivierCazade OlivierCazade Jun 26, 2026

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are working on adding the network policy directly inside the bundle. We do not provide for now a netpol for the operator.

The network policy for the Network Observability stack are deployed by the operator.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

@arkadeepsen arkadeepsen left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the status of the e2e tests? It'd be great to check the various scenarios of InstallAndEnable option. I don't think there are UTs for the different scenarios.

Comment thread sample-config.yaml Outdated
Comment on lines +13 to +19
networkObservability:
# installationPolicy controls Network Observability installation during cluster deployment (day-0).
# Valid values: "", "InstallAndEnable", "NoAction"
# Default (empty or omitted): enabled on multi-node clusters, disabled on SNO
# "InstallAndEnable": explicitly enable (even on SNO)
# "NoAction": explicitly disable
installationPolicy: InstallAndEnable

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section should be added in sample-cluster-network-config.yaml rather than in this file as the field was added to config.openshift.io/v1 rather than operator.openshift.io/v1.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Comment thread pkg/controller/observability/observability_controller.go Outdated
Comment on lines +330 to +332
err := r.client.Get(ctx, types.NamespacedName{
Name: "flowcollectors.flows.netobserv.io",
}, crd)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked with chai-bot regarding the various scenarios: https://redhat-internal.slack.com/archives/C3VS0LV41/p1782466535969549

Following is my understanding:

  • If operator is installed via OLMv0, then the resources installed during installation of the operator is not managed, i.e. the CRD if deleted will not be re-installed. Thus, for the case of OLMv0, the CRD may not exist, but the operator can still be running.
  • If operator is installed via OLMv1, then the resources are managed and if deleted will be re-installed. Thus, if the ClusterExtension object is installed, CRD will always be installed.
  • The user can delete the ClusterExtension object in the case of OLMv1 using --cascade=orphan which will not delete the CRD. For the case of OLMv0, when CSV and Subscription are deleted, the operator is removed, but the CRD is not removed. Currently if CRD exists and OLM resources do not exist, operator is not installed.

The current flow will try to install the operator if it detects that the CRD is not installed, but the operator may be installed using OLMv0 and CNO will install another version of NOO via OLMv1. A check for this scenario should be added.

I think we also need to document the pitfalls of the different install methods (OLMv0 and OLMv1) as this does deviate from what the API documents:

	// NetworkObservabilityInstallAndEnable means that network observability should be installed and enabled during cluster deployment
	// Since this was explicitly set to install, if the user remove NetworkObservability, it will be installed again unless the value of InstallationPolicy is changed
	NetworkObservabilityInstallAndEnable NetworkObservabilityInstallationPolicy = "InstallAndEnable"


installed, err := r.isNetObservOperatorInstalled(context.TODO())

// Should return no error but false (not installed yet)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Comment is incorrect.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks.

Comment on lines +1 to +4
apiVersion: v1
kind: Namespace
metadata:
name: openshift-netobserv-operator

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually correct suggestion and we need netpols added for the netobserv namespace.

// Use RawPatch with ApplyPatchType to avoid deprecated crclient.Apply
patch := crclient.RawPatch(types.ApplyPatchType, data)
if err := r.client.Patch(ctx, obj, patch, &crclient.PatchOptions{
Force: ptr.To(true),

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since Force is set to true, it'll overwrite any changes done by the user. Who owns the objects? If CNO does, then this is fine, however if the user owns the objects then this will not respect user changes.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The object is owned by the user but created by the CNO.

I removed the force field to prevent overriding user object.

@OlivierCazade OlivierCazade force-pushed the day0 branch 2 times, most recently from c793431 to f2e4ea2 Compare June 26, 2026 20:19
This commit introduces automated installation and management of the
Network Observability Operator during cluster deployment (day-0).

Key features:
- Automatic operator installation via OLM when NetworkObservabilityInstall
  feature gate is enabled
- Opt-out model: installs by default except on SNO clusters

Implementation details:
- New observability controller in pkg/controller/observability/
- Manifest-based operator installation
- Default FlowCollector configuration
- RBAC permissions for OLM resource management
- namespace creation for operator and observability components
@openshift-ci

openshift-ci Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

@OlivierCazade: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/4.22-upgrade-from-stable-4.21-e2e-aws-ovn-upgrade 9ac0fcc link false /test 4.22-upgrade-from-stable-4.21-e2e-aws-ovn-upgrade
ci/prow/e2e-aws-ovn-upgrade 5af2fe8 link true /test e2e-aws-ovn-upgrade
ci/prow/e2e-aws-ovn-fdp-qe 5af2fe8 link true /test e2e-aws-ovn-fdp-qe
ci/prow/e2e-ovn-ipsec-step-registry 5af2fe8 link true /test e2e-ovn-ipsec-step-registry
ci/prow/hypershift-e2e-aks 5af2fe8 link true /test hypershift-e2e-aks
ci/prow/e2e-aws-ovn-serial-1of2 5af2fe8 link true /test e2e-aws-ovn-serial-1of2
ci/prow/e2e-metal-ipi-ovn-ipv6 5af2fe8 link true /test e2e-metal-ipi-ovn-ipv6
ci/prow/e2e-aws-ovn-upgrade-ipsec 5af2fe8 link true /test e2e-aws-ovn-upgrade-ipsec
ci/prow/e2e-metal-ipi-ovn-dualstack-bgp-local-gw 5af2fe8 link true /test e2e-metal-ipi-ovn-dualstack-bgp-local-gw
ci/prow/e2e-gcp-ovn 5af2fe8 link true /test e2e-gcp-ovn
ci/prow/e2e-metal-ipi-ovn-ipv6-ipsec 5af2fe8 link true /test e2e-metal-ipi-ovn-ipv6-ipsec
ci/prow/e2e-azure-ovn-upgrade 5af2fe8 link true /test e2e-azure-ovn-upgrade
ci/prow/5.0-upgrade-from-stable-4.22-e2e-aws-ovn-upgrade 5af2fe8 link false /test 5.0-upgrade-from-stable-4.22-e2e-aws-ovn-upgrade

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@arkadeepsen arkadeepsen left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the status of the e2e tests? It'd be great to check the various scenarios of InstallAndEnable option. I don't think there are UTs for the different scenarios.

We need the e2e tests to check everything is working fine or not. Otherwise, implementation wise I am mostly good.

Comment thread vendor/modules.txt
Comment on lines 1681 to +1720
@@ -1708,11 +1711,13 @@ sigs.k8s.io/controller-runtime/pkg/internal/syncs
sigs.k8s.io/controller-runtime/pkg/leaderelection
sigs.k8s.io/controller-runtime/pkg/log
sigs.k8s.io/controller-runtime/pkg/manager
sigs.k8s.io/controller-runtime/pkg/manager/signals
sigs.k8s.io/controller-runtime/pkg/metrics
sigs.k8s.io/controller-runtime/pkg/metrics/server
sigs.k8s.io/controller-runtime/pkg/predicate
sigs.k8s.io/controller-runtime/pkg/reconcile
sigs.k8s.io/controller-runtime/pkg/recorder
sigs.k8s.io/controller-runtime/pkg/scheme

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there's no change in go.mod file, the vendor directory should also not be updated.

@arkadeepsen

Copy link
Copy Markdown
Member

@coderabbitai resume
@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 29, 2026

Copy link
Copy Markdown

@arkadeepsen Sure — I’ll resume automatic reviews and trigger a review for #2925.

✅ Action performed

Reviews resumed.


Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@bindata/observability/07-observability-operator.yaml`:
- Around line 135-139: The RBAC rule for metrics access is using the wrong verb
set for reading pod metrics. Update the rule in the observability operator
manifest so the `metrics.k8s.io` pods permission grants read-only access via
`get`, `list`, and `watch` instead of `create`. Keep the change scoped to the
existing RBAC entry that defines the pod metrics permissions so the operator can
query metrics correctly.
- Around line 6-10: The installer ServiceAccount should not automatically mount
a token by default. Update the ServiceAccount definition in the observability
operator manifest to set automountServiceAccountToken to false for this
installer identity, and only leave token mounting enabled if OLM v1 explicitly
requires it. Use the ServiceAccount resource in the YAML as the target for this
change.

In `@sample-cluster-network-config.yaml`:
- Around line 13-15: Update the documentation for installationPolicy so it only
lists the valid enum values accepted by the API, namely InstallAndEnable and
NoAction. Remove the empty-string entry and make it clear that the default
behavior comes from omitting the field or block entirely, not from setting
installationPolicy to an empty string. Refer to the installationPolicy section
in the sample cluster network config comment so it can be corrected even if the
surrounding text moves.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 0b2e3b22-0cab-4ab1-aee1-4e9205d7f9b9

📥 Commits

Reviewing files that changed from the base of the PR and between bc5e876 and 5af2fe8.

⛔ Files ignored due to path filters (28)
  • vendor/modules.txt is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/.gitignore is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/.golangci.yml is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/.gomodcheck.yaml is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/CONTRIBUTING.md is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/FAQ.md is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/Makefile is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/OWNERS is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/OWNERS_ALIASES is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/README.md is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/RELEASE.md is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/SECURITY_CONTACTS is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/TMP-LOGGING.md is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/VERSIONING.md is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/alias.go is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/code-of-conduct.md is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/doc.go is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/controller.go is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/doc.go is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/options.go is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/builder/webhook.go is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/client/config/config.go is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/client/config/doc.go is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/doc.go is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal.go is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal_posix.go is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/manager/signals/signal_windows.go is excluded by !vendor/**, !**/vendor/**
  • vendor/sigs.k8s.io/controller-runtime/pkg/scheme/scheme.go is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (6)
  • bindata/observability/07-observability-operator.yaml
  • bindata/observability/08-flowcollector.yaml
  • pkg/controller/add_networkconfig.go
  • pkg/controller/observability/observability_controller.go
  • pkg/controller/observability/observability_controller_test.go
  • sample-cluster-network-config.yaml
✅ Files skipped from review due to trivial changes (1)
  • pkg/controller/add_networkconfig.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • bindata/observability/08-flowcollector.yaml
  • pkg/controller/observability/observability_controller_test.go
  • pkg/controller/observability/observability_controller.go

Comment on lines +6 to +10
apiVersion: v1
kind: ServiceAccount
metadata:
name: netobserv-operator-installer
namespace: openshift-netobserv-operator

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Disable ambient token mounting for the installer ServiceAccount.

This ServiceAccount carries broad installer permissions; set automountServiceAccountToken: false unless OLM v1 specifically requires token-mounted pods for this SA.

Suggested change
 apiVersion: v1
 kind: ServiceAccount
 metadata:
   name: netobserv-operator-installer
   namespace: openshift-netobserv-operator
+automountServiceAccountToken: false

As per path instructions, “automountServiceAccountToken: false unless needed.”

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
apiVersion: v1
kind: ServiceAccount
metadata:
name: netobserv-operator-installer
namespace: openshift-netobserv-operator
apiVersion: v1
kind: ServiceAccount
metadata:
name: netobserv-operator-installer
namespace: openshift-netobserv-operator
automountServiceAccountToken: false
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bindata/observability/07-observability-operator.yaml` around lines 6 - 10,
The installer ServiceAccount should not automatically mount a token by default.
Update the ServiceAccount definition in the observability operator manifest to
set automountServiceAccountToken to false for this installer identity, and only
leave token mounting enabled if OLM v1 explicitly requires it. Use the
ServiceAccount resource in the YAML as the target for this change.

Source: Path instructions

Comment on lines +135 to +139
# Metrics
- apiGroups: ["metrics.k8s.io"]
resources: ["pods"]
verbs: ["create"]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== metrics.k8s.io RBAC references =="
rg -n -C4 'apiGroups:\s*\["metrics\.k8s\.io"\]|resources:\s*\["pods"\]|verbs:\s*\["create"\]' --glob '*.yaml' --glob '*.yml' .

Repository: openshift/cluster-network-operator

Length of output: 7825


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== surrounding RBAC =="
sed -n '120,145p' bindata/observability/07-observability-operator.yaml

echo
echo "== metrics API usage in repo =="
rg -n 'metrics\.k8s\.io|PodMetrics|pod metrics|metrics client|metrics\.Pods|GetMetrics|ListMetrics' .

Repository: openshift/cluster-network-operator

Length of output: 28672


Use read verbs for metrics.k8s.io pods bindata/observability/07-observability-operator.yaml:135-138

create does not grant pod metrics reads. If this rule is meant to let the operator query metrics, it should use get, list, and watch.

Suggested change
   # Metrics
   - apiGroups: ["metrics.k8s.io"]
     resources: ["pods"]
-    verbs: ["create"]
+    verbs: ["get", "list", "watch"]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Metrics
- apiGroups: ["metrics.k8s.io"]
resources: ["pods"]
verbs: ["create"]
# Metrics
- apiGroups: ["metrics.k8s.io"]
resources: ["pods"]
verbs: ["get", "list", "watch"]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bindata/observability/07-observability-operator.yaml` around lines 135 - 139,
The RBAC rule for metrics access is using the wrong verb set for reading pod
metrics. Update the rule in the observability operator manifest so the
`metrics.k8s.io` pods permission grants read-only access via `get`, `list`, and
`watch` instead of `create`. Keep the change scoped to the existing RBAC entry
that defines the pod metrics permissions so the operator can query metrics
correctly.

Comment on lines +13 to +15
# installationPolicy controls Network Observability installation during cluster deployment (day-0).
# Valid values: "", "InstallAndEnable", "NoAction"
# Default (empty or omitted): enabled on multi-node clusters, disabled on SNO

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Fix the documented installationPolicy values.

Line 14 says "" is a valid value, but the API enum only allows InstallAndEnable and NoAction. Copying that into a manifest would fail validation; the “default” behavior only applies when the field or block is omitted, not when installationPolicy: "" is set explicitly.

Suggested comment update
-    # Valid values: "", "InstallAndEnable", "NoAction"
-    # Default (empty or omitted): enabled on multi-node clusters, disabled on SNO
+    # Valid values: "InstallAndEnable", "NoAction"
+    # Default when this block is omitted: enabled on multi-node clusters, disabled on SNO
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# installationPolicy controls Network Observability installation during cluster deployment (day-0).
# Valid values: "", "InstallAndEnable", "NoAction"
# Default (empty or omitted): enabled on multi-node clusters, disabled on SNO
# Valid values: "InstallAndEnable", "NoAction"
# Default when this block is omitted: enabled on multi-node clusters, disabled on SNO
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sample-cluster-network-config.yaml` around lines 13 - 15, Update the
documentation for installationPolicy so it only lists the valid enum values
accepted by the API, namely InstallAndEnable and NoAction. Remove the
empty-string entry and make it clear that the default behavior comes from
omitting the field or block entirely, not from setting installationPolicy to an
empty string. Refer to the installationPolicy section in the sample cluster
network config comment so it can be corrected even if the surrounding text
moves.

@arkadeepsen arkadeepsen left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit message says that RBAC is added. However, the required RBAC was already there and there was no separate RBAC added. Let's update the commit message to correctly indicate the changes.

Comment on lines +6 to +14
agent:
ebpf:
features:
- DNSTracking
sampling: 400
type: eBPF
deploymentModel: Service
loki:
enable: false

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the EP comments it was mentioned that an empty FlowCollector will be created: openshift/enhancements#1908 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants