Hello Team,
Environment
Platform: OpenShift 4.18.33
Deployment Method: Redis Enterprise Operator (Red Hat certified)
REC Version: 8.0.16-25.0
SCC in use: restricted-v2 (default), tested with anyuid
Issue Description
We are deploying Redis Enterprise Cluster (REC) using the Operator on OpenShift. During initialization:
The pod starts successfully but quickly enters a crash loop, and pdns_server repeatedly fails to start.
Supervisor reports:
> ERROR (no such process)
Observed errors:
> pdns_server exited with status 126 Operation not permitted
This prevents successful cluster bootstrap.
Observed Behavior
From container logs:
WARN exited: pdns_server (exit status 126; not expected)
INFO gave up: pdns_server entered FATAL state
ERROR (no such process)
Troubleshooting Performed
We performed a detailed investigation across multiple layers:
1. SCC / Security Context Behavior
The default restricted-v2 SCC does not allow the pod to run with a fixed UID (1001). The Operator-generated pods include:
> runAsUser: 1001
This conflicts with the OpenShift security model. Even after granting the anyuid SCC to the service account:
> oc adm policy add-scc-to-user anyuid -z <serviceaccount>
the pod is still admitted under the nonroot-v2 SCC unless a specific SCC is explicitly forced.
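To make the SCC observation easier to reproduce, here is a minimal Python sketch that reads the admitted SCC and the effective runAsUser from a pod manifest, for example the output of `oc get pod <pod-name> -o json`. It relies on the `openshift.io/scc` annotation that OpenShift sets on admitted pods. The sample manifest, the pod name, and the container name below are hypothetical and not taken from our cluster.

```python
from typing import Dict, Optional


def admitted_scc(pod: dict) -> Optional[str]:
    """Return the SCC recorded in the openshift.io/scc annotation, if any."""
    annotations = pod.get("metadata", {}).get("annotations") or {}
    return annotations.get("openshift.io/scc")


def run_as_users(pod: dict) -> Dict[str, Optional[int]]:
    """Map each container name to its runAsUser.

    A container-level securityContext overrides the pod-level one.
    """
    spec = pod.get("spec", {})
    pod_uid = (spec.get("securityContext") or {}).get("runAsUser")
    result = {}
    for container in spec.get("containers", []):
        ctx = container.get("securityContext") or {}
        result[container["name"]] = ctx.get("runAsUser", pod_uid)
    return result


# Hypothetical manifest for illustration only (names are assumptions).
SAMPLE_POD = {
    "metadata": {
        "name": "rec-0",
        "annotations": {"openshift.io/scc": "nonroot-v2"},
    },
    "spec": {
        "securityContext": {"runAsUser": 1001},
        "containers": [{"name": "redis-enterprise-node"}],
    },
}

if __name__ == "__main__":
    print("SCC:", admitted_scc(SAMPLE_POD))
    print("runAsUser per container:", run_as_users(SAMPLE_POD))
```

Running this against the real pod JSON (loaded with `json.load`) should show whether anyuid was actually selected for the pod.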
2. Binary Execution Failure
pdns_server fails with:
`exit status 126`
This indicates:
`Binary exists but cannot be executed`
Inspection shows:
Binaries and libraries (e.g., pdns_server, libpipebackend.so) are owned by:
`redislabs:redislabs (UID 1001)`
> Permissions: 750
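When the container runs under a different UID (as enforced by OpenShift), execution fails: mode 750 grants no permissions to users outside the owner and group, so a UID that is neither 1001 nor a member of the redislabs group cannot execute these files.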
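The permission reasoning above can be sketched in Python as a simplified POSIX execute check for a non-root process. This is an illustration, not how the kernel or the image implements it: it ignores ACLs and capabilities. The GID 1001 for the redislabs group, the example UID 1000680000, and the assumption that an arbitrary OpenShift UID runs with GID 0 are assumptions for the example.

```python
import stat
from typing import Iterable


def can_execute(mode: int, owner_uid: int, owner_gid: int,
                uid: int, gids: Iterable[int]) -> bool:
    """Simplified POSIX execute check for a non-root process.

    Exactly one permission class applies: owner, then group, then other.
    """
    if uid == owner_uid:
        return bool(mode & stat.S_IXUSR)
    if owner_gid in set(gids):
        return bool(mode & stat.S_IXGRP)
    return bool(mode & stat.S_IXOTH)


# Values below are assumptions for illustration.
REDISLABS_UID = 1001      # from the ownership reported above
REDISLABS_GID = 1001      # assumed; only the UID was confirmed
ARBITRARY_UID = 1000680000  # hypothetical UID from a namespace range

if __name__ == "__main__":
    # Owner 1001 with mode 750: execute allowed.
    print(can_execute(0o750, REDISLABS_UID, REDISLABS_GID, REDISLABS_UID, [REDISLABS_GID]))
    # Arbitrary UID in group 0 with mode 750: falls into "other", no execute bit.
    print(can_execute(0o750, REDISLABS_UID, REDISLABS_GID, ARBITRARY_UID, [0]))
```

Under these assumptions, the second case matches the exit status 126 we see: the binary is found, but the process is not permitted to execute it.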
3. Supervisor Configuration Behavior
The configuration file /opt/redislabs/config/supervisord.conf pulls in the configurations of the other core components through an include:
`supervisord.conf.d/*.conf`
However, the expected supervisord.conf.d directory is not present in the container at runtime (before bootstrap completes).
This leads to:
ERROR (no such process)
It is unclear whether:
This directory is expected to be dynamically created during bootstrap
Or if there is a packaging inconsistency in the image
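To check this from inside the container, a small Python sketch can resolve the include globs of a supervisord configuration and list which files they match. It assumes the include is declared in an `[include]` section with a `files` key, as in standard supervisord configuration, where relative globs are resolved against the directory of the including file. The helper name `resolve_includes` is hypothetical.

```python
import configparser
import glob
import os
from typing import Dict, List


def resolve_includes(conf_path: str) -> Dict[str, List[str]]:
    """Return each [include] glob mapped to the files it currently matches.

    Relative globs are resolved against the directory of conf_path.
    """
    parser = configparser.ConfigParser(
        interpolation=None,            # supervisord uses %(...)s expansions
        strict=False,                  # tolerate repeated sections or keys
        inline_comment_prefixes=(";",),
    )
    parser.read(conf_path)
    if not parser.has_option("include", "files"):
        return {}
    base = os.path.dirname(os.path.abspath(conf_path))
    resolved = {}
    for pattern in parser.get("include", "files").split():
        full = pattern if os.path.isabs(pattern) else os.path.join(base, pattern)
        resolved[pattern] = sorted(glob.glob(full))
    return resolved


if __name__ == "__main__":
    # Path taken from the report above.
    matches = resolve_includes("/opt/redislabs/config/supervisord.conf")
    for pattern, files in matches.items():
        print(pattern, "->", files if files else "no files matched")
```

An empty match list for `supervisord.conf.d/*.conf` would confirm that no per-service configuration is loaded at that point.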
Key Observations
There appears to be a mismatch between container image assumptions and OpenShift security model:
- The image expects a fixed UID (1001).
- OpenShift enforces an arbitrary UID under restricted-v2.
Even when the SCC is relaxed (anyuid), SCC selection may still prevent proper execution unless the SCC is explicitly controlled.
The failure of pdns_server blocks further bootstrap, which leaves initialization incomplete.
Expected Behavior
REC should initialize successfully on OpenShift 4.18 using the certified operator
Internal services (including pdns_server) should start without requiring manual SCC overrides or UID adjustments
Questions / Clarifications
- Is REC 8.0.16-25.0 fully compatible with the OpenShift restricted-v2 SCC?
- Is running with fixed UID (1001) a hard requirement for this version?
- Are the binary permissions (750) intentional, or should they support arbitrary UID execution?
- Is the supervisord.conf.d directory expected to be:
  - Present in the image, or
  - Generated dynamically during bootstrap?
Additional Notes
- No recent changes were introduced at the OpenShift platform level.
- The setup worked after the previous upgrade; the issue started recently without any configuration changes.
Please refer to the pod error logs below:
2026-04-20 16:31:29,310 WARN exited: pdns_server (exit status 126; not expected)
2026-04-20 16:31:29,313 INFO spawned: 'pdns_server' with pid 634
2026-04-20 16:31:29,910 WARN exited: pdns_server (exit status 126; not expected)
2026-04-20 16:31:30,915 INFO spawned: 'pdns_server' with pid 693
2026-04-20 16:31:31,415 WARN exited: pdns_server (exit status 126; not expected)
2026-04-20 16:31:34,114 INFO spawned: 'pdns_server' with pid 786
2026-04-20 16:31:34,325 WARN exited: pdns_server (exit status 126; not expected)
2026-04-20 16:31:38,028 INFO spawned: 'pdns_server' with pid 904
2026-04-20 16:31:38,122 WARN exited: pdns_server (exit status 126; not expected)
2026-04-20 16:31:38,123 INFO gave up: pdns_server entered FATAL state, too many start retries too quickly
2026-04-20 16:31:38,624 INFO main MainThread: Done, moving to bootstrapping
2026-04-20 16:31:38,624 INFO main MainThread: Node Type is None
2026-04-20 16:31:38,624 INFO main MainThread: Bootstrapping node with action 'None'
2026-04-20 16:31:38,624 INFO main MainThread: Done
2026-04-20 16:31:51,277 INFO waiting for envoy to stop
2026-04-20 16:31:51,279 WARN stopped: envoy (terminated by SIGTERM)
2026-04-20 16:31:51,283 INFO spawned: 'envoy' with pid 1262
2026-04-20 16:31:52,430 INFO success: envoy entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2026-04-20 16:31:52,432 INFO waiting for envoy_control_plane to stop
2026-04-20 16:31:52,435 WARN stopped: envoy_control_plane (terminated by SIGTERM)
2026-04-20 16:31:52,438 INFO spawned: 'envoy_control_plane' with pid 1283
2026-04-20 16:31:53,440 INFO success: envoy_control_plane entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2026-04-20 16:31:53,443 INFO waiting for ccs to stop
2026-04-20 16:31:53,545 INFO stopped: ccs (exit status 0)
2026-04-20 16:31:53,548 INFO spawned: 'ccs' with pid 1295
2026-04-20 16:31:54,937 INFO success: ccs entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2026-04-20 16:31:54,989 INFO waiting for crdb_coordinator to stop
2026-04-20 16:31:55,675 INFO stopped: crdb_coordinator (exit status 0)
2026-04-20 16:31:55,681 INFO spawned: 'crdb_coordinator' with pid 1322
2026-04-20 16:31:56,712 INFO success: crdb_coordinator entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2026-04-20 16:32:06,312 INFO waiting for bootstrap_mgr to stop
2026-04-20 16:32:06,417 WARN stopped: bootstrap_mgr (terminated by SIGTERM)
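For anyone triaging the full log, the following Python sketch summarizes a supervisor log like the one above by counting unexpected exits per program and exit status, and listing programs that reached the FATAL state. The regular expressions are written against the line formats shown in this log and may need adjusting for other supervisor versions; the function name `summarize` is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, Set, Tuple

EXIT_RE = re.compile(
    r"WARN exited: (?P<name>\S+) \(exit status (?P<code>\d+); not expected\)"
)
FATAL_RE = re.compile(r"INFO gave up: (?P<name>\S+) entered FATAL state")


def summarize(lines: Iterable[str]) -> Tuple[Counter, Set[str]]:
    """Count unexpected exits per (program, status) and collect FATAL programs."""
    exits: Counter = Counter()
    fatal: Set[str] = set()
    for line in lines:
        match = EXIT_RE.search(line)
        if match:
            exits[(match["name"], int(match["code"]))] += 1
            continue
        match = FATAL_RE.search(line)
        if match:
            fatal.add(match["name"])
    return exits, fatal


# Three lines copied from the log above, used as a short sample.
SAMPLE = [
    "2026-04-20 16:31:29,310 WARN exited: pdns_server (exit status 126; not expected)",
    "2026-04-20 16:31:29,910 WARN exited: pdns_server (exit status 126; not expected)",
    "2026-04-20 16:31:38,123 INFO gave up: pdns_server entered FATAL state, too many start retries too quickly",
]

if __name__ == "__main__":
    exits, fatal = summarize(SAMPLE)
    for (name, code), count in exits.items():
        print(f"{name}: {count} unexpected exits with status {code}")
    print("FATAL:", sorted(fatal))
```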
I've attached the debug log for reference:
rec_debug_info.tar.gz