REC 8.0.16-25 fails to initialize on OpenShift 4.18 – pdns_server exits with status 126 / permission issues

Hello Team,


### **Environment**

- Platform: OpenShift 4.18.33
- Deployment Method: Redis Enterprise Operator (Red Hat certified)
- REC Version: 8.0.16-25.0
- SCC in use: restricted-v2 (default), also tested with anyuid


### **Issue Description**

We are deploying a Redis Enterprise Cluster (REC) using the Operator on OpenShift. During initialization, the pod starts successfully but quickly enters a crash loop, and `pdns_server` repeatedly fails to start.

### _Supervisor reports:_
`>  ERROR (no such process)`

### _Observed errors:_

`> pdns_server exited with status 126 Operation not permitted`


**This prevents successful cluster bootstrap.**


### _Observed Behavior_

From container logs:

```
WARN exited: pdns_server (exit status 126; not expected)
INFO gave up: pdns_server entered FATAL state
ERROR (no such process)
```

### Troubleshooting Performed

We performed a detailed investigation across multiple layers:

_1. SCC / Security Context Behavior_


The default restricted-v2 SCC does not allow a pod to run with a fixed UID (1001), but the Operator-generated pods include `runAsUser: 1001`. This makes them incompatible with the OpenShift security model.

Even after assigning the anyuid SCC to the service account:

```
oc adm policy add-scc-to-user anyuid -z <serviceaccount>
```

the pod is still admitted under the nonroot-v2 SCC unless the SCC is explicitly forced.
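
To confirm which SCC a pod was actually admitted under, the `openshift.io/scc` annotation on the pod can be compared with the UID requested in the pod spec. The following is a minimal sketch, not an official procedure. The function names `admitted_scc` and `requested_uids` are made up for illustration, and the pod name and namespace are supplied by the caller.

```shell
# Print the SCC that OpenShift admission applied to a pod.
# Usage: admitted_scc <pod> <namespace>
admitted_scc() {
  oc get pod "$1" -n "$2" \
    -o jsonpath='{.metadata.annotations.openshift\.io/scc}'
}

# Print the UID each container in the pod spec requests (empty if unset).
# Usage: requested_uids <pod> <namespace>
requested_uids() {
  oc get pod "$1" -n "$2" \
    -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.securityContext.runAsUser}{"\n"}{end}'
}
```

If `admitted_scc` prints `nonroot-v2` after anyuid has been granted, that would match the behavior described above.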


_2. Binary Execution Failure_

`pdns_server` fails with `exit status 126`, which indicates that the binary exists but cannot be executed.

Inspection shows that the binaries and libraries (e.g., `pdns_server`, `libpipebackend.so`) are:

- Owned by `redislabs:redislabs` (UID 1001)
- Set to permissions `750`

When the container runs under a different UID (as enforced by OpenShift), execution fails.
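
By POSIX shell convention, exit status 126 means a command was found but could not be executed. The sketch below reproduces that status with a throwaway file, assuming a POSIX shell and a writable temp directory. It clears all execute bits (mode 640) so the result is the same even when run as root. This stands in for the 750 case, where a UID outside the owner and group is denied in the same way.

```shell
#!/bin/sh
# Create a small script, then remove every execute bit from it.
demo=$(mktemp)
printf '#!/bin/sh\necho ok\n' > "$demo"
chmod 640 "$demo"

# Try to run it; the shell finds the file but is not allowed to execute it.
"$demo" 2>/dev/null
status=$?
echo "exit status: $status"

rm -f "$demo"
```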


_3. Supervisor Configuration Behavior_

The configuration file `/opt/redislabs/config/supervisord.conf` pulls in the configurations of the other core components through an include of `supervisord.conf.d/*.conf`.

However, the expected `supervisord.conf.d` directory is not present in the container at runtime (prior to bootstrap completion). This leads to:

`ERROR (no such process)`

It is unclear whether:

- This directory is expected to be created dynamically during bootstrap, or
- There is a packaging inconsistency in the image

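
To check for the include directory, it helps to know the absolute path the pattern resolves to. The sketch below assumes that Supervisor resolves a relative include pattern against the directory of the configuration file. `resolve_include_dir` is a hypothetical helper, not part of Supervisor or REC.

```shell
# resolve_include_dir CONF PATTERN
# Print the directory that an include pattern points at.
# Assumption: relative patterns are resolved against the directory of CONF.
resolve_include_dir() {
  case $2 in
    /*) dirname "$2" ;;                      # absolute pattern
    *)  dirname "$(dirname "$1")/$2" ;;      # relative to the config file
  esac
}

include_dir=$(resolve_include_dir /opt/redislabs/config/supervisord.conf 'supervisord.conf.d/*.conf')
echo "$include_dir"

# Inside the container, the result can then be checked with, for example:
#   ls -ld "$include_dir"
```
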
### Key Observations

There appears to be a mismatch between the container image's assumptions and the OpenShift security model:

- The image expects a fixed UID (1001).
- OpenShift enforces an arbitrary UID (restricted-v2).
- Even when the SCC is relaxed (anyuid), SCC selection behavior may still prevent proper execution unless explicitly controlled.
- The failure of `pdns_server` blocks further bootstrap, leading to incomplete initialization.
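
The UID mismatch can be illustrated with the classic owner/group/other permission check. This is a simplified sketch: `can_execute` is a hypothetical helper that ignores supplementary groups, ACLs, capabilities, and root. The UID `1000680000` is an invented example of an arbitrary UID, and GID 0 is assumed because OpenShift normally runs arbitrary-UID containers in the root group.

```shell
# can_execute MODE OWNER_UID OWNER_GID PROC_UID PROC_GID
# Return 0 if the owner/group/other bits of MODE grant execute permission.
can_execute() {
  mode=$((0$1))                      # read e.g. 750 as an octal number
  if [ "$4" -eq "$2" ]; then
    [ $((mode & 0100)) -ne 0 ]       # process is the owner: check u+x
  elif [ "$5" -eq "$3" ]; then
    [ $((mode & 0010)) -ne 0 ]       # process is in the group: check g+x
  else
    [ $((mode & 0001)) -ne 0 ]       # everyone else: check o+x
  fi
}

# Fixed UID 1001 owns the file, so execution is allowed.
can_execute 750 1001 1001 1001 1001 && echo "1001: allowed"

# Arbitrary UID with GID 0 matches neither owner nor group, so it is denied.
can_execute 750 1001 1001 1000680000 0 || echo "arbitrary UID: denied"
```

Under the same simplification, mode 755 would allow the arbitrary UID, which is why question 3 below asks whether 750 is intentional.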

### Expected Behavior

- REC should initialize successfully on OpenShift 4.18 using the certified operator.
- Internal services (including `pdns_server`) should start without requiring manual SCC overrides or UID adjustments.


### Questions / Clarifications

1. Is REC 8.0.16-25 fully compatible with OpenShift restricted-v2 SCC?
2. Is running with fixed UID (1001) a hard requirement for this version?
3. Are the binary permissions (750) intentional, or should they support arbitrary UID execution?
4. Is the supervisord.conf.d directory expected to be:
   - Present in the image, or
   - Generated dynamically during bootstrap?


### Additional Notes

- No recent changes were introduced at the OpenShift platform level.

- The setup was working after the upgrade; the issue started occurring recently without any configuration changes.

Please refer to the pod error logs below:

> 2026-04-20 16:31:29,310 WARN exited: pdns_server (exit status 126; not expected)
2026-04-20 16:31:29,313 INFO spawned: 'pdns_server' with pid 634
2026-04-20 16:31:29,910 WARN exited: pdns_server (exit status 126; not expected)
2026-04-20 16:31:30,915 INFO spawned: 'pdns_server' with pid 693
2026-04-20 16:31:31,415 WARN exited: pdns_server (exit status 126; not expected)
2026-04-20 16:31:34,114 INFO spawned: 'pdns_server' with pid 786
2026-04-20 16:31:34,325 WARN exited: pdns_server (exit status 126; not expected)
2026-04-20 16:31:38,028 INFO spawned: 'pdns_server' with pid 904
2026-04-20 16:31:38,122 WARN exited: pdns_server (exit status 126; not expected)
2026-04-20 16:31:38,123 INFO gave up: pdns_server entered FATAL state, too many start retries too quickly
2026-04-20 16:31:38,624 INFO __main__ MainThread: Done, moving to bootstrapping
2026-04-20 16:31:38,624 INFO __main__ MainThread: Node Type is None
2026-04-20 16:31:38,624 INFO __main__ MainThread: Bootstrapping node with action 'None'
2026-04-20 16:31:38,624 INFO __main__ MainThread: Done
2026-04-20 16:31:51,277 INFO waiting for envoy to stop
2026-04-20 16:31:51,279 WARN stopped: envoy (terminated by SIGTERM)
2026-04-20 16:31:51,283 INFO spawned: 'envoy' with pid 1262
2026-04-20 16:31:52,430 INFO success: envoy entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2026-04-20 16:31:52,432 INFO waiting for envoy_control_plane to stop
2026-04-20 16:31:52,435 WARN stopped: envoy_control_plane (terminated by SIGTERM)
2026-04-20 16:31:52,438 INFO spawned: 'envoy_control_plane' with pid 1283
2026-04-20 16:31:53,440 INFO success: envoy_control_plane entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2026-04-20 16:31:53,443 INFO waiting for ccs to stop
2026-04-20 16:31:53,545 INFO stopped: ccs (exit status 0)
2026-04-20 16:31:53,548 INFO spawned: 'ccs' with pid 1295
2026-04-20 16:31:54,937 INFO success: ccs entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2026-04-20 16:31:54,989 INFO waiting for crdb_coordinator to stop
2026-04-20 16:31:55,675 INFO stopped: crdb_coordinator (exit status 0)
2026-04-20 16:31:55,681 INFO spawned: 'crdb_coordinator' with pid 1322
2026-04-20 16:31:56,712 INFO success: crdb_coordinator entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2026-04-20 16:32:06,312 INFO waiting for bootstrap_mgr to stop
2026-04-20 16:32:06,417 WARN stopped: bootstrap_mgr (terminated by SIGTERM) 


I've attached the debug log for reference:

[rec_debug_info.tar.gz](https://github.com/user-attachments/files/26954551/rec_debug_info.tar.gz)

