Router deployment error: update acceptor rejected router-1: took longer than 600 seconds
Description
I’m trying to install Openshift Origin via Ansible on my local VMs. The installation fails at
TASK [openshift_hosted : Poll for OpenShift pod deployment success]
because the router pod cannot be deployed. When I use oc logs router-1-deploy I only see the error above, but I’m not able to determine the cause of the problem. I am building this cluster locally just to practice the installation via Ansible (not a production environment), so the VMs are smaller than the recommended values. I include their specs below, in case this is a resource problem.
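A minimal diagnostic sketch of what I can run to dig further, assuming the default namespace and the object names shown in the output below:

# describe the deployer pod and the deployment config, and list recent events
oc describe pod router-1-deploy -n default
oc describe dc/router -n default
oc get events -n default --sort-by='.lastTimestamp'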
Version
ansible 2.6.0
config file = /etc/ansible/ansible.cfg
configured module search path = [u'/home/ebberg/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python2.7/dist-packages/ansible
executable location = /usr/bin/ansible
python version = 2.7.15rc1 (default, Apr 15 2018, 21:51:34) [GCC 7.3.0]
Steps To Reproduce
- Complete Prerequisites
- Install Openshift Origin via ansible-playbook openshift-ansible/playbooks/deploy_cluster.yml (a re-run sketch follows this list)
- See the TASK [openshift_hosted : Poll for OpenShift pod deployment success] fail
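To iterate on the failing step without waiting for a full install, this is roughly what I re-run. The inventory path is simply where mine lives, and the openshift-hosted component playbook path is my assumption about how openshift-ansible 3.9 splits its playbooks:

# full run with verbose output against an explicit inventory
ansible-playbook -i /etc/ansible/hosts -vvv openshift-ansible/playbooks/deploy_cluster.yml
# assumed component playbook that only (re)deploys the hosted router/registry
ansible-playbook -i /etc/ansible/hosts openshift-ansible/playbooks/openshift-hosted/config.yml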
Expected Results
The router should be deployed successfully and the installation should complete.
Observed Results
The mentioned task polls the router pod and fails because the deployment fails. The Ansible output is a long JSON description with no obvious error message; it is linked as a gist below.
oc logs router-1-deploy gives this output:
--> Scaling router-1 to 1
error: update acceptor rejected router-1: pods for rc 'default/router-1' took longer than 600 seconds to become available
Using oc rollout latest router to redeploy the router after the failed Ansible installation creates a new deployment, but with similar output. The actual router pod's logs are:
I0717 08:39:34.511531 1 template.go:260] Starting template router (v3.9.0+71543b2-33)
I0717 08:39:34.512703 1 metrics.go:157] Router health and metrics port listening at 0.0.0.0:1936
The pod vanishes when the deployment fails, so there are no logs for it after that.
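Since the error only says the pods never became available, two things I can check while a redeploy is running: where the pod gets scheduled (it should land on the infra node because of the region=infra selector), and the health endpoint on the port shown in the router log. The /healthz path is my assumption based on the standard 3.9 router readiness probe:

# watch where the router pod lands during a redeploy
oc get pods -n default -o wide -w
# on the infra node itself, probe the health/metrics port from the log above
curl -v http://localhost:1936/healthz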
Additional Information
The full Ansible output can be found here (gist).
OS: CentOS Linux release 7.5.1804 (Core)
1 Master: 8GB RAM, 20GB disk + 15GB Docker storage
1 Infrastructure node: 6GB RAM, Storage as Master
2 Worker nodes: 6GB RAM, Storage as Master
all hosts with 1 vCPU
Output of oc status:
In project default on server https://master:8443
https://docker-registry-default.router.default.svc.cluster.local (passthrough) (svc/docker-registry)
dc/docker-registry deploys docker.io/openshift/origin-docker-registry:v3.9.0
deployment #1 deployed 37 minutes ago - 1 pod
svc/kubernetes - 172.30.0.1 ports 443->8443, 53->8053, 53->8053
svc/router - 172.30.220.121 ports 80, 443, 1936
dc/router deploys docker.io/openshift/origin-haproxy-router:v3.9.0
deployment #1 failed 37 minutes ago: config change
1 info identified, use 'oc status -v' to see details.
Inventory file: (I have 2 network interfaces on each host: 1 for internet and 1 for cluster communication. Therefore I have to provide openshift_ip for the master so that the services listen on the correct interface. For the same reason I provide etcd_listen_client_urls. A sketch of extending this to the other hosts follows the inventory.)
[OSEv3:children]
### Define cluster elements
masters
nodes
etcd
[OSEv3:vars]
### Define cluster variables
### Set the IPs that etcd should listen on. Necessary because we have 2 network interfaces, 1 for internet and 1 for cluster networking. With this, etcd listens on localhost and on the cluster network interface
etcd_listen_client_urls=https://127.0.0.1:2379,https://192.168.99.200:2379
### Set it to allow local users according to https://access.redhat.com/documentation/en-us/openshift_container_platform/3.6/html/installation_and_configuration/install-config-configuring-authentication#identity-providers-ansible
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd'}]
# Native high availability cluster method with optional load balancer.
# If no lb group is defined installer assumes that a load balancer has
# been preconfigured. For installation the value of
# openshift_master_cluster_hostname must resolve to the load balancer
# or to one or all of the masters defined in the inventory if no load
# balancer is present.
openshift_master_cluster_method=native
### Internal hostname
openshift_master_cluster_hostname=master.example.com
# enable ntp on masters to ensure proper failover
openshift_clock_enabled=true
### Define ansible ssh credentials
ansible_ssh_user=ansible
ansible_become=true
openshift_deployment_type=origin
### Openshift Version
openshift_release='3.9'
### What runs on the infra nodes
openshift_router_selector='region=infra'
openshift_registry_selector='region=infra'
### disable pre-flight check of memory, since we are operating in a test environment with less memory than recommended
openshift_disable_check=memory_availability
[masters]
master.example.com openshift_ip=192.168.99.200
[etcd]
master.example.com openshift_ip=192.168.99.200
[nodes]
master.example.com openshift_ip=192.168.99.200
infra.example.com openshift_node_labels="{'region': 'infra', 'zone': 'default'}"
node1.example.com openshift_node_labels="{'region': 'primary', 'zone': 'worker'}"
node2.example.com openshift_node_labels="{'region': 'primary', 'zone': 'worker'}"
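For reference, this is how I understand the same openshift_ip setting could be extended to the remaining hosts if they should also bind to the cluster interface. This is only a sketch; the 192.168.99.201-203 addresses are placeholders I made up, not my real IPs:

[nodes]
master.example.com openshift_ip=192.168.99.200
# placeholder cluster-interface IPs for the other hosts (not my real values)
infra.example.com openshift_ip=192.168.99.201 openshift_node_labels="{'region': 'infra', 'zone': 'default'}"
node1.example.com openshift_ip=192.168.99.202 openshift_node_labels="{'region': 'primary', 'zone': 'worker'}"
node2.example.com openshift_ip=192.168.99.203 openshift_node_labels="{'region': 'primary', 'zone': 'worker'}"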
Comments
I had a similar problem; check your /etc/hosts. I was using Vagrant and it had modified it.
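For comparison, a sketch of what /etc/hosts should roughly look like on each host for this inventory. Only the master's 192.168.99.200 comes from the inventory above; the .201-.203 addresses are placeholders. The point is that the cluster hostnames must resolve to the cluster-interface IPs, not to a Vagrant-inserted loopback entry:

127.0.0.1       localhost
# cluster hostnames must resolve to the cluster network, not to loopback
192.168.99.200  master.example.com   master
192.168.99.201  infra.example.com    infra
192.168.99.202  node1.example.com    node1
192.168.99.203  node2.example.com    node2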
Thank you for responding so quickly, and sorry for the trouble once again. We are facing this kind of problem at the moment.
Regards, Khalid - Pakistan (in reply to lucullusTheOnly)