Router deployment error: update acceptor rejected router-1: took longer than 600 seconds
Description
I’m trying to install Openshift Origin via Ansible on my local VMs. The installation fails at
TASK [openshift_hosted : Poll for OpenShift pod deployment success]
because the router pod cannot be deployed. When I use oc logs router-1-deploy I only see the error above, but I’m not able to determine the cause of the problem. I am building this cluster locally just to practice the installation via Ansible (not a production environment), so the VMs are smaller than the recommended values. I include their specs below, in case this is a resource problem.
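A minimal diagnostic sketch of what I can run to dig further, assuming the default namespace and the object names shown in the output below:

# describe the deployer pod and the deployment config, and list recent events
oc describe pod router-1-deploy -n default
oc describe dc/router -n default
oc get events -n default --sort-by='.lastTimestamp'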
Version
ansible 2.6.0
config file = /etc/ansible/ansible.cfg
configured module search path = [u'/home/ebberg/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python2.7/dist-packages/ansible
executable location = /usr/bin/ansible
python version = 2.7.15rc1 (default, Apr 15 2018, 21:51:34) [GCC 7.3.0]
Steps To Reproduce
- Complete Prerequisites
- Install Openshift Origin via ansible-playbook openshift-ansible/playbooks/deploy_cluster.yml (a re-run sketch follows this list)
- See the TASK [openshift_hosted : Poll for OpenShift pod deployment success] fail
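To iterate on the failing step without waiting for a full install, this is roughly what I re-run. The inventory path is simply where mine lives, and the openshift-hosted component playbook path is my assumption about how openshift-ansible 3.9 splits its playbooks:

# full run with verbose output against an explicit inventory
ansible-playbook -i /etc/ansible/hosts -vvv openshift-ansible/playbooks/deploy_cluster.yml
# assumed component playbook that only (re)deploys the hosted router/registry
ansible-playbook -i /etc/ansible/hosts openshift-ansible/playbooks/openshift-hosted/config.yml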
Expected Results
The router should be deployed successfully and the installation should complete.
Observed Results
The mentioned task polls the router pod and fails because the deployment fails. The Ansible output is a long JSON description with no obvious error message; it is linked as a gist below.
oc logs router-1-deploy gives this output:
--> Scaling router-1 to 1
error: update acceptor rejected router-1: pods for rc 'default/router-1' took longer than 600 seconds to become available
Using oc rollout latest router to redeploy the router after the failed Ansible installation creates a new deployment, but with similar output. The actual router pod's logs are:
I0717 08:39:34.511531 1 template.go:260] Starting template router (v3.9.0+71543b2-33)
I0717 08:39:34.512703 1 metrics.go:157] Router health and metrics port listening at 0.0.0.0:1936
The pod vanishes when the deployment fails, so there are no logs for it after that.
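Since the error only says the pods never became available, two things I can check while a redeploy is running: where the pod gets scheduled (it should land on the infra node because of the region=infra selector), and the health endpoint on the port shown in the router log. The /healthz path is my assumption based on the standard 3.9 router readiness probe:

# watch where the router pod lands during a redeploy
oc get pods -n default -o wide -w
# on the infra node itself, probe the health/metrics port from the log above
curl -v http://localhost:1936/healthz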
Additional Information
The full Ansible output can be found here (gist).
OS: CentOS Linux release 7.5.1804 (Core)
1 Master: 8GB RAM, 20GB disk + 15GB Docker storage
1 Infrastructure node: 6GB RAM, Storage as Master
2 Worker nodes: 6GB RAM, Storage as Master
all hosts with 1 vCPU
Output of oc status:
In project default on server https://master:8443
https://docker-registry-default.router.default.svc.cluster.local (passthrough) (svc/docker-registry)
dc/docker-registry deploys docker.io/openshift/origin-docker-registry:v3.9.0
deployment #1 deployed 37 minutes ago - 1 pod
svc/kubernetes - 172.30.0.1 ports 443->8443, 53->8053, 53->8053
svc/router - 172.30.220.121 ports 80, 443, 1936
dc/router deploys docker.io/openshift/origin-haproxy-router:v3.9.0
deployment #1 failed 37 minutes ago: config change
1 info identified, use 'oc status -v' to see details.
Inventory file: (I have 2 network interfaces on each host: 1 for internet and 1 for cluster communication. Therefore I have to provide openshift_ip for the master so that the services listen on the correct interface. For the same reason I provide etcd_listen_client_urls. A sketch of extending this to the other hosts follows the inventory.)
[OSEv3:children]
### Define cluster elements
masters
nodes
etcd
[OSEv3:vars]
### Define cluster variables
### Set the IPs that etcd should listen on. Necessary because we have 2 network interfaces, 1 for internet and 1 for cluster networking. With this, etcd listens on localhost and on the cluster network interface
etcd_listen_client_urls=https://127.0.0.1:2379,https://192.168.99.200:2379
### Set it to allow local users according to https://access.redhat.com/documentation/en-us/openshift_container_platform/3.6/html/installation_and_configuration/install-config-configuring-authentication#identity-providers-ansible
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd'}]
# Native high availability cluster method with optional load balancer.
# If no lb group is defined installer assumes that a load balancer has
# been preconfigured. For installation the value of
# openshift_master_cluster_hostname must resolve to the load balancer
# or to one or all of the masters defined in the inventory if no load
# balancer is present.
openshift_master_cluster_method=native
### Internal hostname
openshift_master_cluster_hostname=master.example.com
# enable ntp on masters to ensure proper failover
openshift_clock_enabled=true
### Define ansible ssh credentials
ansible_ssh_user=ansible
ansible_become=true
openshift_deployment_type=origin
### Openshift Version
openshift_release='3.9'
### What runs on the infra nodes
openshift_router_selector='region=infra'
openshift_registry_selector='region=infra'
### disable pre-flight check of memory, since we are operating in a test environment with less memory than recommended
openshift_disable_check=memory_availability
[masters]
master.example.com openshift_ip=192.168.99.200
[etcd]
master.example.com openshift_ip=192.168.99.200
[nodes]
master.example.com openshift_ip=192.168.99.200
infra.example.com openshift_node_labels="{'region': 'infra', 'zone': 'default'}"
node1.example.com openshift_node_labels="{'region': 'primary', 'zone': 'worker'}"
node2.example.com openshift_node_labels="{'region': 'primary', 'zone': 'worker'}"
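For reference, this is how I understand the same openshift_ip setting could be extended to the remaining hosts if they should also bind to the cluster interface. This is only a sketch; the 192.168.99.201-203 addresses are placeholders I made up, not my real IPs:

[nodes]
master.example.com openshift_ip=192.168.99.200
# placeholder cluster-interface IPs for the other hosts (not my real values)
infra.example.com openshift_ip=192.168.99.201 openshift_node_labels="{'region': 'infra', 'zone': 'default'}"
node1.example.com openshift_ip=192.168.99.202 openshift_node_labels="{'region': 'primary', 'zone': 'worker'}"
node2.example.com openshift_ip=192.168.99.203 openshift_node_labels="{'region': 'primary', 'zone': 'worker'}"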
Comments
I had a similar problem; check your /etc/hosts. I was using Vagrant and it had modified it.
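For comparison, a sketch of what /etc/hosts should roughly look like on each host for this inventory. Only the master's 192.168.99.200 comes from the inventory above; the .201-.203 addresses are placeholders. The point is that the cluster hostnames must resolve to the cluster-interface IPs, not to a Vagrant-inserted loopback entry:

127.0.0.1       localhost
# cluster hostnames must resolve to the cluster network, not to loopback
192.168.99.200  master.example.com   master
192.168.99.201  infra.example.com    infra
192.168.99.202  node1.example.com    node1
192.168.99.203  node2.example.com    node2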
Thank you for responding so quickly, and sorry for the trouble once again. We are facing this kind of problem at the moment.
Regards, Khalid - Pakistan (in reply to lucullusTheOnly)