Is this a request for help?: No
Is this an ISSUE or FEATURE REQUEST? (choose one): Issue
Which release version?: v1.0.11
Which component (CNI/IPAM/CNM/CNS): CNI
Which Operating System (Linux/Windows): Windows Server version 1803
Which Orchestrator and version (e.g. Kubernetes, Docker): Kubernetes v1.10.6
What happened:
When I try to scale up to several pods on the same node, there will be some store locking issues as overlapped calls are made to the Azure CNI binary.
2018/08/15 21:11:45 Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
2018/08/15 21:11:45 "Start Flag false CniSucceeded false Name CNI Version v1.0.11 ErrorMessage Store is locked vnet []
Context AzureCNI SubContext "
2018/08/15 21:11:45 OrchestratorDetails &{ kubectl command failed due to exit status 1}
2018/08/15 21:11:45 OSDetails &{windows }
2018/08/15 21:11:45 SystemDetails &{0 0 0 0 0 0 }
2018/08/15 21:11:45 InterfaceDetails &{Primary 10.240.0.0/12 10.240.0.4 00:0d:3a:f9:3a:fe vEthernet (Ethernet 2) 30 0 }
2018/08/15 21:11:45 BridgeDetails <nil>
2018/08/15 21:11:45 Send telemetry success 200
2018/08/15 21:11:45 Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
2018/08/15 21:11:45 "Start Flag false CniSucceeded false Name CNI Version v1.0.11 ErrorMessage Store is locked vnet []
Context AzureCNI SubContext "
2018/08/15 21:11:45 OrchestratorDetails &{ kubectl command failed due to exit status 1}
2018/08/15 21:11:45 OSDetails &{windows }
2018/08/15 21:11:45 SystemDetails &{0 0 0 0 0 0 }
2018/08/15 21:11:45 InterfaceDetails &{Primary 10.240.0.0/12 10.240.0.4 00:0d:3a:f9:3a:fe vEthernet (Ethernet 2) 30 0 }
2018/08/15 21:11:45 BridgeDetails <nil>
2018/08/15 21:11:45 Send telemetry success 200
2018/08/15 21:11:45 Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
2018/08/15 21:11:45 "Start Flag false CniSucceeded false Name CNI Version v1.0.11 ErrorMessage Store is locked vnet []
Context AzureCNI SubContext "
2018/08/15 21:11:45 OrchestratorDetails &{ kubectl command failed due to exit status 1}
2018/08/15 21:11:45 OSDetails &{windows }
2018/08/15 21:11:45 SystemDetails &{0 0 0 0 0 0 }
2018/08/15 21:11:45 InterfaceDetails &{Primary 10.240.0.0/12 10.240.0.4 00:0d:3a:f9:3a:fe vEthernet (Ethernet 2) 30 0 }
2018/08/15 21:11:45 BridgeDetails <nil>
2018/08/15 21:11:45 Send telemetry success 200
2018/08/15 21:11:45 Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
2018/08/15 21:11:45 "Start Flag false CniSucceeded true Name CNI Version v1.0.11 ErrorMessage vnet []
Context AzureCNI SubContext "
2018/08/15 21:11:45 OrchestratorDetails &{ kubectl command failed due to exit status 1}
2018/08/15 21:11:45 OSDetails &{windows }
2018/08/15 21:11:45 SystemDetails &{0 0 0 0 0 0 }
2018/08/15 21:11:45 InterfaceDetails &{Primary 10.240.0.0/12 10.240.0.4 00:0d:3a:f9:3a:fe vEthernet (Ethernet 2) 30 0 }
2018/08/15 21:11:45 BridgeDetails <nil>
2018/08/15 21:11:45 Send telemetry success 200
2018/08/15 21:11:45 SetReportState succeeded
2018/08/15 21:11:45 SetReportState succeeded
2018/08/15 21:11:45 SetReportState succeeded
2018/08/15 21:11:45 SetReportState succeeded
E0815 21:11:45.950994 4188 cni.go:259] Error adding network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
E0815 21:11:45.950994 4188 cni_windows.go:49] error while adding to cni network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
W0815 21:11:45.950994 4188 docker_sandbox.go:353] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "iis-1803-5ffd8b84d6-46msm_default": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
I0815 21:11:45.957988 4188 kubelet_node_status.go:491] Using Node Hostname from cloudprovider: "15453k8s9001"
E0815 21:11:45.971011 4188 cni.go:259] Error adding network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
E0815 21:11:45.971011 4188 cni_windows.go:49] error while adding to cni network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
W0815 21:11:45.971011 4188 docker_sandbox.go:353] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "iis-1803-5ffd8b84d6-vdppz_default": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
E0815 21:11:45.991997 4188 cni.go:259] Error adding network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
E0815 21:11:45.992999 4188 cni_windows.go:49] error while adding to cni network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
W0815 21:11:45.992999 4188 docker_sandbox.go:353] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "iis-1803-5ffd8b84d6-sjlmm_default": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
E0815 21:11:46.006993 4188 cni.go:259] Error adding network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
E0815 21:11:46.006993 4188 cni_windows.go:49] error while adding to cni network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
W0815 21:11:46.006993 4188 docker_sandbox.go:353] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "iis-1803-5ffd8b84d6-2rz9c_default": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
2018/08/15 21:11:46 SetReportState succeeded
2018/08/15 21:11:41 [cni] Timed out on locking store, err:Store is locked.
2018/08/15 21:11:46 Failed to initialize key-value store of network plugin, err:Store is locked.
2018/08/15 21:11:46 Report plugin error
2018/08/15 21:11:41 [cni] Timed out on locking store, err:Store is locked.
2018/08/15 21:11:46 Failed to initialize key-value store of network plugin, err:Store is locked.
2018/08/15 21:11:46 Report plugin error
2018/08/15 21:11:41 [cni] Timed out on locking store, err:Store is locked.
2018/08/15 21:11:46 Failed to initialize key-value store of network plugin, err:Store is locked.
2018/08/15 21:11:46 Report plugin error
2018/08/15 21:11:41 [cni] Timed out on locking store, err:Store is locked.
2018/08/15 21:11:46 Failed to initialize key-value store of network plugin, err:Store is locked.
2018/08/15 21:11:46 Report plugin error
2018/08/15 21:11:41 [cni] Timed out on locking store, err:Store is locked.
2018/08/15 21:11:46 Failed to initialize key-value store of network plugin, err:Store is locked.
2018/08/15 21:11:46 Report plugin error
2018/08/15 21:11:46 [cni] Timed out on locking store, err:Store is locked.
2018/08/15 21:11:46 Failed to initialize key-value store of network plugin, err:Store is locked.
2018/08/15 21:11:46 Report plugin error
2018/08/15 21:11:46 [cni] Timed out on locking store, err:Store is locked.
2018/08/15 21:11:46 Failed to initialize key-value store of network plugin, err:Store is locked.
2018/08/15 21:11:46 Report plugin error
What you expected to happen:
Instead of seeing E0815 21:11:45.950994 4188 cni.go:259] Error adding network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input after CNI failed to get the lock, I would expect a valid error.
I looked at the source:
|
log.Printf("Failed to initialize key-value store of network plugin, err:%v.\n", err) |
and it has return code 1 with nothing on stderr
What I would expect is a return code >100 (vendor specific), along with this json schema on stdout:
{
"cniVersion": "0.3.1",
"code": <numeric-error-code>,
"msg": <short-error-message>,
"details": <long-error-message> (optional)
}
How to reproduce it (as minimally and precisely as possible):
Try to scale anything up on Windows. It will eventually succeed, but there will be errors in the process as multiple pods need addEndpoint called simultaneously.
Is this a request for help?: No
Is this an ISSUE or FEATURE REQUEST? (choose one): Issue
Which release version?: v1.0.11
Which component (CNI/IPAM/CNM/CNS): CNI
Which Operating System (Linux/Windows): Windows Server version 1803
Which Orchestrator and version (e.g. Kubernetes, Docker): Kubernetes v1.10.6
What happened:
When I try to scale up to several pods on the same node, there will be some store locking issues as overlapped calls are made to the Azure CNI binary.
What you expected to happen:
Instead of seeing
E0815 21:11:45.950994 4188 cni.go:259] Error adding network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON inputafter CNI failed to get the lock, I would expect a valid error.I looked at the source:
azure-container-networking/cni/network/plugin/main.go
Line 92 in 677d471
and it has return code 1 with nothing on stderr
What I would expect is a return code >100 (vendor specific), along with this json schema on stdout:
How to reproduce it (as minimally and precisely as possible):
Try to scale anything up on Windows. It will eventually succeed, but there will be errors in the process as multiple pods need addEndpoint called simultaneously.