Error 198: DNS_ERROR
This error occurs when ClickHouse cannot resolve a hostname to an IP address through DNS lookup. It indicates that DNS resolution failed for a hostname used in cluster configuration, distributed queries, or external connections.
Most common causes
- Hostname does not exist
  - Hostname is misspelled in configuration
  - Pod or service not yet created in Kubernetes
  - Server has been decommissioned or renamed
  - DNS record not created or has been deleted
- DNS server issues
  - DNS server is unreachable or down
  - Network connectivity problems to DNS server
  - DNS server timeout or slow response
  - Incorrect DNS server configuration
- Kubernetes service discovery problems
  - Pods not ready when DNS lookup occurs
  - Service endpoints are not yet available
  - Headless service DNS not propagated
  - CoreDNS or kube-dns issues in cluster
- Cluster configuration errors
  - Wrong hostname in cluster configuration
  - Hostname referencing nodes that don't exist
  - Typo in remote_servers configuration
  - Stale configuration with old hostnames
- DNS cache issues
  - Cached DNS entries for deleted hosts
  - DNS TTL expiration causing lookups for removed hosts
  - ClickHouse DNS cache not updated after infrastructure changes
- Network or firewall issues
  - Firewall blocking DNS queries (port 53)
  - Network segmentation preventing DNS access
  - DNS resolution timeout too short
Common solutions
1. Verify hostname resolution manually
2. Check cluster configuration
3. Check ClickHouse DNS resolver logs
4. Clear ClickHouse DNS cache
ClickHouse caches DNS lookups. If hostnames have changed, clear the cache with SYSTEM DROP DNS CACHE.
5. Fix Kubernetes service issues
6. Verify DNS server configuration
7. Update cluster configuration
Remove non-existent hosts from the remote_servers configuration, then reload the configuration or restart the server.
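As a rough illustration of steps 1 and 7, the sketch below resolves a list of hostnames through the operating system resolver and separates those that resolve from those that do not. It is a minimal example, not part of ClickHouse: the function name resolve_hosts is hypothetical, and the OS resolver may not see exactly what the ClickHouse DNS cache sees.

```python
import socket


def resolve_hosts(hostnames, resolver=socket.getaddrinfo):
    """Return ({host: sorted IPs}, {host: error text}) for the given hostnames."""
    resolved, failed = {}, {}
    for host in hostnames:
        try:
            infos = resolver(host, None)
        except socket.gaierror as exc:
            # The OS resolver could not resolve this name; such hosts are
            # candidates for removal or correction in the cluster configuration.
            failed[host] = str(exc)
            continue
        # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
        resolved[host] = sorted({info[4][0] for info in infos})
    return resolved, failed
```

Calling resolve_hosts with the hostnames taken from your cluster configuration returns two dictionaries; any hostname in the second one should be checked for typos or removed before reloading the configuration.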
Common scenarios
Scenario 1: Kubernetes pod not ready
Cause: Pod not yet started or service endpoints not available.
Solution:
- Wait for pods to become ready
- Check pod status: kubectl get pods
- Verify headless service has endpoints: kubectl get endpoints
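Waiting for pods to become ready can be automated with a bounded retry loop. The sketch below is one possible approach, assuming a script runs before ClickHouse is started or reconfigured; the function name wait_for_dns and its default attempt count and delay are illustrative, not ClickHouse settings.

```python
import socket
import time


def wait_for_dns(host, attempts=10, delay=3.0,
                 resolver=socket.getaddrinfo, sleep=time.sleep):
    """Retry resolution until it succeeds; return True on success, False otherwise."""
    for attempt in range(1, attempts + 1):
        try:
            resolver(host, None)
            return True
        except socket.gaierror:
            # The pod's DNS record may not exist yet; pause before the next try,
            # but do not sleep after the final attempt.
            if attempt < attempts:
                sleep(delay)
    return False
```

A False result after all attempts suggests the problem is not just startup timing, so checking the pod status and service endpoints as described above is the next step.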
Scenario 2: Stale cluster configuration
Cause: Configuration references servers that have been removed.
Solution:
- Update cluster configuration to remove old hosts
- Reload configuration: SYSTEM RELOAD CONFIG
- Or restart ClickHouse server
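To find stale entries before reloading, it can help to list every hostname the cluster configuration references. The sketch below assumes the configuration is a plain XML file with host elements nested under remote_servers, without include or environment substitutions; the function name, the sample cluster name, and the sample hostnames are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration fragment used only for illustration.
SAMPLE_CONFIG = """
<clickhouse>
  <remote_servers>
    <my_cluster>
      <shard>
        <replica><host>ch-0.example.invalid</host><port>9000</port></replica>
        <replica><host>ch-1.example.invalid</host><port>9000</port></replica>
      </shard>
    </my_cluster>
  </remote_servers>
</clickhouse>
"""


def hosts_in_remote_servers(config_xml):
    """Return the sorted, de-duplicated <host> values under <remote_servers>."""
    root = ET.fromstring(config_xml)
    hosts = set()
    # iter() visits the root itself and all descendants, so this works whether
    # <remote_servers> is the root element or nested under <clickhouse>.
    for section in root.iter("remote_servers"):
        for host in section.iter("host"):
            if host.text and host.text.strip():
                hosts.add(host.text.strip())
    return sorted(hosts)
```

The returned list can then be tested with nslookup or dig, and any hostname that no longer resolves is a candidate for removal.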
Scenario 3: DNS server unreachable
Cause: DNS server is down or unreachable.
Solution:
- Check DNS server status
- Verify network connectivity
- Test DNS resolution manually: nslookup hostname
- Check /etc/resolv.conf for correct DNS servers
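The /etc/resolv.conf check can be scripted. The following sketch extracts the nameserver entries from resolv.conf-style text so they can be compared with the DNS servers you expect; the function name parse_nameservers is hypothetical, and the parsing is deliberately simple (it ignores search and options lines).

```python
def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text, in order."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        # Drop anything after a '#' or ';' comment marker, then trim whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers
```

For example, passing the contents of /etc/resolv.conf read with open("/etc/resolv.conf").read() returns the configured servers in lookup order; an empty list or an unexpected address points to incorrect DNS server configuration.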
Scenario 4: Embedded Keeper quorum issues
Cause: Keeper nodes not yet available or wrong hostname.
Solution:
- Ensure all Keeper nodes are started
- Verify Keeper configuration has correct hostnames
- Check Keeper logs for connectivity issues
Prevention tips
- Use valid hostnames: Verify hostnames exist before adding to configuration
- Test DNS resolution: Use nslookup or dig to test hostnames before configuring
- Monitor DNS health: Set up monitoring for DNS server availability
- Use DNS caching wisely: Consider DNS TTL settings for dynamic environments
- Keep configuration current: Remove decommissioned servers from cluster config
- Kubernetes readiness: Ensure pods are ready before ClickHouse tries to connect
- Use StatefulSets: In Kubernetes, use StatefulSets for predictable DNS names
Debugging steps
- Identify failing hostname
- Test DNS resolution
- Check cluster configuration
- Monitor DNS cache updates
- Check network connectivity
- Review Kubernetes events (if applicable)
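When testing DNS resolution, timing the lookup can hint at the cause: a failure that returns quickly often means the record is missing, while a failure after several seconds more often points to an unreachable or slow DNS server. This is a heuristic, not a guarantee. The sketch below measures a single lookup through the OS resolver; the function name timed_lookup is hypothetical.

```python
import socket
import time


def timed_lookup(host, resolver=socket.getaddrinfo, clock=time.monotonic):
    """Resolve host once; return (succeeded, elapsed_seconds, error_text_or_None)."""
    start = clock()
    try:
        resolver(host, None)
    except socket.gaierror as exc:
        # Report how long the failed lookup took along with the resolver's message.
        return False, clock() - start, str(exc)
    return True, clock() - start, None
```

Running it against the failing hostname from the first step, and against a hostname known to work, gives a baseline for comparison.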
Special considerations
For Kubernetes deployments:
- Headless services create DNS entries for each pod
- StatefulSet pods have predictable DNS names: pod-name-0.service-name.namespace.svc.cluster.local
- DNS may not be immediately available when pods are starting
- CoreDNS issues can affect entire cluster
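Because StatefulSet pod names follow the pattern above, the expected hostnames can be generated and compared with what the cluster configuration contains. The sketch below assumes the default cluster domain cluster.local and ordinals starting at 0; the function name and the example StatefulSet, service, and namespace names are hypothetical.

```python
def statefulset_pod_fqdns(statefulset, service, namespace, replicas,
                          cluster_domain="cluster.local"):
    """Build the per-pod DNS names a headless service gives StatefulSet pods."""
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]
```

For instance, statefulset_pod_fqdns("clickhouse", "clickhouse-headless", "db", 2) yields the names for pods clickhouse-0 and clickhouse-1 in namespace db, which can then be checked against the hosts listed in remote_servers.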
For distributed clusters:
- All nodes must be able to resolve each other's hostnames
- DNS failures on one node can affect distributed queries
- Consider using IP addresses for critical internal connections (though less flexible)
For ClickHouse Keeper:
- All Keeper nodes must be resolvable by name
- Keeper quorum formation requires DNS resolution
- Wrong hostname in Keeper config prevents cluster formation
DNS cache behavior:
- ClickHouse caches DNS lookups to reduce DNS queries
- Cache is updated periodically (default: every 15 seconds)
- Failed lookups are also cached temporarily
- SYSTEM RELOAD CONFIG forces DNS cache refresh
Configuration settings
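DNS-related settings in the ClickHouse server configuration include dns_cache_update_period (how often cached entries are refreshed, in seconds) and disable_internal_dns_cache (turns the internal DNS cache off).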
When DNS errors persist
If DNS errors continue after basic troubleshooting:
- Use IP addresses temporarily
- Add entries to /etc/hosts
- Configure alternative DNS servers
- Increase DNS timeout
  - Check system DNS resolver timeout settings
  - Consider increasing if network latency is high
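For the /etc/hosts workaround, the entries can be generated from a known hostname-to-IP mapping rather than typed by hand. The sketch below only renders the lines and validates each address; it does not modify any file. The function name hosts_file_lines and the example names and addresses are hypothetical.

```python
import ipaddress


def hosts_file_lines(mapping):
    """Render {hostname: ip} as /etc/hosts lines, validating each IP address."""
    lines = []
    for hostname, ip in sorted(mapping.items()):
        ipaddress.ip_address(ip)  # raises ValueError on a malformed address
        lines.append(f"{ip}\t{hostname}")
    return lines
```

A call such as hosts_file_lines({"ch-0.internal": "10.0.0.11"}) produces one line per host in the IP-then-name order that /etc/hosts expects. Treat such entries as temporary, since they bypass DNS and will go stale when addresses change.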
If you're experiencing this error:
- Identify which hostname is failing from error logs
- Test DNS resolution manually with nslookup or dig
- Verify the hostname exists and is spelled correctly
- Check DNS server availability and accessibility
- For Kubernetes: ensure pods are ready and service endpoints exist
- Update cluster configuration to remove non-existent hosts
- Reload ClickHouse configuration or restart server
- Monitor DNS cache updates in ClickHouse logs