Kubernetes
See the Kubernetes Helm chart that provides a pre-configured Scraper and Topology with some common defaults.
The kubernetes scraper collects all of the resources and events in a Kubernetes cluster, and then watches for changes.
kubernetes-scraper.yaml
apiVersion: configs.flanksource.com/v1
kind: ScrapeConfig
metadata:
  name: kubernetes-scraper
spec:
  retention:
    changes:
      - name: PodCrashLooping
        count: 10
        age: 72h
  kubernetes:
    - clusterName: local-kind-cluster
      transform:
        relationship:
          # Link a service to a deployment (adjust the label selector accordingly)
          - filter: config_type == "Kubernetes::Service"
            type:
              value: 'Kubernetes::Deployment'
            name:
              expr: |
                has(config.spec.selector) && has(config.spec.selector.name) ? config.spec.selector.name : ''
          # Link Pods to PVCs
          - filter: config_type == 'Kubernetes::Pod'
            expr: |
              config.spec.volumes.
                filter(item, has(item.persistentVolumeClaim)).
                map(item, {"type": "Kubernetes::PersistentVolumeClaim", "name": item.persistentVolumeClaim.claimName}).
                toJSON()
          # Link Argo Application to the resources
          - filter: config_type == "Kubernetes::Application" && config.apiVersion == "argoproj.io/v1alpha1"
            expr: |
              config.status.resources.map(item, {
                "type": "Kubernetes::" + item.kind,
                "name": item.name,
                "labels": {
                  "namespace": item.namespace,
                },
              }).toJSON()
        mask:
          - selector: |
              has(config.kind) ? config.kind == 'Certificate' : false
            jsonpath: .spec.dnsNames
            value: md5sum
          - selector: 'config_type == "Kubernetes::Certificate"'
            jsonpath: .spec.commonName
            value: md5sum
          - selector: config_class == 'Connection'
            jsonpath: "$..['password','bearer','clientSecret','personalAccessToken','certificate','secretKey','token'].value"
            value: '******'
        exclude:
          - types:
              - Kubernetes::*
            jsonpath: '.metadata.ownerReferences'
          - types:
              - Kubernetes::Pod
            jsonpath: '.metadata.generateName'
        changes:
          mapping:
            - filter: >
                change.change_type == 'diff' && change.summary == "status.containerStatuses" &&
                patch != null && has(patch.status) && has(patch.status.containerStatuses) &&
                patch.status.containerStatuses.size() > 0 &&
                has(patch.status.containerStatuses[0].restartCount)
              type: PodCrashLooping
            - filter: >
                change.change_type == 'diff' &&
                jq('.status.conditions[]? | select(.type == "Healthy").message', patch).contains('Health check passed')
              type: HealthCheckPassed
          exclude:
            - 'config_type == "Kubernetes::Endpoints" && details.message == "metadata.annotations.endpoints.kubernetes.io/last-change-trigger-time"'
            - 'config_type == "Kubernetes::Node" && has(details.message) && details.message == "status.images"'
            - 'details.source.component == "canary-checker" && (change_type == "Failed" || change_type == "Pass")'
            - >
              change_type == "diff" && summary == "status.reconciledAt" &&
              config != null &&
              has(config.apiVersion) && config.apiVersion == "argoproj.io/v1alpha1" &&
              has(config.kind) && config.kind == "Application"
      properties:
        - filter: 'config_type == "Kubernetes::Pod"'
          name: Logs
          icon: opensearch
          links:
            - text: opensearch
              url: https://opensearch.svc/_dashboards/app/discover#/?_a=(query:(language:kuery,query:'kubernetes_pod_id:{{.id}}'))
        - filter: 'config_type == "Kubernetes::Node"'
          name: Grafana
          icon: grafana
          links:
            - text: grafana
              url: https://grafana.svc/d/85a562078cdf77779eaa1add43ccec1e/kubernetes-compute-resources-namespace-pods?var-namespace={{.name}}
      exclusions:
        name:
          - junit*
          - k6-junit*
          - newman-junit*
          - playwright-junit-*
          - hello-world*
        namespace:
          - canaries
          - monitoring
        kind:
          - Secret
          - ReplicaSet
          - APIService
          - PodMetrics
          - NodeMetrics
          - endpoints.discovery.k8s.io
          - endpointslices.discovery.k8s.io
          - leases.coordination.k8s.io
          - podmetrics.metrics.k8s.io
          - nodemetrics.metrics.k8s.io
          - customresourcedefinition
          - controllerrevision
          - certificaterequest
          - orders.acme.cert-manager.io
        labels:
          canary-checker.flanksource.com/generated: 'true'
      relationships:
        - kind:
            expr: "has(spec.claimRef) ? spec.claimRef.kind : ''"
          name:
            expr: "has(spec.claimRef) ? spec.claimRef.name : ''"
          namespace:
            expr: "has(spec.claimRef) ? spec.claimRef.namespace : ''"
        - kind:
            value: Kustomization
          name:
            label: kustomize.toolkit.fluxcd.io/name
          namespace:
            label: kustomize.toolkit.fluxcd.io/namespace
        - kind:
            value: HelmRelease
          name:
            label: helm.toolkit.fluxcd.io/name
          namespace:
            label: helm.toolkit.fluxcd.io/namespace
        # FluxCD Git relationships
        - name:
            expr: "has(spec.sourceRef) ? spec.sourceRef.name : '' "
          namespace:
            expr: "has(spec.sourceRef) && has(spec.sourceRef.namespace) ? spec.sourceRef.namespace : metadata.namespace "
          kind:
            value: "GitRepository"
      event:
        exclusions:
          reason:
            - SuccessfulCreate
            - Created
            - DNSConfigForming
        severityKeywords:
          error:
            - failed
            - error
          warn:
            - backoff
            - nodeoutofmemory
Field | Description | Scheme |
---|---|---|
logLevel | Specify the level of logging. | string |
schedule | Specify the interval to scrape in cron format. Defaults to every 15 minutes. | string |
retention | Settings for retaining changes, analysis and scraped items | Retention |
kubernetes | Specifies the list of Kubernetes configurations to scrape. | []Kubernetes |
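As a rough sketch of how the top-level fields in the table above fit together, the fragment below sets a log level and a cron schedule next to a single cluster entry. The scraper name, the debug level and the cluster name are placeholder assumptions, not values taken from this page.

```yaml
apiVersion: configs.flanksource.com/v1
kind: ScrapeConfig
metadata:
  name: example-scraper # hypothetical name
spec:
  logLevel: debug # assumed value; check which levels your version accepts
  schedule: '*/30 * * * *' # cron format: every 30 minutes
  kubernetes:
    - clusterName: example-cluster # hypothetical cluster name
```

Leaving schedule out falls back to the default of every 15 minutes described in the table.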
Kubernetes
Field | Description | Scheme |
---|---|---|
clusterName* | A unique name for the cluster config. | string |
event | Specify configuration to handle Kubernetes events | |
exclusions | Resources to be excluded from scraping | |
fieldSelector | Resources to be included e.g | string |
kubeconfig | Kubeconfig to connect to the cluster | EnvVar |
namespace | Include resources only from this namespace | string |
relationships | Create relationships between kubernetes objects | |
scope | Specify scope for scrape. Allowed: | string |
selector | Include resources matching this selector only e.g | string |
since | Set time constraint for scraping resources within the set period | |
watch | List of resources to watch for real-time changes | |
labels | Labels for each config item. | |
properties | Custom templatable properties for the scraped config items. | |
tags | Tags for each config item. Max allowed: 5 | |
transform | Transform configs after they've been scraped |
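To illustrate the filtering fields in the table above, here is a minimal sketch that narrows a scrape to one namespace and one label selector. The namespace and label are made up for illustration, and the selector is assumed to use the standard Kubernetes label selector syntax.

```yaml
spec:
  kubernetes:
    - clusterName: example-cluster # hypothetical cluster name
      # Only resources in this namespace are scraped
      namespace: payments # hypothetical namespace
      # Assumed to follow Kubernetes label selector syntax (key=value)
      selector: app.kubernetes.io/part-of=checkout # hypothetical label
```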
Event
Kubernetes::Event resources are mapped to config changes. Events can be verbose, so they can be excluded or have their severity level changed:
spec:
  kubernetes:
    - event:
        exclusions:
          reason:
            - SuccessfulCreate
            - Created
            - DNSConfigForming
        severityKeywords:
          error:
            - failed
            - error
          warn:
            - backoff
            - nodeoutofmemory
Field | Description | Scheme | Required |
---|---|---|---|
exclusions | A list of keywords used to exclude event objects based on the reason | []string | |
severityKeywords | Specify keywords used to identify the severity of the Kubernetes Event based on the reason | SeverityKeywords |
SeverityKeywords
Field | Description | Scheme | Required |
---|---|---|---|
warn | A list of keywords used to identify a warning severity from the reason. It could also be a match pattern: e.g. * to match all or !badword to exclude badword | []string | |
error | Same as warn but used to map to error severity. | []string |
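The match patterns mentioned for warn can be sketched as follows. This is an illustration only: the probe keyword is hypothetical, and how a reason that matches both lists is resolved is not specified here. Note that patterns starting with * or ! need quoting, because both characters have special meaning in YAML.

```yaml
spec:
  kubernetes:
    - clusterName: example-cluster # hypothetical cluster name
      event:
        severityKeywords:
          error:
            - failed
            - '!probe' # hypothetical: keep reasons matching "probe" out of error
          warn:
            - backoff
            - nodeoutofmemory
```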
Watch Events & Resources
While the kubernetes scraper runs on the schedule you've specified, it can also watch for changes to resources and events in real time. This allows near-real-time updates to your Kubernetes catalog, with the flexibility of performing a full scrape at a longer interval.
This feature is enabled by default but can be disabled by setting the property watch.disable=true.
Kubernetes events automatically trigger a re-scrape of involved objects, so even though not all resources are watched by default, the vast majority of changes still reflect in real-time due to associated events that fire at the same time as the update.
Watch Selector
custom-watch-resources.yaml
kubernetes:
  - clusterName: 'eks'
    watch:
      - apiVersion: v1
        kind: Pod
      - apiVersion: apps/v1
        kind: Deployment
      - apiVersion: batch/v1
        kind: CronJob
Field | Description | Scheme |
---|---|---|
apiVersion* | API version of the Kubernetes resource to watch | string |
kind* | Kind of the Kubernetes resource to watch | string |
The following resource types are "watched" by default.
apiVersion | kind |
---|---|
apps/v1 | DaemonSet |
apps/v1 | Deployment |
apps/v1 | ReplicaSet |
apps/v1 | StatefulSet |
batch/v1 | CronJob |
batch/v1 | Job |
v1 | Node |
v1 | Pod |
Relationships
You can create relationships between Kubernetes objects on the basis of kind, name and namespace.
Relationships can also be defined under transform.relationships; however, defining them under kubernetes.relationships is simpler, with specific support for the kind, name and namespace fields.
kubernetes-relationship.yaml
kubernetes:
  - clusterName: 'eks'
    relationships:
      # If the object has a spec.claimRef field, use its kind, name and namespace
      - kind:
          expr: "has(spec.claimRef) ? spec.claimRef.kind : ''"
        name:
          expr: "has(spec.claimRef) ? spec.claimRef.name : ''"
        namespace:
          expr: "has(spec.claimRef) ? spec.claimRef.namespace : ''"
      # If the object has Flux kustomize labels, link it to the parent Kustomization object
      - kind:
          value: Kustomization
        name:
          label: kustomize.toolkit.fluxcd.io/name
        namespace:
          label: kustomize.toolkit.fluxcd.io/namespace
      # If the object has Flux helm labels, link it to the parent HelmRelease object
      - kind:
          value: HelmRelease
        name:
          label: helm.toolkit.fluxcd.io/name
        namespace:
          label: helm.toolkit.fluxcd.io/namespace
Field | Description | Scheme | Required |
---|---|---|---|
kind | kind of Kubernetes Object | Lookup | true |
name | name of Kubernetes Object | Lookup | true |
namespace | namespace of Kubernetes Object | Lookup | true |
Lookup
There are 3 different ways to specify which value to use when finding related configs:
Field | Description | Scheme | Required |
---|---|---|---|
expr | Use an expression to get the value | string | |
value | Specify a static value | string | |
label | Get the value from a label | string |
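The three lookup styles can be mixed within one relationship. The sketch below is an assumption-laden illustration: it links each object to a Deployment whose name is read from a label, in the object's own namespace. The label key and cluster name are hypothetical.

```yaml
kubernetes:
  - clusterName: example-cluster # hypothetical cluster name
    relationships:
      - kind:
          value: Deployment # static value
        name:
          label: app.kubernetes.io/instance # label chosen for illustration
        namespace:
          expr: metadata.namespace # expression evaluated against the object
```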
Exclusion
Excludes certain Kubernetes objects from being scraped.
Field | Description | Scheme |
---|---|---|
kinds | kinds of kubernetes objects to exclude | |
labels | labels of kubernetes objects to exclude | |
names | names of kubernetes objects to exclude | |
namespaces | namespaces of kubernetes objects to exclude |
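A small exclusion block might look like the sketch below. The key spelling follows the full scraper example at the top of this page (name, namespace, kind, labels); if your version expects the plural names listed in the table, adjust accordingly. The name pattern and label are hypothetical.

```yaml
kubernetes:
  - clusterName: example-cluster # hypothetical cluster name
    exclusions:
      name:
        - tmp-* # hypothetical name pattern
      namespace:
        - monitoring
      kind:
        - Secret
        - ReplicaSet
      labels:
        example.com/skip-scrape: 'true' # hypothetical label
```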
Annotations
Annotations on Kubernetes resources can direct the scraper to behave in certain ways.
Field | Description | Scheme |
---|---|---|
config-db.flanksource.com/ignore | Ignore config items | bool |
config-db.flanksource.com/ignore-change-severity | Ignore changes by severity | |
config-db.flanksource.com/ignore-changes | Ignore changes by change type | |
config-db.flanksource.com/tags | Add custom tags. Config items can only have up to | |
Examples
Excluding verbose changes from an Argo application
argo-application.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sock-shop
  namespace: argo
  annotations:
    config-db.flanksource.com/ignore-changes: ReconciliationSucceeded
    config-db.flanksource.com/ignore-change-severity: low
spec: ...
Excluding a particular secret from being scraped
secret.yaml
apiVersion: v1
kind: Secret
metadata:
  annotations:
    # Annotation values must be strings, so the boolean is quoted
    config-db.flanksource.com/ignore: 'true'
  name: slack
  namespace: default
type: Opaque
data:
  token: ...
Scraping remote clusters
A single config-db instance can scrape multiple clusters when provided with a kubeconfig. Either the kubeconfig itself or the path to the kubeconfig can be provided.
Local path to kubeconfig
remote-cluster.yaml
apiVersion: configs.flanksource.com/v1
kind: ScrapeConfig
metadata:
  name: azure-scraper
spec:
  schedule: '@every 5h'
  kubernetes:
    - clusterName: 'azure production cluster'
      kubeconfig:
        value: /home/flanksource/.kube/azure_config
Kubeconfig
remote-cluster.yaml
apiVersion: configs.flanksource.com/v1
kind: ScrapeConfig
metadata:
  name: aws-scraper
spec:
  schedule: '@every 5h'
  kubernetes:
    - clusterName: 'aws cluster'
      kubeconfig:
        value: |
          apiVersion: v1
          clusters:
            - cluster:
                certificate-authority-data: xxxxx
                server: https://xxxxx.sk1.eu-west-1.eks.amazonaws.com
              name: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
          contexts:
            - context:
                cluster: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
                namespace: mission-control
                user: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
              name: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
          current-context: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
          kind: Config
          preferences: {}
          users:
            - name: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
              user:
                exec:
                  ....
Alternatively, a kubeconfig stored in a secret can be referenced as follows:
remote-cluster.yaml
apiVersion: configs.flanksource.com/v1
kind: ScrapeConfig
metadata:
  name: aws-scraper
spec:
  schedule: '@every 5h'
  kubernetes:
    - clusterName: 'aws cluster'
      kubeconfig:
        valueFrom:
          secretKeyRef:
            name: aws-kubeconfig
            key: kubeconfig
Performance
The scraper relies heavily on the performance of the Kubernetes API server, so it is recommended to run the scraper from within the cluster or as close as possible to the control plane.
It is possible to overload the API server with too many requests. To reduce the load on the API server:
- Decentralize the scraper by running it on an agent inside each cluster rather than remotely.
- Increase the schedule to 1h or more; real-time updates will still be recorded by Kubernetes events and informers.
- Filter out and exclude resources and events that have high churn or verbosity.
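The last two suggestions above could be combined roughly as in this sketch, which lengthens the schedule and drops some high-churn kinds and event reasons. The scraper and cluster names are hypothetical, and the exclusion entries are borrowed from the examples earlier on this page rather than being a recommended set.

```yaml
apiVersion: configs.flanksource.com/v1
kind: ScrapeConfig
metadata:
  name: low-churn-scraper # hypothetical name
spec:
  # Full scrape once an hour; watches and events cover the time in between
  schedule: '@every 1h'
  kubernetes:
    - clusterName: example-cluster # hypothetical cluster name
      exclusions:
        kind:
          - ReplicaSet
          - PodMetrics
          - NodeMetrics
          - endpointslices.discovery.k8s.io
          - leases.coordination.k8s.io
      event:
        exclusions:
          reason:
            - SuccessfulCreate
            - Created
```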