Adding Data from K8s / OpenShift (Operator)

Background

Kubernetes (K8s) and OpenShift are container-orchestration systems for automating application deployment, scaling, and management.
Kubernetes was originally designed by Google and is now maintained by the Cloud Native Computing Foundation. OpenShift is developed by Red Hat and is built on Kubernetes running on a foundation of Red Hat Enterprise Linux, so for the purposes of this article the two are very similar and both are based on Kubernetes (K8s). Because K8s / OpenShift are dynamic environments that automate the deployment, scaling, and operation of application containers across clusters of hosts, the environment changes constantly. It is therefore not practical to collect data by opening connections to individual nodes/pods/containers, as these change constantly as well.

In order to get log data from such a dynamic environment, the logs must be sent from the K8s / OpenShift cluster to XpoLog in real time while it is running - whenever a new container is created in the cluster, its logs are immediately shipped to XpoLog.
The procedure requires a lightweight log forwarder that is automatically deployed and managed by the K8s / OpenShift cluster and sends the containers' logs to XpoLog for processing/monitoring.

XPLG Deployment

On the XPLG side, create a Syslog UDP/TCP or HTTP/S listener, and make sure the K8s nodes have access to the XPLG cluster (XpoLog IP/Ports).

The listener URL should be copied from the listener definition and used in the K8s / OpenShift configuration as the output (the destination to which the logs will be shipped).
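For example, a syslog listener endpoint later becomes the url of an output in the ClusterLogForwarder configuration. A minimal sketch, assuming a hypothetical host and port (use the values copied from your own listener definition):

outputs:
  - name: xplg-syslog
    type: syslog
    url: 'tls://xplg.example.com:514'   # hypothetical - copy the host/port from the XPLG listener definition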

 Operator Configuration

The Operator is an OpenShift utility that allows you to send log data to external endpoints. The data can be shipped to multiple endpoint types, such as fluentdForward, syslog, cloudwatch, kafka, etc.

The kinds of information are divided into three categories:

  • application. Container logs generated by user applications running in the cluster, except infrastructure container applications.

  • infrastructure. Container logs from pods that run in the openshift*, kube*, or default projects, and journal logs sourced from the node file system.

  • audit. Audit logs generated by the node audit system, auditd, Kubernetes API server, OpenShift API server, and OVN network.
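These exact category names are what you later reference in a pipeline's inputRefs. A short sketch (the pipeline and output names here are placeholders; full samples appear further down this page):

pipelines:
  - name: all-logs
    inputRefs:
      - application
      - infrastructure
      - audit
    outputRefs:
      - my-output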

ClusterLogForwarder:

The pipelines section under the ClusterLogForwarder is where you configure which kinds of data will be sent.

The Cluster Log Forwarder API enables you to send application, infrastructure, and audit logs to specific endpoints within or outside the cluster. A pipeline should be defined for each data source.

In the Cluster Log Forwarder definition, you must define your namespace. For this purpose, it must be named 'openshift-logging'.
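This corresponds to the metadata section of the ClusterLogForwarder resource, as shown in the full sample at the end of this page:

metadata:
  name: instance
  namespace: openshift-logging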

Outputs:

Outputs are the set of remote endpoints to which you want to ship all or part of the data.

Each output includes three parameters:

  • name - a name to describe the output.

  • type - the type of output.

  • url - the URL and port of the endpoint.

Supported Outputs:

Output types          Protocols
Amazon CloudWatch     REST over HTTPS
elasticsearch         elasticsearch
fluentdForward        fluentd forward v1
Loki                  REST over HTTP and HTTPS
kafka                 kafka 0.11
syslog                RFC-3164, RFC-5424

Output sample:

spec:
  outputs:
    - name: kafka-prod
      type: "kafka"
      url: tls://kafka.secure.com:9093/app-topic

Inputs:

The inputs section gives you the flexibility to send application/infrastructure/audit logs only from specific namespaces.

Input Sample:

inputs:
  - name: xplg-application
    application:
      namespaces:
        - xplg-project
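Multiple namespaces can be listed under the same input. A sketch of the same input extended with a second, hypothetical project name:

inputs:
  - name: xplg-application
    application:
      namespaces:
        - xplg-project
        - my-other-project   # hypothetical additional namespace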

Pipelines:

Pipelines is the section where you connect the inputs (custom namespaces, data sources) to the outputs defined above.

Each pipeline contains the following parameters:

  • name - a name to describe the pipeline.

  • inputRefs - the log type, in this example audit. Can be defined as a combination of audit/infrastructure/application/input_name.

  • outputRefs - the list of outputs to use, in this example xplg-syslog to forward to the secure XPLG instance.

  • Optional (Recommended):

    • labels - add your own labels to the record.

    • parse - controls the data structure of the shipped data. Set it to json.

Pipeline Sample:

pipelines:
  - name: syslog-audit
    inputRefs:
      - audit
    outputRefs:
      - rsyslog-audit
      - default
  - name:
    inputRefs:
      - application
      - infrastructure
    outputRefs:
      - xplg-syslog
      - default
    parse: json
    labels:
      secure: "true"
      datacenter: "east"

Full Operator Configuration sample - Forwarding logs using the syslog protocol:

apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
    - name: rsyslog-test
      type: syslog
      syslog:
        facility: local0
        rfc: RFC3164
        payloadKey: message
        severity: informational
      url: 'tls://xplg_syslog_endpoint:514'
      secret:
        name: syslog-secret
  pipelines:
    - name: xplg-syslog
      inputRefs:
        - audit
        - application
        - infrastructure
      outputRefs:
        - rsyslog-test
        - default
      parse: json
      labels:
        secure: "true"
        syslog: "test"

More details regarding OpenShift Operator integration can be found here: OpenShift Operator Integration

 

XPLG Handler + Advanced Integrations

 

The XPLG handler is a utility that retrieves information from the log records themselves and, based on it, controls the Folders and Logs structure. The handler scans each record independently, extracts the relevant information, and defines all the metadata parameters for the relevant record (logName, logPath, Server, Template, etc.).

In order to apply the OpenShift handler, please follow these steps:

  1. Create a UDP/TCP/HTTP(S) listener using the protocol over which you plan to ship the data to XPLG.

  2. From the [External_Configuration] folder, navigate to conf/accounts and open the file listenersAccounts.xml.

  3. Look for the record that specifies the details of your OpenShift listener and copy its id field.

  4. Place the attached handler file (openshift.json) in the following path: /[External_Configuration]/conf/listeners/handlers/user

  5. In the handler file itself, you may be required to change the following parameters (a hedged sketch of these fields appears after this procedure):

    5.1 listenerType - based on the protocol configured in the operator's output section, set this value to one of the following options: syslogTcpServer/syslogUdpServer/httpListener.

    5.2 listenerAccountInfoId - set it to the ID you copied in step 3.

    5.3 columnName: "[Source Device for TCP/UDP, Remote Host for HTTP]" - based on the protocol configured in the operator's output section.

  6. Place the customMessage.js file under the following path: [XPLG_External_Conf]/conf/ext/scripts/listenerHandlerExtraction/user

  7. Open the XPLG UI in a browser and, from the left navigation panel, navigate to Data->Patterns. Scroll down, press the 'Import Template' button, and choose the file named templates.zip.

  8. Restart the XpoLog service.

  9. Once the handler is applied successfully and the operator is configured to send data to the XpoLog listeners, the folders and logs tree below the relevant listener should be created, similar to the attached structure.
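For reference, here is a minimal sketch of how the fields described in step 5 might look inside openshift.json. Only those parameters are shown, with placeholder values; the actual handler file attached to this page may contain additional fields and a different structure:

{
  "listenerType": "syslogTcpServer",
  "listenerAccountInfoId": "<id copied from listenersAccounts.xml in step 3>",
  "columnName": "Source Device"
}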

  • Monitors - As part of the OpenShift integration, a set of monitors is suggested for each log_type. Errors and streaming anomalies, top errors, and more are suggested as part of this package. In order to import it: from the left navigation panel, navigate to Monitors and Tasks->Monitors, press Actions->Import, and choose the Monitors.zip file.

  • Applications - As part of the OpenShift integration, a set of applications is suggested for each log_type. Active Hosts, Namespaces and Containers, traffic over time, common errors, anomalies, and more are suggested as part of this package. In order to import it: at the top application bar, press the Applications button and then press the Import App Conf button. This procedure should be performed three times, once per log_type. The relevant files are attached below.
