XpoLog deployment on AWS EKS with external configuration on EFS
What is Amazon EKS?
Amazon Elastic Kubernetes Service (EKS) is AWS's managed service that makes it easy to run, manage, and scale containerized applications using Kubernetes on the AWS cloud.
Think of it like this: Kubernetes is a powerful but complex engine for orchestrating containers. Instead of building, securing, and maintaining the most complicated parts of that engine yourself, EKS provides a fully managed, highly available, and secure Kubernetes control plane as a service.
Key Benefits of EKS
Managed Control Plane: This is the core advantage. AWS automatically manages the availability, scalability, and patching of the Kubernetes control plane components (such as etcd and the API server). This frees you from significant operational overhead and lets you focus on your applications.
High Availability: The EKS control plane is distributed across multiple AWS Availability Zones (AZs), eliminating any single point of failure and ensuring your cluster's brain is always running.
Seamless AWS Integration: EKS is deeply integrated with the AWS ecosystem. It works effortlessly with services like:
IAM for secure authentication and authorization.
VPC for isolated and secure networking.
Elastic Load Balancers (ALB/NLB) for exposing your services.
EFS & EBS for persistent storage solutions.
Pure Kubernetes Experience: EKS runs upstream, certified Kubernetes. This means you get a standard, community-tested experience, and any tools or add-ons that work with Kubernetes will work with EKS.
How It Works: Control Plane vs. Worker Nodes
An EKS cluster is primarily composed of two parts:
The EKS Control Plane: Managed entirely by AWS. You don't see the underlying instances, but you interact with them through the Kubernetes API (e.g., using kubectl).
Worker Nodes: These are the EC2 instances where your application containers (Pods) actually run. You provision and manage these nodes within your VPC and are responsible for them. They register themselves with the control plane to form the complete cluster.
Why EKS Matters for This Guide
EKS provides the robust, production-grade Kubernetes environment where our applications will run and generate logs. We will deploy Fluent Bit as a DaemonSet across all the worker nodes in our EKS cluster. Fluent Bit's task is to reliably collect logs from every application on every node and forward them to a central location. By using EKS, we start with a secure and scalable foundation for our entire logging pipeline.
Create New Cluster
Open your AWS GUI
Search for EKS:
Press “Create cluster”
The EKS Cluster IAM Role: Your Cluster's AWS Passport
When you create an EKS cluster, you're asked to select a "Cluster IAM role." This is one of the most important configuration steps as it defines the permissions your cluster has to interact with other AWS services.
In simple terms, this IAM role is what the Kubernetes control plane uses to make AWS API calls on your behalf.
Why is it Necessary?
Think of the EKS control plane as a manager hired by you but living in a separate AWS-managed building. This manager (the control plane) needs a set of keys (the IAM role) to access and manage resources within your building (your AWS account and VPC).
Without this role, the control plane would be isolated and unable to perform essential tasks, such as:
Networking: Creating and managing Elastic Network Interfaces (ENIs) in your VPC subnets for pod networking.
Load Balancing: Provisioning and configuring Application or Network Load Balancers when you create a Kubernetes Service of type LoadBalancer.
Storage: Interacting with services like EBS when creating PersistentVolumes.
This role acts as a secure "passport," granting the EKS service just enough permission to manage these resources without giving it full access to your entire AWS account.
What Permissions Does It Need?
You don't have to figure out the permissions yourself. AWS provides a managed policy specifically for this purpose called AmazonEKSClusterPolicy. This policy contains all the necessary permissions (ec2:CreateNetworkInterface, elasticloadbalancing:RegisterTargets, etc.) that the control plane requires to function correctly.
When you create the cluster using the AWS Management Console, it will often guide you to create a new role and will automatically attach this policy for you.
Key Takeaway
The Cluster IAM Role is the security link between the AWS-managed Kubernetes control plane and the resources running in your own AWS account. You are granting the EKS service explicit permission to manage cluster-related resources on your behalf.
Creating the EKS Cluster IAM Role
You will create a new IAM role that the EKS service can assume. The AWS console simplifies this process by pre-selecting the correct trust relationship and permissions policy for you.
Here are the step-by-step instructions:
Navigate to the IAM Console in your AWS account (or press "Create recommended role" in the EKS wizard).
On the left-hand navigation pane, click on Roles, then click the "Create role" button.
Step 1: Select Trusted Entity
For "Trusted entity type," choose AWS service.
Under "Use case," select EKS from the dropdown menu.
This will reveal another option below. Choose EKS - Auto Cluster.
Click Next.
Step 2: Add Permissions
The console will automatically select the required permissions policy:
AmazonEKSClusterPolicy. You don't need to do anything else on this screen. Simply click Next.
Step 3: Name, Review, and Create
Role name: Give your role a descriptive name that you will easily recognize, for example my-eks-cluster-role or EKSClusterRoleForGuide.
Review the details to ensure the trusted entity is eks.amazonaws.com and the attached policy is AmazonEKSClusterPolicy.
Click the "Create role" button at the bottom.
The Role to Select
Now, when you are creating your EKS cluster and you get to the "Cluster IAM role" dropdown menu, you will select the role you just created (e.g., my-eks-cluster-role).
This explicitly grants the EKS control plane the permissions defined in the AmazonEKSClusterPolicy to manage resources within your account.
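If you prefer the CLI over the console, the same role can be created with a trust policy and two commands. This is a sketch, not the guide's canonical path: the role name my-eks-cluster-role is just the example used above.

```shell
# Trust policy letting the EKS service assume the role
cat > eks-trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "eks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

# Create the role with that trust relationship
aws iam create-role \
  --role-name my-eks-cluster-role \
  --assume-role-policy-document file://eks-trust.json

# Attach the managed policy the control plane needs
aws iam attach-role-policy \
  --role-name my-eks-cluster-role \
  --policy-arn arn:aws:iam::aws:policy/AmazonEKSClusterPolicy
```

The result is the same role the console wizard produces: trusted entity eks.amazonaws.com, policy AmazonEKSClusterPolicy.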
After the Cluster Role, the next critical component is the Node IAM Role.
The Node IAM Role: The Worker's Toolkit
While the Cluster Role is for the EKS control plane (the manager), the Node IAM Role is attached to each of your EC2 worker nodes. This role grants the necessary permissions for the nodes themselves to function correctly within the cluster and interact with other AWS services.
Think of this as the toolkit you give to each individual worker on your team. Each worker node needs this toolkit to perform its core job.
Why is it Necessary?
Your worker nodes are not just passive machines; they are active participants in the Kubernetes cluster. The kubelet (the primary "node agent") running on each node, and the pods scheduled on them, need permissions to:
Join the Cluster: A node needs permission to communicate with the EKS control plane to register itself and receive workloads.
Pull Container Images: To run your applications, the nodes must have permission to pull container images from Amazon ECR (Elastic Container Registry).
Manage Networking: The AWS VPC CNI plugin, which handles pod networking, runs on each node and needs permissions to manage network interfaces.
Access Other AWS Services: If a pod on a node needs to access an S3 bucket or a DynamoDB table, it will (by default) inherit permissions from this role.
How to Create the Node IAM Role
The creation process is similar to the Cluster Role, but with a different trusted entity and different policies.
Navigate to the IAM Console (or press "Create recommended role"), go to Roles, and click "Create role".
Step 1: Select Trusted Entity
For "Trusted entity type," choose AWS service.
Under "Use case," select EC2. This is because your worker nodes are EC2 instances.
Click Next.
Step 2: Add Permissions
In the search bar, find and attach the following three AWS managed policies. You must attach all of them:
AmazonEKSWorkerNodePolicy: provides the worker nodes with the minimum permissions needed to communicate with the EKS control plane.
AmazonEKS_CNI_Policy: the Amazon VPC CNI plugin (the aws-node DaemonSet) is what allows Kubernetes pods in EKS to get IP addresses from your VPC subnets and communicate with other resources (pods, services, and AWS infrastructure).
AmazonEC2ContainerRegistryReadOnly: allows nodes to pull images from ECR.
Click Next.
Step 3: Name, Review, and Create
Role name: Give it a clear name, such as my-eks-node-role.
Review the configuration to ensure the trusted entity is ec2.amazonaws.com and the three required policies are attached.
Click "Create role".
Where This Role is Used
You will select this role (my-eks-node-role) later in the EKS setup process, specifically when you create a Node Group for your cluster. Assigning this role to the node group ensures that every EC2 instance launched within it has the correct permissions to operate as a functional EKS worker node.
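As with the cluster role, the node role can also be created from the CLI. A sketch (the role name my-eks-node-role is the example from above; the three policies are the ones listed in Step 2):

```shell
# Trust policy letting EC2 instances assume the role
cat > node-trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

aws iam create-role \
  --role-name my-eks-node-role \
  --assume-role-policy-document file://node-trust.json

# Attach all three required managed policies
for POLICY in AmazonEKSWorkerNodePolicy AmazonEKS_CNI_Policy AmazonEC2ContainerRegistryReadOnly; do
  aws iam attach-role-policy \
    --role-name my-eks-node-role \
    --policy-arn "arn:aws:iam::aws:policy/$POLICY"
done
```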
Choosing Your Cluster's Network: The VPC
At this step, you're defining the private network space where your entire EKS cluster will live. VPC stands for Virtual Private Cloud, and you can think of it as your own logically isolated, fenced-off area within the vast AWS cloud. All your cluster's resources—the worker nodes, the pods, and the internal load balancers—will be launched inside this VPC.
Key Requirements for an EKS VPC
For an EKS cluster to be resilient and function correctly, the VPC you select must meet a few critical requirements:
Multiple Subnets: The VPC must have at least two subnets.
Multiple Availability Zones (AZs): Crucially, these subnets must be in different Availability Zones. An AZ is a distinct data center within an AWS region. Spanning your cluster across multiple AZs ensures high availability, so if one data center has an issue, your cluster can continue running in another.
Public and Private Subnets: A production-ready setup includes both public and private subnets:
Public Subnets: These are for internet-facing resources, primarily your public-facing load balancers. They have a direct route to an AWS Internet Gateway.
Private Subnets: This is where your worker nodes should live for security. They don't have public IP addresses and can access the internet securely through a NAT Gateway that resides in a public subnet.
Your Options
You have two main choices on the EKS creation screen:
Use an Existing VPC: If you already have a VPC configured that meets the requirements above, you can select it. This is common in established AWS environments.
Let AWS Create a New VPC: For this guide, and for anyone new to EKS, this is the highly recommended option. AWS provides a CloudFormation template that automatically creates a new VPC perfectly configured for EKS. It will set up the public and private subnets across multiple AZs, create the necessary route tables, and provision an Internet Gateway and NAT Gateways for you.
For this guide, select the default VPC or follow the prompts to have AWS create a new VPC for you. This will prevent common networking issues and ensure your cluster is built on a solid, secure, and highly available foundation.
Subnets:
Leave all the subnets selected just as they are.
For EKS to function correctly, it needs to be aware of all the available subnets in its VPC. Deselecting any of them could lead to issues with networking, load balancing, or node placement. Simply accept the default selection and proceed to the next step.
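If you want to double-check the subnet layout before creating the cluster, you can list every subnet in the VPC together with its Availability Zone. A sketch (the VPC ID is the example used later in this guide; substitute your own):

```shell
VPC_ID=vpc-00150a1cd684700d9   # replace with your VPC ID

# One row per subnet: ID, AZ, and whether it auto-assigns public IPs
# (public subnets typically show MapPublicIpOnLaunch = True)
aws ec2 describe-subnets \
  --filters "Name=vpc-id,Values=$VPC_ID" \
  --query 'Subnets[].[SubnetId,AvailabilityZone,MapPublicIpOnLaunch]' \
  --output table
```

You should see at least two different Availability Zones in the output, matching the multi-AZ requirement above.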
Press Create and wait for status “Active”
Connect your terminal to the cloud and new cluster:
Open your terminal and run:

aws configure

You will be prompted for your AWS Access Key ID and AWS Secret Access Key. To find or create them in the AWS Management Console:
Sign in to the AWS Console.
Navigate to IAM (under “Security, Identity, & Compliance”).
In the left sidebar, click Users, then select your user name.
Go to the Security credentials tab.
Under Access keys, you’ll see your existing Access Key IDs (but you can only view the ID, not the secret).
If you need a new key, click Create access key, give it a name/description, and you’ll be shown both the Access Key ID and the Secret Access Key one time.
Connect kubectl to EKS:
aws eks update-kubeconfig --region <cluster_region> --name <cluster_name>

For example:

aws eks update-kubeconfig --region eu-north-1 --name andrey-test-fb

Optional: Set environment variables:

export CLUSTER="andrey-test-fb"
export REGION="eu-north-1"
export ACCOUNT_ID="655536767854"

Deploy test App (Apache)
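Before deploying anything, it's worth a quick sanity check that kubectl is pointed at the new cluster. A sketch, assuming the example cluster name, region, and account ID shown above:

```shell
CLUSTER="andrey-test-fb"
REGION="eu-north-1"
ACCOUNT_ID="655536767854"

# update-kubeconfig names the context after the cluster ARN:
echo "expected context: arn:aws:eks:$REGION:$ACCOUNT_ID:cluster/$CLUSTER"

# Confirm which AWS identity the CLI is using
aws sts get-caller-identity

# The current kubectl context should match the ARN printed above
kubectl config current-context

# Worker nodes should appear as Ready once the node group is active
kubectl get nodes
```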
nano apache-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: apache-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: apache
  template:
    metadata:
      labels:
        app: apache
    spec:
      containers:
        - name: apache
          image: httpd:2.4
          ports:
            - containerPort: 80

kubectl apply -f apache-deployment.yaml

Create Namespace
kubectl create namespace logging

Stand up a shared, ReadWriteMany Amazon EFS volume in EKS
0) What you’ll need (inputs)
An existing EKS cluster and kubectl/helm access.
IDs for your VPC and private subnets where the worker nodes run.
The security group (SG) of your node group(s).
AWS CLI configured for the right account/region.
EFS must sit in the same VPC as your nodes, with mount targets in each AZ where nodes run. NFS (port 2049/TCP) must be allowed from node SGs → EFS mount targets.
1) Create the EFS file system + networking
# ---------- change these ----------
export AWS_REGION=eu-north-1
export VPC_ID=vpc-00150a1cd684700d9
export SUBNET_IDS="subnet-00ab400cacdce9d40 subnet-0a2f471dec4cb6d70 subnet-0be575b8b3138c576" # private subnets with nodes (1 per AZ used)
export NODE_SG_ID=sg-0d258d04e8ab37139 # SG attached to your node group
# ----------------------------------
# 1.1 Security group for EFS mount targets (allows NFS from nodes)
EFS_SG_ID=$(aws ec2 create-security-group \
--group-name eks-efs-sg --description "NFS from EKS nodes" \
--vpc-id $VPC_ID --query GroupId --output text --region $AWS_REGION)
aws ec2 authorize-security-group-ingress --group-id $EFS_SG_ID \
--protocol tcp --port 2049 --source-group $NODE_SG_ID --region $AWS_REGION
# 1.2 Create the EFS file system (encrypted at rest; pick a KMS key if you have one)
FS_ID=$(aws efs create-file-system \
--performance-mode generalPurpose \
--encrypted \
--region $AWS_REGION \
--query FileSystemId --output text)
echo "EFS=$FS_ID"

Wait until EFS is available. Check status:

aws efs describe-file-systems --file-system-id $FS_ID --region $AWS_REGION \
  --query 'FileSystems[*].LifeCycleState'

It must return: "available"

Create a mount target in each subnet used by the nodes:
for sn in $SUBNET_IDS; do
aws efs create-mount-target \
--file-system-id $FS_ID \
--subnet-id $sn \
--security-groups $EFS_SG_ID \
--region $AWS_REGION
done

Why this matters: EFS mount targets must allow inbound NFS/2049 from your node SGs. Without this, the CSI driver will hang on mount.
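Mount targets take a minute or two to come up. A small polling loop (a sketch, reusing the FS_ID and AWS_REGION variables set above) saves you from trying to mount too early:

```shell
# Poll (up to ~5 minutes) until every mount target reports "available"
for i in $(seq 1 30); do
  STATES=$(aws efs describe-mount-targets \
    --file-system-id "$FS_ID" --region "$AWS_REGION" \
    --query 'MountTargets[].LifeCycleState' --output text)
  echo "attempt $i: mount target states: $STATES"
  case "$STATES" in
    *creating*|*updating*|"") sleep 10 ;;   # still provisioning, wait
    *) break ;;                             # all targets available
  esac
done
```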
2) Install the Amazon EFS CSI driver as an EKS add-on (recommended)
The driver needs IAM permissions to manage EFS access points for dynamic provisioning. The managed policy is AmazonEFSCSIDriverPolicy. We’ll use EKS Pod Identity (or use IRSA if you prefer).
2.1 Create Pod Identity association (eksctl)
export CLUSTER=andreyXpologTest
export AWS_REGION=eu-north-1
export ROLE_NAME=AmazonEKS_EFS_CSI_DriverRole
# make sure Pod Identity Agent addon exists (required for associations)
aws eks describe-addon --cluster-name "$CLUSTER" \
--addon-name eks-pod-identity-agent --region "$AWS_REGION" >/dev/null 2>&1 \
|| aws eks create-addon --cluster-name "$CLUSTER" \
--addon-name eks-pod-identity-agent --region "$AWS_REGION"
#create the association with the correct policy ARN
eksctl create podidentityassociation \
--cluster "$CLUSTER" \
--namespace kube-system \
--service-account-name efs-csi-controller-sa \
--role-name "$ROLE_NAME" \
--permission-policy-arns arn:aws:iam::aws:policy/service-role/AmazonEFSCSIDriverPolicy
2.2 Install the add-on
aws eks create-addon \
--cluster-name $CLUSTER \
--addon-name aws-efs-csi-driver \
--region $AWS_REGION
The EFS driver add-on is the AWS-supported path and keeps versions aligned with your cluster. Pod Identity (or IRSA) ensures the controller has exactly the permissions it needs.
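Before moving on, you can confirm the add-on finished installing. A sketch (statuses other than ACTIVE usually mean it is still CREATING):

```shell
# The add-on should report ACTIVE when it is ready
STATUS=$(aws eks describe-addon \
  --cluster-name "$CLUSTER" \
  --addon-name aws-efs-csi-driver \
  --region "$AWS_REGION" \
  --query 'addon.status' --output text)
echo "aws-efs-csi-driver status: $STATUS"

# The driver pods should also appear in kube-system
kubectl -n kube-system get pods -l app.kubernetes.io/name=aws-efs-csi-driver
```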
3) Namespace + StorageClass + PVC
nano storage.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: logging
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-0107a1b4c87fb0213   # your EFS FS ID
  basePath: "/logging"
  directoryPerms: "0750"
  gidRangeStart: "1000"
  gidRangeEnd: "2000"
  ensureUniqueDirectory: "true"
  subPathPattern: "${.PVC.namespace}/${.PVC.name}"
mountOptions:
  - tls
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: xpolog-pvc
  namespace: logging
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: efs-sc
  resources:
    requests:
      storage: 20Gi

kubectl apply -f storage.yaml

Checks:
kubectl -n logging get pvc xpolog-pvc

Xpolog Deployment:
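Before deploying XpoLog itself, it helps to wait until the claim is actually Bound; with dynamic provisioning this is usually quick, but a short wait avoids a pod stuck in Pending. A sketch:

```shell
# Block until the PVC reaches phase "Bound" (or the timeout expires)
kubectl -n logging wait --for=jsonpath='{.status.phase}'=Bound \
  pvc/xpolog-pvc --timeout=120s

# If it stays Pending, the events usually name the cause
kubectl -n logging describe pvc xpolog-pvc | tail -n 10
```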
nano deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: xpolog
  namespace: logging
  labels:
    app: xpolog
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: xpolog
  template:
    metadata:
      labels:
        app: xpolog
    spec:
      # IMPORTANT: match the AP-created ownership (UID/GID 1000)
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 1000
      initContainers:
        # This initContainer runs as root to fix permissions
        - name: fix-permissions
          image: 1200km/xplg:7.Release-9787
          securityContext:
            runAsUser: 0   # Run as root to chown
          command: ["/bin/sh", "-c"]
          args:
            - |
              set -ex
              # Copy original app files to the shared volume
              cp -r /opt/xplg-service/. /workdir/
              # Change ownership of the copied files to user 1000
              chown -R 1000:1000 /workdir
          volumeMounts:
            - name: xplg-workdir
              mountPath: /workdir
        # The original initContainer that prepares the EFS volume
        - name: init-efs
          image: public.ecr.aws/amazonlinux/amazonlinux:2023
          securityContext:
            runAsUser: 1000
            runAsGroup: 1000
          command: ["/bin/bash","-c"]
          args:
            - |
              set -e
              mkdir -p /efs/config /efs/data /efs/logs
              chmod -R 0770 /efs || true
          volumeMounts:
            - name: xpolog-storage
              mountPath: /efs
      containers:
        - name: xpolog
          image: 1200km/xplg:7.Release-9787
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              containerPort: 30303
          env:
            - name: JAVA_TOOL_OPTIONS
              value: "-Xmx4g -Dxpolog.uid.structure=master"
          readinessProbe:
            httpGet:
              path: /   # You may still need to change this to /health or another path
              port: 30303
            initialDelaySeconds: 60   # Increased for safety
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 6
          livenessProbe:
            httpGet:
              path: /   # You may still need to change this to /health or another path
              port: 30303
            initialDelaySeconds: 90   # Increased for safety
            periodSeconds: 20
            timeoutSeconds: 5
            failureThreshold: 6
          resources:
            requests: { cpu: "500m", memory: "6Gi" }
            limits: { cpu: "1000m", memory: "8Gi" }
          volumeMounts:
            # Mount the shared volume with correct permissions
            - name: xplg-workdir
              mountPath: /opt/xplg-service
            # The original PVC mounts
            - name: xpolog-storage
              mountPath: /home/data
              subPath: data
            - name: xpolog-storage
              mountPath: /opt/xplg/config
              subPath: config
      volumes:
        # The PVC for persistent data
        - name: xpolog-storage
          persistentVolumeClaim:
            claimName: xpolog-pvc
        # The shared volume for the application files
        - name: xplg-workdir
          emptyDir: {}

kubectl apply -f deployment.yaml

Checks:
kubectl -n logging get pods

Exposure stage: make XpoLog accessible inside the cluster and outside (from your browser).
nano exposure.yaml

# Internal service for in-cluster access
apiVersion: v1
kind: Service
metadata:
  name: xpolog
  namespace: logging
spec:
  type: ClusterIP
  selector:
    app: xpolog
  ports:
    - name: http
      port: 30303
      targetPort: 30303
      protocol: TCP
---
# External service (NLB) for outside access
apiVersion: v1
kind: Service
metadata:
  name: xpolog-public
  namespace: logging
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"   # makes the NLB reachable from the internet
spec:
  type: LoadBalancer
  selector:
    app: xpolog
  ports:
    - name: http-public
      port: 30443
      targetPort: 30303
      protocol: TCP

kubectl apply -f exposure.yaml
If the external URL times out, add a security-group rule that allows inbound traffic on the service's NodePort (31216 in this example) from any source (0.0.0.0/0).
Run the following command in your terminal to authorize the connection.
aws ec2 authorize-security-group-ingress \
--group-id sg-0d258d04e6ab37139 \
--protocol tcp \
--port 31216 \
--cidr 0.0.0.0/0
After running this command, wait about 30-60 seconds, and then try to access your URL again. It should now load successfully.
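To confirm the rule actually landed on the security group, you can list its ingress permissions that cover the NodePort. A sketch using the same group ID and port as the command above:

```shell
# Show any ingress rules on the node SG that include port 31216
aws ec2 describe-security-groups \
  --group-ids sg-0d258d04e6ab37139 \
  --query "SecurityGroups[0].IpPermissions[?FromPort<=\`31216\` && ToPort>=\`31216\`]" \
  --output json
```

A non-empty JSON array means the rule is in place.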
External configuration
Open the GUI and paste the external configuration directory:
/opt/xplg/config
Restart XpoLog.
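The restart can be done from kubectl without deleting anything. A sketch (note the Deployment uses the Recreate strategy, so expect a short outage while the replacement pod starts):

```shell
# Restart the deployment so XpoLog picks up the new configuration path
kubectl -n logging rollout restart deployment/xpolog

# Wait for the replacement pod to become Ready
kubectl -n logging rollout status deployment/xpolog --timeout=300s
```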
Validation Checklist for XpoLog Deployment on AWS EKS with EFS and External Access
Below is a concise, copy-paste test plan for this setup. It checks pod health, EFS, mounts, and both external and internal GUI access.
Assumes: namespace logging, Deployment xpolog, PVC xpolog-pvc, public LB Service xpolog-public, ClusterIP Service xpolog.
1) Pod status
# quick health
kubectl -n logging get deploy xpolog
kubectl -n logging get pods -l app=xpolog -o wide
kubectl -n logging logs deploy/xpolog --tail=150

✅ Success looks like: READY 1/1, Running, and logs show “XpoLog started”.
2) EFS status (cluster + k8s objects)
# EFS CSI driver components healthy?
kubectl -n kube-system get pods -l app.kubernetes.io/name=aws-efs-csi-driver
# StorageClass exists and points to your FS ID
kubectl get storageclass efs-sc -o yaml | egrep 'provisioner|fileSystemId|provisioningMode|basePath'
# PVC bound to a PV
kubectl -n logging get pvc xpolog-pvc
kubectl -n logging describe pvc xpolog-pvc

✅ Success looks like:
EFS CSI controller/daemonset pods Running
efs-sc shows provisioner: efs.csi.aws.com and your fileSystemId: fs-0107a1b4c87fb0213
PVC STATUS: Bound
3) Access pod & verify EFS is mounted and writable
# get the pod name & jump in
POD=$(kubectl -n logging get pod -l app=xpolog -o jsonpath='{.items[0].metadata.name}')
# show mounts and disk usage for the EFS-backed paths
kubectl -n logging exec -it "$POD" -- sh -lc '
echo "== whoami =="; id;
echo "== mounts (look for efs.csi.aws.com / nfs4) =="; mount | egrep "nfs4|efs|/opt/xplg/config|/home/data" || true;
echo "== df -h for mounted dirs =="; df -h /opt/xplg/config /home/data 2>/dev/null || true;
echo "== rw test ==";
touch /opt/xplg/config/_rw_$(date +%s) && ls -l /opt/xplg/config | tail -n 3
'

✅ Success looks like:
mount shows NFS4/EFS for /opt/xplg/config and /home/data
df -h returns sizes for those paths
touch succeeds and you see the new file
4) External GUI (through the NLB)
Your pod serves HTTP on 30303. The public Service exposes port 30443 but still forwards plain HTTP to 30303. Use http:// (not https://) unless you add an Ingress with TLS.
ELB=$(kubectl -n logging get svc xpolog-public -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
echo "NLB: $ELB"
# quick CLI test
curl -sv "http://$ELB:30443/" | head -n 20

✅ Success looks like an HTTP status line (200/3xx) and some HTML.
If it times out but the next test works, open the NodePort on the node SG.
Optional (bypass NLB; proves Service/NodePort path):
NODEPORT=$(kubectl -n logging get svc xpolog-public -o jsonpath='{.spec.ports[0].nodePort}')
NODEIP=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')
echo "NodePort test -> http://$NODEIP:$NODEPORT/"
curl -sv "http://$NODEIP:$NODEPORT/" | head -n 20

5) Internal GUI (via port-forward)
Option A (native port):
# terminal 1: keep running
kubectl -n logging port-forward svc/xpolog 30303:30303

Then visit: http://localhost:30303
Option B (keep your 30443 habit, still HTTP):
# terminal 1: keep running
kubectl -n logging port-forward svc/xpolog 30443:30303

Then visit: http://localhost:30443
✅ Each request prints “Handling connection for …” in the port-forward terminal.