In the EC2 Instances console, locate the instance named EKSBastion
Right-click the instance, then select Connect => Session Manager => Connect (button)
To view the GitLab Agent log in the cluster you can use this command: kubectl logs -f -l=app=gitlab-agent -n gitlab-agent
Add the following to the agent configuration file for more verbose logging:
observability:
  logging:
    level: debug
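The agent normally picks up configuration changes on its own after you commit them; if you want to force a reload sooner, one option (a sketch that reuses the same app=gitlab-agent label as the log command above) is to recycle the agent pods:
kubectl delete pods -n gitlab-agent -l app=gitlab-agent
# The agent Deployment recreates the pods, which then fetch the updated configuration.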
For common errors and more troubleshooting information, visit Troubleshooting the GitLab agent for Kubernetes.
If the EKS QuickStart was used to build the cluster, you can locate the ASG for the cluster nodes, scale it to zero, and then scale it back to the count that was in place. This takes some time (roughly 15 minutes for a 2-node cluster), so treat it as a last resort in an active classroom environment.
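If you prefer a terminal over the console, a hedged sketch of that bounce using the AWS CLI (the ASG name is a placeholder for the one you located; this still goes through the ASG, so termination hooks run):
# Scale the node ASG to zero
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name <your-node-asg-name> \
  --min-size 0 --desired-capacity 0
# Wait for the nodes to drain and terminate, then restore the original counts (example: 2 nodes)
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name <your-node-asg-name> \
  --min-size 2 --desired-capacity 2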
Resolution: You have missed the last steps of Prep Lab 2.2 for disabling group runners (and you or the participants are using a free account).
GitLab Operational Container Scanning docs.
Results Not Showing in Dashboard
Check logs:
kubectl logs -n gitlab-agent -l app=gitlab-agent | grep starboard_vulnerability | tail
Error: {"level":"error","time":"2022-04-22T15:38:01.853+0200","msg":"Failed to perform vulnerability scan on workload","mod_name":"starboard_vulnerability","error":"running scan job: creating job: jobs.batch \"scan-vulnerabilityreport-68676cd7bc\" already exists"}
Issue: Orphaned Starboard scan jobs remain in the cluster, so new scan jobs cannot be created.
Action: Clear out the orphaned cluster scanning jobs.
Clear Command:
kubectl delete jobs -n gitlab-agent -l app.kubernetes.io/managed-by=starboard
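To confirm the cleanup, you can list any remaining Starboard-managed jobs using the same label selector; an empty result means the orphans are gone:
kubectl get jobs -n gitlab-agent -l app.kubernetes.io/managed-by=starboard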
Error: {"level":"error","time":"2022-06-23T10:55:04.037Z","msg":"Failed to perform vulnerability scan on workload","mod_name":"starboard_vulnerability","error":"running scan job: warning event received: Error creating: pods \"scan-vulnerabilityreport-656cc6fb45-\" is forbidden: error looking up service account gitlab-agent/gitlab-agent: serviceaccount \"gitlab-agent\" not found (FailedCreate)"}
Issue: Cluster image scanning does not work with a non-default namespace or service account.
Action: Create a service account with the old default name.
Fix Command:
kubectl create serviceaccount gitlab-agent -n gitlab-agent
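You can verify the service account now exists before the next scheduled scan runs:
kubectl get serviceaccount gitlab-agent -n gitlab-agent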
If you decide to scale down any Auto Scaling Group associated with the classroom, always do so by editing the ASG object in the AWS console so that appropriate job draining, deregistration, and other activities are performed by the ASG termination hooks.
If this classroom setup is long-lived for any reason, the ASGs can use scheduled scaling to scale to zero during unused times (e.g. nights and weekends). The sample apps in this workshop are stateless, which allows the EKS cluster to scale to zero nodes at unused times. Note that if you install a Kubernetes runner, its registration status does not survive termination.
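As a sketch of that scheduled scaling using the AWS CLI (the ASG name and action names are placeholders, and the UTC schedules are examples to adapt to your class hours):
# Scale to zero every night at 01:00 UTC
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name <your-asg-name> \
  --scheduled-action-name scale-down-nightly \
  --recurrence "0 1 * * *" \
  --min-size 0 --max-size 0 --desired-capacity 0
# Scale back up on weekday mornings at 12:00 UTC
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name <your-asg-name> \
  --scheduled-action-name scale-up-weekdays \
  --recurrence "0 12 * * 1-5" \
  --min-size 2 --max-size 4 --desired-capacity 2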
If the course is being run on GitLab.com with free GitLab.com accounts for participants and/or the instructor and an Ultimate Trial enabled (this only works for 30 days from initial trial enablement), then using the GitLab HA Scaling Runner Vending Machine for AWS EC2 ASG to deploy runners is required.
While it is optional if you are working with licensed accounts on GitLab.com or a self-managed instance, using it also gives you fuller control over runner fleet performance and therefore CI wait times.
To fix slow runners, you can deploy the GitLab HA Scaling Runner Vending Machine for AWS EC2 ASG. If you have already used the vending machine, you can scale up the runners to increase CI speed. They will come online and register in GitLab in the same group as the original deployment, with no interruption or special actions required of participants.
If you were relying on shared runners, you will need to disable them at the group level to force usage of your runner fleet.
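One hedged way to do this in bulk is the GitLab group API (the group ID and token are placeholders; this assumes the token holder is an owner of the group):
curl --request PUT --header "PRIVATE-TOKEN: <your-access-token>" \
  "https://gitlab.com/api/v4/groups/<group-id>?shared_runners_setting=disabled_and_unoverridable"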
If this is a post-deployment step, simply update the ASG that controls the runners to have more runner instances (increase the Maximum and Desired counts). They will come online and register in GitLab in the same group as the original deployment, with no interruption or special actions required of participants.
To scale the runner fleet, open the ASG console (this link presumes us-east-2, but you can change the region if you deployed elsewhere). Locate the ASG associated with your runners - if you used the exercise defaults to deploy it, it should be called “linux-docker-spotonly”. To change the size, follow Set capacity limits on your Auto Scaling group.
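A hedged CLI equivalent of that console edit (the ASG name assumes the exercise default; the counts are examples):
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name linux-docker-spotonly \
  --max-size 8 --desired-capacity 6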
While the Cluster Autoscaler should be active, you can also take manual control of the lower bound by raising the Minimum size of the cluster ASG.
The most likely scenario is that your Maximum size has been reached and you need to bump it.
To scale the cluster, open the ASG console (this link presumes us-east-2, but you can change the region if you deployed elsewhere). Locate the ASG associated with your EKS cluster, which has either “NodeGroupStack” or “UnmanagedASG” in its name. To change the size, follow Set capacity limits on your Auto Scaling group. Keep in mind that the Maximum count limits the Cluster Autoscaler - you may want that cap for stability, or you may want to raise it so the autoscaler can work for efficiency.
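If you would rather locate the node ASGs from a terminal, a hedged lookup that matches the name fragments above:
aws autoscaling describe-auto-scaling-groups \
  --query "AutoScalingGroups[?contains(AutoScalingGroupName, 'NodeGroupStack') || contains(AutoScalingGroupName, 'UnmanagedASG')].AutoScalingGroupName" \
  --output table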
If students will be doing cluster control activities on a shared cluster, you may want more bastion host instances. However, keep in mind that the existing one supports multiple simultaneous SSM logins and is not a spot instance. Since the bastion hosts are in an ASG, it is easy to change their number.
To scale the bastion host ASG, open the ASG console (this link presumes us-east-2, but you can change the region if you deployed elsewhere). Locate the ASG associated with your EKS cluster which also has “BastionStack” in its name. To change the size, follow Set capacity limits on your Auto Scaling group.
Once scaling is complete, you may wish to use the EC2 console to rename the instances so they have unique names.
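Renaming can also be done by retagging each instance from the CLI (the instance ID and name below are placeholders):
aws ec2 create-tags --resources i-0123456789abcdef0 --tags Key=Name,Value=EKSBastion-2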