This page details monitoring and logging for the Cinchy v5 platform when deployed on Kubernetes.
Cinchy v5 on Kubernetes offers and recommends a robust set of open-source tools and applications for monitoring, logging, and visualizing the data on your Cinchy instances.
Click on the respective tool below to learn more about it:
When deploying Cinchy v5 on Kubernetes, Cinchy recommends using OpenSearch Dashboards for your logging. OpenSearch is a community-driven fork of Elasticsearch created by Amazon, and it captures and indexes all your logs into a single, accessible dashboard location. These logs can be queried, searched, and filtered, and Correlation IDs mean that they can also be traced across various components. These logging components take advantage of persistent storage.
You can view OpenSearch documentation here:
These sections guide you through setting up your first Index, Visualization, Dashboard, and Alert.
OpenSearch comes with sample data that you can use to get a feel of the various capabilities. You will find this on the main page upon logging in.
Navigate to your cinchy.kubernetes/environment_kustomizations/instance_template/worker/kustomization.yaml
file.
In the below code, copy the Base64 encoded string in the value parameter.
Decode the value to retrieve your AppSettings.
Navigate to the below Serilog section of the code and update the "Default" parameter as needed to set your log level. The options are:
Ensure that you commit your changes.
Navigate to ArgoCD > Worker Application and refresh.
The following are some common search patterns when looking through your OpenSearch Logs.
If an HTTP request to Cinchy Web/IDP fails, check the page's requests and the relevant response headers to find the "x-correlation-id" header. That header value can be used to search and find all logs associated with the HTTP request.
When debugging batch syncs, filter the "ExecutionId" field in the logs for your batch sync execution ID to narrow down your search.
When debugging real time syncs, search for your data sync config name in the Event Listener or Workers logs to find all the associated logging information.
The first step to utilizing the power of OpenSearch Dashboards is to set up an index to pull data from your sources. An Index Pattern identifies which indices you want to explore. An index pattern can point to a specific index, for example, your log data from yesterday, or all indices that contain your log data.
Login to OpenSearch. You would have configured the access point during your deployment installation; traditionally it will be found at <baseurl>/dashboard.
If this is your first time logging in, the username and password will be set to admin/admin.
We highly recommend you update the password as soon as possible.
Navigate to the Stack Management tab in the left navigation menu (Image 1).
From the left navigation, click on Index Patterns (Image 2).
Click on the Create Index Pattern button.
To set up your index pattern, you must define the source. OpenSearch will list the sources available to you on the screen below. Input your desired source(s) in the text box (Image 3).
You can use the asterisk (*) to match multiple sources.
Configure your index pattern settings (Image 4).
Time field: Select a primary time field to use with the global time filter
Custom index pattern ID: By default, OpenSearch gives a unique identifier to each index pattern. You can use this field to optional override the default ID with a custom one.
Once created, you can review your Index Patterns from the Index Patterns page (Image 5).
Click on your Index Pattern to review your fields (Image 6).
You can pull out any data from your index sources and view them in a variety of visualizations.
From the left navigation pane, click Visualize (Image 7).
If you have any Visualizations, they will appear on this page. To create a new one, click the Create Visualization button (Image 8).
Select your visualization type from the populated list (Image 9).
Choose your source (Image 10). If the source you want to pull data from isn't listed, you will need to set it up as an index first.
Configure the data parameters that appear in the right hand pane of the Create screen. These options will vary depending on what type of visualization you choose in step 3. The following example uses a pie chart visualization (Image 11):
Metrics
Aggregation: Choose how you want your data aggregated. This example uses Count.
Custom Label: You can use this optional field for custom labelling.
Buckets
Aggregation: Choose how you want your data aggregated. This example uses Split Slices > Terms.
Field: This drop down is populated based on the index source your chose. Select which field you want to use in your visualization. This example uses machine.os.keyword.
Order By: Define how you want your data to be ordered. This example uses Metric: Count, in descending order of size 10.
Choose whether to group other values in a separate bucket. If you toggle this on, you will need to label the new bucket.
Choose whether to show missing values.
Advanced
You can optionally choose a JSON input. These will be merged with the OpenSearch aggregation definition.
Options
The variables in the options tab can be used to configure the UI of the visualization itself.
You can also further focus your visualization:
Use DQL to search your index data (Image 12). You can also save any queries you write for easy access by clicking on the save icon.
Add a filter on any of your fields (Image 13).
Update your date filter (Image 14).
Click save when finished with your visualization.
Once you have created your visualizations, you can combine them together on one Dashboard for easy access.
You can also create new visualizations from the Dashboard screen.
From the left navigation pane, click on Dashboards (Image 15).
If you have any Dashboards, they will appear on this page. To create a new one, click the Create Dashboard button (Image 16).
The "Editing New Dashboard" screen will appear. Click on Add an Existing object (Image 17).
Select any of the visualizations you created and it will automatically add to your Dashboard (Image 18). Repeat this step for as many visualizations as you'd like to appear.
Click Save to finish (Image 19).
This capability was added in Cinchy v5.4.
Your OpenSearch password can be updated in your deployment.json file (you may have renamed this during your original deployment).
Navigate to "cluster_component_config > OpenSearch.
OpenSearch has two users that you can configure the passwords for: Admin and Kibana Server. Kibana Server is used for communication between the opensearch dashboard and the opensearch server. The default password for both is set to "password";. To update this, you will need to use a machine with docker available.
Update your Admin password:
Your password must be hashed. You can do so by running the following command on a machine with docker available, inputting your new password where noted:
Navigate to "opensearch_admin_user_hashed_password" and input your hashed password.
You must also provide your password in a base64 encoded format; input your cleartext password here to receive your new encoded password.
Navigate to "opensearch_admin_user_password_base64" and input your encoded password.
Update your Kibana Server password:
Your password must be hashed. You can do so by running the following command on a machine with docker available, inputting your new password where noted:
Navigate to "opensearch_kibanaserver_user_hashed_password" and input your hashed password.
You must also provide your new password in cleartext. Navigate to "opensearch_kibanaserver_user_password" and input your cleartext password.
Run the below command in the root directory of your devops.automations repo to update your configurations. If you have changed the name of your deployment.json file, make sure to update the command accordingly.
Commit and push your changes.
If your environment isn't set-up to automatically apply upon configuration,navigate to the ArgoCD portal and refresh your component(s). If that doesn't work, re-sync.
This page is about the analytics and visualization application Grafana, one of our recommended components for Cinchy v5 on Kubernetes.
Grafana is an open source analytics and interactive visualization web application. When connected to your Cinchy platform, it provides charts, graphs, and alerting capabilities (Image 1).
Grafana, and its paired application Prometheus (which consumes metrics from the running components in your environment) is the recommended visualization application for Cinchy v5 on Kubernetes.
Grafana has a robust library of documentation of tutorials designed to help you learn the fundamentals of the application. We've listed some notable ones below:
When using the default configuration pairing of Grafana and Prometheus, Prometheus is already set up as a data source in your metrics dashboard.
Cinchy comes with some saved dashboards that come out of the box. These dashboards will provide a great jumping off point for your metrics monitoring, and you can always customize, manage, and add further dashboards at your leisure.
Navigate to the left navigation pane, select the Dashboards icon > Manage (Image 2).
2. You will see a list of all of the Dashboards available to you (Image 3). Clicking on any of them will take you to a full metrics view (Image 4).
3. You can favourite any of your commonly used or most important dashboards by clicking on the star (Image 5).
4. Once you favorite a dashboard, you can easily find it by navigating to the left navigation pane, select the Dashboards icon > Home. This will open the Dashboards Home. You can see both your favourite and your recent dashboards in this view (Image 6)
Your Cinchy v5 deployment comes with some out-of-the-box dashboards already made. You are able to customize these to suit your specifications. The following are a few notable ones:
Purpose: This dashboard provides a general overview of your entire cluster including all of your environments and pods (Image 7).
Metrics:
The following are some example metrics that you could expect to see from this dashboard:
CPU Usage
CPU OTA
Memory Use
Memory Requests
Current Network Usage
Bandwidth (Transmitted and Received)
Average Container Bandwidth by Namespace
Rate of Packets
Rate of Packets Dropped
Storage IO & Distribution
Purpose: This dashboard is useful for looking at environment specific details (Image 8). You can use the namespace drop down menu to select which environment you want to visualize (Image 9). This can be particularly helpful during load testing. You are also able to drill down to a specific workload by clicking on its name.
Metrics:
The following are some example metrics that you could expect to see from this dashboard**:**
CPU Usage
CPU OTA
Memory Use
Memory Quota
Current Network Usage
Bandwidth (Transmitted and Received)
Average Container Bandwidth by Workload
Rate of Packets
Rate of Packets Dropped
Grafana lets you to set up push alerts against your dashboards and queries. Once you have created your dashboard, you can follow the steps below to set up your alert.
Grafana doesn't have the capability to run alerts against queries with template variables.
To send emails out from Grafana, you need to configure your SMTP. This would have been done in the automation script run during your initial Cinchy v5 deployment. If you didn't input this information at that time, you must do so before setting up your email alerts.
Your notifications channel refers to who will be receiving your alert. To set one up:
Click on the Alert icon on the left navigation tab (Image 10), and locate "Notifications Channel"
Click the "Add a Channel" button.
Add in the following parameters, including any optional checkboxes you wish to use (Image 11):
Name: The name of this channel.
Type: You have several options here, but email is the most common.
Addresses: Input all the email addresses you want to be notified of this alert, separated by a comma.
Click Test to send out a test email, if desired.
Save your Notification Channel.
The following details how to set up alerts on your dashboards. You can also set up alerts upon creation of your dashboard from the same window.
Navigate to the dashboard and dashboard panel that you want to set up an alert for. This example, sets up an alert for CPU usage on our cluster.
Click on the dashboard name > Edit
Click on the Alert tab (Image 12).
Input the following parameters to set up your alert (Image 13):
Alert Name: A title for your alert
Alert Timing: Choose how often to evaluate and for how long. In this example it's evaluated every minute for five minutes.
Conditions: Here you can set your threshold conditions for when an alert will be sent out. In this example, it's sent when the average of query A is above 75.
Set what happens if there's no data, or an error in your data
Add in your notification channel (who will be sent this notification)
Add a message to accompany the alert.
Click Apply > Save to finalize your alert.
Click on an image to enlarge it.
Below are a few alerts we recommend setting up on your Grafana.
Set up this alert to notify you when the CPU Usage on your nodes exceeds a specified limit.
You can use the following example queries to set up a dashboard that will capture CPU Usage by Node (Image 14).
Set up your alert. This example uses a threshold limit of 75 (Image 15).
Set up this alert to notify you when the Memory Usage on your nodes exceeds a specified limit.
You can use the following example queries to set up a dashboard that will capture CPU Usage by Node (Image 16)
Set up your alert. This example uses a threshold limit of 85 (Image 17).
Set up this alert to notify you when the Disk Usage on your nodes exceeds a specified limit.
You can use the following example queries to set up a dashboard that will capture Disk Usage by Node (Image 18)
Set up your alert. This example uses a threshold limit of 80 (Image 17).
Set up this alert to check the amount of iowait
from the CPU. A high value usually indicates a slow/overloaded HDD or Network.
You can use the following example queries to set up a dashboard that will capture the CPU I/O wait (Image 19).
Set up your alert. This example uses a threshold limit of 60 (Image 19).
This capability was added in Cinchy v5.4.
Your Grafana password can be updated in your deployment.json file (you may have renamed this during your original deployment).
Navigate to cluster_component_config > grafana.
The default password is set to prom-operator; update this with your preferred new password, written in clear text.
Run the below command in the root directory of your devops.automations repository to update your configurations. If you have changed the name of your deployment.json file, make sure to update the command accordingly.
Commit and push your changes.
If your environment isn't set-up to automatically apply upon configuration,navigate to the ArgoCD portal and refresh your component(s). If that doesn't work, re-sync.
Verbose
Verbose is the noisiest level, rarely (if ever) enabled for a production app.
Debug
Debug is used for internal system events that aren't necessarily observable from the outside, but useful when determining how something happened. This is the default setting for Cinchy.
Information
Information events describe things happening in the system that correspond to its responsibilities and functions. Generally these are the observable actions the system can perform.
Warning
When service is degraded, endangered, or may be behaving outside of its expected parameters, Warning level events are used.
Error
When functionality is unavailable or expectations broken, an Error event is used.
Fatal
The most critical level, Fatal events demand immediate attention.
This page serves as a general guide to using ArgoCD for monitoring purposes
ArgoCD is implemented as a Kubernetes controller which continuously monitors running applications and compares the current, live state against the desired target state (as specified in your Git repository).
You can use ArgoCD's dashboard (Image 1) to visually monitor your namespaces and pods, and to quickly visualize deployment issues. It can easily show you what your cluster or pods are doing, and if they're healthy.
ArgoCD has a robust set of documentation that can help you to get started with the application. We recommend the following two pages:
Your ArgoCD dashboard has a lot of important information about how your Cinchy instance is behaving.
The application tiles view is the default view when logging into ArgoCD. You can also access it through the grid widget in the upper right hand corner of the screen.
These application tiles each represent a pod in one of your Cinchy environments. By default, you should have an application tile for each of the following pods per namespace (Image 2):
Base
Connections
Event Listener
IDP
Maintenance CLI
Meta Forms
Web
Worker
Each application tile shows you some important information about the pod, including its health and sync status. You can use the Sync, Refresh, and Delete buttons to manage your pods.
You can also click the star button on any application title to favourite the associated pod.
The pie chart view (Image 3) shows an easy visualization of the health of your applications. Access this view by clicking on the pie chart widget located in the upper right hand corner of the screen.
You can filter your view so that it only shows certain data. You can find the various filter options from the Filter column on the left hand side of any of the data views.
Favorites Only: Filter by only favorite tiles (Image 4).
Sync Status: You can easily view what's out of sync using this filter (Image 5).
Health Status: Check on the health of your applications by filtering using these parameters (Image 6).
Labels: Use these labels to filter by specific instances/environments (Image 7).
Projects: If you choose to group your applications into projects, you can filter them using this tile (Image 8).
Clusters: If you have more than one cluster, you can filter for it using the Clusters tile (Image 9).
Namespace: Lastly, you can filter by Namespace. The below example shows a filter based on the dev-aurora-1 namespace (Image 10).
To bring up a more detailed view of your applications, click on the application tile. This view will show you all components, their health and their sync status (Image 11). You can use the top navigational buttons perform actions such as syncing, rolling back, or refreshing.
This view can be useful for load testing, since you can see each individual pod spinning up and down.
View the status of your apps by looking at the health and sync status along the top of the page (Image 12).
Clicking on any individual pod or component tile in this view will bring up its information, including a Summary, a list of Events, your Manifest, and Parameters (Image 13).
You can also use this screen to edit or delete applications.
From the detailed tile summary you can also set your sync policies, such as automation, resource pruning, and self healing (Image 14).
You can click on the Logs tab in your detailed summary page to view the applicable logs for your selected pod (Image 15). You can filter, follow, snooze, copy, or download any logs.
OpenSearch comes with the ability to set up alerts based on any number of monitors. You can then push these alerts via email, should you desire.
Before you set up a monitor or alert, ensure that you have added your data source as an index pattern.
Definitions:
Your destination will be where you want your alerts to be pushed to. OpenSearch supports various options, but this guide focuses on email.
From the left navigation pane, click Alerting (Image 1).
Click on the Destinations Tab > Add Destination
Add a name to label your destination and select Email as type (Image 2)
You will need to assign a Sender. This is the email address that the alert will send from when you specify this specific destination. To add a new Sender, click Manage Senders (Image 3).
Click Add Sender
Add in the following information (Image 4):
Sender Name
Email Address
Host (this is the host address for the email provider)
Port (this is the Port of the email provider)
Encryption
Ensure that you authenticate the Sender, or your alert won't work.
You will need to assign your Recipients. This is the email address(es) that will receive the alert when you specify this specific destination. To add a new Recipient, you can either type their email(s) into the box, or click Manage Senders to create an email group (Image 5).
Click Update to finish your Destination.
You will need to authenticate your sender for emails to come through. Please contact Cinchy Customer Support to help you with this step.
Via email: support@cinchy.com
Via phone: 1-888-792-6051
Through the support portal: Support Portal
Your monitor is a job that runs on a defined schedule and queries OpenSearch indices. The results of these queries are then used as input for one or more triggers.
From the Alerting dashboard, select Monitors > Create Monitor (Image 6).
Under Monitor Details, add in the following information (Image 7).
Monitor Name
Monitor Type (This example uses Per Bucket)
Whereas query-level monitors run your specified query and then check whether the query’s results triggers any alerts, bucket-level monitors let you select fields to create buckets and categorize your results into those buckets.
The alerting plugin runs each bucket’s unique results against a script you define later, so you have finer control over which results should trigger alerts. Each of those buckets can trigger an alert, but query-level monitors can only trigger one alert at a time.
Monitor Defining Method: the way you want to define your query and triggers. (This example uses Visual Editor)
Visual definition works well for monitors that you can define as “some value is above or below some threshold for some amount of time.”
Query definition gives you flexibility in terms of what you query for (using the OpenSearch query DSL) and how you evaluate the results of that query (Painless scripting).
Schedule: Choose a frequency and time zone for your monitor.
Under Data Source add in the following information (Image 8):
Index: Define the index you want to use as a source for this monitor
Time Field: Select the time field that will be used for the x-axis of your monitor
The Query section will appear differently depending on the Monitor Defining Method selected in step 2 (Image 9). This example is using the visual editor.
To define a monitor visually, select an aggregation (for example, count()
or average()
), a data filter if you want to monitor a subset of your source index, and a group-by field if you want to include an aggregation field in your query. At least one group-by field is required if you’re defining a bucket-level monitor. Visual definition works well for most monitors.
A trigger is a condition that, if met, will generate an alert.
To add a trigger, click the Add a Trigger button (Image 10).
Under New Trigger, define your trigger name and severity level (with 1 being the highest) (Image 11).
Under Trigger Conditions, you will specify the thresholds for the query metrics you set up previously (Image 12). In the below example, our trigger will be met if our COUNT threshold goes ABOVE 10000.
You can also use keyword filters to drill down into a more specific subset of data from your data source.
In the Action section you will define what happens if the trigger condition is met (Image 13). Enter the following information to set up your Action:
Action Name
Message Subject: In the case of an email alert, this will be the email subject line.
Message: In the case of an email alert, this will be the email body.
Perform Action: If you’re using a bucket-level monitor, decide whether the action is performed per execution or per alert.
Throttling: Enable action throttling if you wish. Use action throttling to limit the number of notifications you receive within a given span of time.
Click Send Test Message, if you want to test that the alert functions correctly.
Click Save.
This example pushes an alert based on errors. We will monitor our Connections stream for any instance of 'error', and push out an alert when our trigger threshold is hit.
First we create our Monitor by defining the following (Image 14):
Index: This example looks at Connections.
Time Field
Time Range: Define how far back you want to monitor
Data Filter: We want to monitor specifically whenever the Stream field of our index is stderr (standard error).
This is how our example monitor will appear; it shows when in the last 15 days our Connections app had errors in the log (Image 15).
Once our monitor is created, we need to define a trigger condition. When this condition is met, the alert will be pushed out to our defined Recipient(s). In this example we want to be alerted when there is more than one stderr in our Connections stream (Image 16). Input the following:
Trigger Name
Severity Level
Trigger Condition: In this example, we use IS ABOVE and the threshold of 1.
The trigger threshold will be visible on your monitoring graph as a red line.
This example pushes an alert based on the kubectl.kubernetes.io/restartedAt annotation, which updates whenever your pod restarts. We will monitor this annotation across our entire product-mssql instance, and push out an alert when our trigger threshold is hit.
First we create our Monitor by defining the following (Image 17):
Index: This example looks at the entire product-mssql instance.
Time Field
Query: This example is using the total count of the kubectl.kubernetes.io/restartedAt annotation.
Time Range: Define how far back you want to monitor. This example goes back 30 days.
This is how our example monitor will appear; it shows when in the last 30 days our instance had restarts (Image 18).
2. Once our monitor is created, we need to define a trigger condition. When this condition is met, the alert will be pushed out to our defined Recipient(s). In this example we want to be alerted when there is more than 100 restarts across our instance (Image 19). Input the following:
Trigger Name
Severity Level
Trigger Condition: In this example, we use IS ABOVE and the threshold of 100.
The trigger threshold will be visible on your monitoring graph as a red line.
This example pushes an alert based on status codes. We will monitor our entire instance for 400 status codes and push out an alert when our trigger threshold is hit.
First we create our Monitor by defining the following (Image 20):
Index: This example looks across the entire product-mssql-1 instance.
Time Field
Time Range: Define how far back you want to monitor. The time range for this example is the past day.
Data Filter: We want to monitor specifically whenever the Status Code is 400 (bad request).
This is how our example monitor will appear (note that there are no instances of a 400 status code in this graph) (Image 21).
Once our monitor is created, we need to define a trigger condition. When this condition is met, the alert will be pushed out to the defined Recipient(s). In this example we want to be alerted when there is at least one 400 status code across out instance (Image 22). Input the following:
Trigger Name
Severity Level
Trigger Condition: In this example, we use IS ABOVE and the threshold of 0.
The trigger threshold will be visible on your monitoring graph as a red line.
Monitor
A job that runs on a defined schedule and queries OpenSearch indices. The results of these queries are then used as input for one or more triggers.
Trigger
Conditions that, if met, generate alerts.
Alert
An event associated with a trigger. When an alert is created, the trigger performs actions, which can include sending a notification.
Action
The information that you want the monitor to send out after being triggered. Actions have a destination, a message subject, and a message body.
Destination
A reusable location for an action. Supported locations are Amazon Chime, Email, Slack, or custom webhook.