Azure Kubernetes Service (AKS) official MCP server.
15 Tools
Version 4.43 or later needs to be installed to add the server automatically
Tools
Name | Description |
---|---|
az_advisor_recommendation | Retrieve and manage Azure Advisor recommendations for AKS clusters |
az_aks_operations | Unified tool for managing Azure Kubernetes Service (AKS) clusters and related operations. Supported operations: - Cluster: show, list, get-versions, check-network - Nodepool: nodepool-list, nodepool-show - Account: account-list Examples: - Show cluster: operation="show", args="--name myCluster --resource-group myRG" - List nodepools: operation="nodepool-list", args="--cluster-name myCluster --resource-group myRG" |
az_compute_operations | Unified tool for managing Azure Virtual Machines (VMs) and Virtual Machine Scale Sets (VMSS) using Azure CLI. IMPORTANT: VM/VMSS resources are managed by AKS. Write operations should be used carefully and only for debugging purposes. Use resource_type="vm" for single virtual machines or resource_type="vmss" for virtual machine scale sets. Available operation values: - show: Get details of a VM/VMSS - list: List VMs/VMSS in subscription or resource group - get-instance-view: Get runtime status EXAMPLES: List VMSS: operation="list", resource_type="vmss", args="--resource-group myRG" Show VMSS: operation="show", resource_type="vmss", args="--name myVMSS --resource-group myRG" List VMs: operation="list", resource_type="vm", args="--resource-group myRG" |
az_fleet | Run Azure Kubernetes Service Fleet management commands. Available operations and resources: - fleet: list, show, create, update, delete, get-credentials - member: list, show, create, update, delete - updaterun: list, show, create, start, stop, delete - updatestrategy: list, show, create, delete - clusterresourceplacement: list, show, get, create, delete (Kubernetes CRD operations) Examples: - List fleets: operation='list', resource='fleet', args='--resource-group myRG' - Show fleet: operation='show', resource='fleet', args='--name myFleet --resource-group myRG' - Get fleet credentials: operation='get-credentials', resource='fleet', args='--name myFleet --resource-group myRG' - Create member: operation='create', resource='member', args='--name myMember --fleet-name myFleet --resource-group myRG --member-cluster-id /subscriptions/.../myCluster' - Create clusterresourceplacement: operation='create', resource='clusterresourceplacement', args='--name nginx --selector app=nginx --policy PickAll' - List clusterresourceplacements: operation='list', resource='clusterresourceplacement', args='' |
az_monitoring | Unified tool for Azure monitoring and diagnostics operations for AKS clusters. Supported Operations: 1. Metrics - Query Azure Monitor metrics for AKS clusters and nodes - list: Get metric values for specific metrics - list-definitions: Get available metrics for a resource - list-namespaces: Get metric namespaces for a resource Use for: CPU usage, memory consumption, network traffic, pod counts, node health Required parameters: resource (Azure resource ID) Additional for 'list': metrics (metric names) Optional: aggregation, start-time, end-time, interval, filter 2. Resource Health - Get Azure Resource Health events for AKS clusters Use for: Cluster availability issues, platform problems, service health events Required parameters: subscription_id, resource_group, cluster_name, start_time Optional: end_time, status (Available, Unavailable, Degraded, Unknown) 3. Application Insights - Execute KQL queries against Application Insights telemetry Use for: Application performance monitoring, custom telemetry analysis, trace correlation Required parameters: subscription_id, resource_group, app_insights_name, query Optional: start_time + end_time OR timespan (not both) 4. Diagnostics - Check AKS cluster diagnostic settings configuration Use for: Verify logging is enabled, check log retention, validate diagnostic configuration Required parameters: subscription_id, resource_group, cluster_name 5. Control Plane Logs - Query AKS control plane logs Supported log categories: - kube-apiserver - kube-audit - kube-audit-admin - kube-controller-manager - kube-scheduler - cluster-autoscaler - cloud-controller-manager - guard (for authentication/authorization issues) - csi-azuredisk-controller - csi-azurefile-controller - csi-snapshot-controller - fleet-member-agent - fleet-member-net-controller-manager - fleet-mcs-controller-manager PLEASE NOTE: you need to check if the category is enabled in your cluster's diagnostic settings by using the diagnostics tool. Use This Tool When You Need To: - Monitor cluster or other Azure resource performance and usage (use metrics) - Check cluster availability and platform health (use resource_health) - Analyze application telemetry and performance (use app_insights) - Verify diagnostic logging configuration (use diagnostics) - Debug Kubernetes API server issues (use control_plane_logs with kube-apiserver) - Investigate authentication/authorization problems (use control_plane_logs with kube-audit, guard) - Troubleshoot pod scheduling issues (use control_plane_logs with kube-scheduler) - Check storage-related problems (use control_plane_logs with csi-azuredisk-controller, csi-azurefile-controller) - Analyze cluster scaling behavior (use control_plane_logs with cluster-autoscaler) - Review security audit events (use control_plane_logs with kube-audit, kube-audit-admin) Examples: metrics: - Get CPU usage: operation="metrics", query_type="list", parameters="{\"resource\":\"/subscriptions/sub-id/resourceGroups/rg/providers/Microsoft.ContainerService/managedClusters/cluster\", \"metrics\":\"node_cpu_usage_percentage\", \"aggregation\":\"Average\", \"start-time\":\"<start-time>\", \"end-time\":\"<end-time>\"}" - List available metrics: operation="metrics", query_type="list-definitions", parameters="{\"resource\":\"/subscriptions/sub-id/resourceGroups/rg/providers/Microsoft.ContainerService/managedClusters/cluster\"}" resource_health: - Check recent cluster health: operation="resource_health", subscription_id="<subscription-id>", resource_group="<resource-group>", cluster_name="<cluster-name>", parameters="{\"start_time\":\"<start-time>\"}" app_insights: - Query request telemetry: operation="app_insights", subscription_id="<subscription-id>", resource_group="<resource-group>", parameters="{\"app_insights_name\":\"myapp-insights\", \"query\":\"requests | where timestamp > ago(1h) | summarize count() by bin(timestamp, 5m)\"}" - Analyze exceptions: operation="app_insights", subscription_id="<subscription-id>", resource_group="<resource-group>", parameters="{\"app_insights_name\":\"myapp-insights\", \"query\":\"exceptions | where timestamp > ago(24h) | summarize count() by type, bin(timestamp, 1h)\"}" - Performance with timespan: operation="app_insights", subscription_id="<subscription-id>", resource_group="<resource-group>", parameters="{\"app_insights_name\":\"myapp-insights\", \"query\":\"performanceCounters | where category == 'Processor' | summarize avg(value) by bin(timestamp, 5m)\", \"timespan\":\"PT1H\"}" diagnostics: - Verify diagnostic settings: operation="diagnostics", subscription_id="<subscription-id>", resource_group="<resource-group>", cluster_name="<cluster-name>", parameters="{}" control_plane_logs: - Query API server logs: operation="control_plane_logs", subscription_id="<subscription-id>", resource_group="<resource-group>", cluster_name="<cluster-name>", parameters="{\"log_category\":\"kube-apiserver\", \"start_time\":\"<start-time>\", \"end_time\":\"<end-time>\", \"max_records\":\"50\"}" - Debug authentication issues: operation="control_plane_logs", subscription_id="<subscription-id>", resource_group="<resource-group>", cluster_name="<cluster-name>", parameters="{\"log_category\":\"guard\", \"start_time\":\"<start-time>\", \"end_time\":\"<end-time>\", \"max_records\":\"100\"}" - Analyze audit events: operation="control_plane_logs", subscription_id="<subscription-id>", resource_group="<resource-group>", cluster_name="<cluster-name>", parameters="{\"log_category\":\"kube-audit\", \"log_level\":\"error\", \"start_time\":\"<start-time>\", \"end_time\":\"<end-time>\", \"max_records\":\"50\"}" |
az_network_resources | Unified tool for getting Azure network resource information used by AKS clusters. Supported resource types: - all: Get information about all network resources - vnet: Get Virtual Network information - nsg: Get Network Security Group information - route_table: Get Route Table information - subnet: Get Subnet information - load_balancer: Get Load Balancer information - private_endpoint: Get Private Endpoint information (private clusters only) Examples: - Get all network resources: resource_type="all" - Get VNet info: resource_type="vnet" - Get NSG info: resource_type="nsg" |
get_aks_vmss_info | Get detailed VMSS configuration for a specific node pool or all node pools in the AKS cluster (provides low-level VMSS settings not available in az aks nodepool show). Leave node_pool_name empty to get info for all node pools. |
inspektor_gadget_observability | Real-time observability tool for Azure Kubernetes Service (AKS) clusters that lets users manage gadgets for monitoring and debugging. In addition to the 'action' param, it supports 'action_params' (type=object) to specify parameters for the action. Available params are: gadget_name, duration, gadget_id, chart_version. Available gadget names are: observe_dns, observe_tcp, observe_file_open, observe_process_execution, observe_signal, observe_system_calls, top_file, top_tcp. Example: {'action': 'run', 'action_params': {'gadget_name': 'observe_dns', 'duration': 10}} It also supports 'filter_params' (type=object) to filter the data captured by the gadget. Available params are: namespace, pod, container, selector, observe_process_execution.command, observe_file_open.path, top_tcp.max_entries, top_file.max_entries, observe_dns.unsuccessful_only, observe_dns.name, observe_dns.nameserver, observe_dns.minimum_latency, observe_tcp.destination_port, observe_tcp.unsuccessful_only, observe_signal.signal, observe_system_calls.syscall, observe_dns.response_code, observe_tcp.source_port, observe_tcp.event_type, observe_file_open.unsuccessful_only. Example: {'action': 'run', 'filter_params': {'namespace': 'default', 'selector': 'app=myapp', 'observe_dns.unsuccessful_only': true}} |
kubectl_cluster | Get information about the Kubernetes cluster and API. Available operations: - cluster-info: Display cluster information - api-resources: Print supported API resources - api-versions: Print supported API versions - explain: Get documentation for a resource Examples: - Cluster info: operation='cluster-info', resource='', args='' - Cluster info dump: operation='cluster-info', resource='dump', args='' - List resources: operation='api-resources', resource='', args='' - Namespaced resources: operation='api-resources', resource='', args='--namespaced=true' - Non-namespaced resources: operation='api-resources', resource='', args='--namespaced=false' - Resources by group: operation='api-resources', resource='', args='--api-group=rbac.authorization.k8s.io' - API versions: operation='api-versions', resource='', args='' - Explain pod: operation='explain', resource='pods', args='' - Explain field: operation='explain', resource='pods.spec.containers', args='' - Explain with version: operation='explain', resource='deployments', args='--api-version=apps/v1' |
kubectl_config | Work with Kubernetes configurations (read-only). Available operations: - diff: Diff the live version against what would be applied - auth: Inspect authorization (can-i) - config: View kubectl configuration contexts (read-only) Config operations (read-only): - current-context: Display the current context - get-contexts: Describe one or many contexts Examples: - Diff config: operation='diff', resource='', args='-f pod.json' - Check auth: operation='auth', resource='can-i', args='create pods --all-namespaces' - Get current context: operation='config', resource='current-context', args='' - List contexts: operation='config', resource='get-contexts', args='' |
kubectl_diagnostics | Diagnose and debug Kubernetes resources. Available operations: - logs: Print logs for a container in a pod - events: Display events - top: Display resource usage (CPU/Memory) - exec: Execute a command in a container - cp: Copy files to/from containers Examples: - Logs for default container: operation='logs', resource='', args='nginx' - Logs for specific container: operation='logs', resource='', args='nginx -c ruby-container' - Logs with selector: operation='logs', resource='', args='-l app=nginx --all-containers=true' - Get events: operation='events', resource='', args='--all-namespaces' - Get events namespace: operation='events', resource='', args='-n default' - Top pods: operation='top', resource='pod', args='' - Top nodes: operation='top', resource='node', args='' - Top with containers: operation='top', resource='pod', args='POD_NAME --containers' - Exec command: operation='exec', resource='', args='mypod -n NAMESPACE -- date' - Copy to pod: operation='cp', resource='', args='/tmp/foo_dir some-pod:/tmp/bar_dir' - Copy from pod: operation='cp', resource='', args='some-namespace/some-pod:/tmp/foo /tmp/bar' - Copy with container: operation='cp', resource='', args='/tmp/foo some-pod:/tmp/bar -c specific-container' |
kubectl_resources | View Kubernetes resources with read-only operations. Available operations: - get: Display one or many resources - describe: Show detailed information about resources Common resources: pods, deployments, services, configmaps, secrets, namespaces, etc. Examples: - Get pods: operation='get', resource='pods', args='-n default' - Get specific pod: operation='get', resource='pods', args='nginx-pod -n default' - Get with selector: operation='get', resource='pods', args='-l app=nginx' - Get all namespaces: operation='get', resource='pods', args='--all-namespaces' - Describe deployment: operation='describe', resource='deployment', args='myapp -n production' - Describe all pods: operation='describe', resource='pods', args='' - Describe with selector: operation='describe', resource='pods', args='-l name=myLabel' |
list_detectors | List all available AKS cluster detectors |
run_detector | Run a specific AKS detector |
run_detectors_by_category | Run all detectors in a specific category |
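
Several of the tools above take a flat set of string arguments, and the az_monitoring examples pass `parameters` as a JSON-encoded string. The sketch below shows one way a client might assemble such arguments in Python. The helper name `monitoring_args` is hypothetical and not part of the server. How the resulting dictionary is sent depends on the MCP client in use and is not shown here. The only rule it enforces is the one stated in the table: for app_insights, use start_time and end_time or timespan, not both.

```python
import json


def monitoring_args(operation, params, **top_level):
    """Build an argument dict for the az_monitoring tool.

    Hypothetical helper for illustration only; it is not part of the server.
    """
    if operation == "app_insights":
        has_range = "start_time" in params or "end_time" in params
        if has_range and "timespan" in params:
            # The tool description allows one or the other, not both.
            raise ValueError("use start_time/end_time or timespan, not both")
    args = {"operation": operation, **top_level}
    # The examples in the table pass `parameters` as a JSON-encoded string,
    # so json.dumps handles the quoting that the examples write by hand.
    args["parameters"] = json.dumps(params)
    return args


# Arguments mirroring the "Get CPU usage" example (time range omitted,
# since start-time and end-time are listed as optional).
metrics_call = monitoring_args(
    "metrics",
    {
        "resource": "/subscriptions/sub-id/resourceGroups/rg/providers/"
                    "Microsoft.ContainerService/managedClusters/cluster",
        "metrics": "node_cpu_usage_percentage",
        "aggregation": "Average",
    },
    query_type="list",
)

# The kubectl_* tools take plain strings, as in the "Get pods" example.
kubectl_call = {"operation": "get", "resource": "pods", "args": "-n default"}

print(json.dumps(metrics_call, indent=2))
print(json.dumps(kubectl_call, indent=2))
```

Building the string with `json.dumps` avoids hand-escaping the inner quotes, which is the most error-prone part of the examples above.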
Manual installation
You can install the MCP server using:
Installation for