Traffic Management using Istio 101

Harinderjit Singh · Published in ITNEXT · Aug 31, 2022 · 13 min read

Purpose

Recently I read a colleague's blog about how Istio bests all other ingress solutions for Kubernetes deployments. The blog mentioned that Istio has a huge feature set for traffic management, which got me interested in Istio and its capabilities. So I delved into the Istio documentation, which is comprehensive and descriptive. I am writing this to consolidate my understanding and share what I learned while playing with it. Istio has many features, but in this article we will touch on the basics of the traffic management features provided by the Istio service mesh.

Introducing Istio traffic management

Traffic management is the major feature provided by the Istio service mesh, alongside observability, security, and extensibility. In order to direct traffic within your mesh, Istio needs to know where all your endpoints are, and which services they belong to. To populate its own service registry, Istio connects to a service discovery system. For example, if you’ve installed Istio on a Kubernetes cluster, then Istio automatically detects the services and endpoints in that cluster.

While Istio’s basic service discovery and load balancing give you a working service mesh, it’s far from all that Istio can do. In many cases, you might want more fine-grained control over what happens to your mesh traffic. You might want to direct a particular percentage of traffic to a new version of the service as part of A/B testing or apply a different load balancing policy to traffic for a particular subset of service instances. You might also want to apply special rules to traffic coming into or out of your mesh or add an external dependency of your mesh to the service registry. You can do all this and more by adding your own traffic configuration to Istio using Istio’s traffic management API.

Like other Istio configurations, the API is specified using Kubernetes custom resource definitions (CRDs), which you can configure using YAML. We will examine most of the traffic management API resources and what you can do with them using an example app provided by Istio.

Istio also supports typical Kubernetes ingress resources. But using the Istio Gateway, rather than Ingress, is recommended to make use of the full feature set that Istio offers, such as rich traffic management and security features. You will use the Istio ingress gateway in your tests.

The Test Setup

  1. Create a GKE cluster: https://cloud.google.com/kubernetes-engine/docs/deploy-app-cluster#create_cluster
  2. Open Cloud Shell and authenticate against your GKE cluster so that you can use kubectl and helm from Cloud Shell.
  3. Run the below shell script in Cloud Shell:
setup-istio.sh
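The contents of setup-istio.sh are not reproduced here. Assuming a Helm-based install from the official Istio chart repository (the chart names are real; installing the gateway chart under the release name istio-ingress gives its pods the istio: ingress label used later in this article), a minimal sketch could be:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Add the official Istio Helm repository
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update

# Install the Istio CRDs and the istiod control plane into istio-system
kubectl create namespace istio-system --dry-run=client -o yaml | kubectl apply -f -
helm install istio-base istio/base -n istio-system
helm install istiod istio/istiod -n istio-system --wait

# Install a standalone ingress gateway (Envoy) in its own namespace;
# the release name "istio-ingress" yields pods labeled istio: ingress
kubectl create namespace istio-ingress --dry-run=client -o yaml | kubectl apply -f -
helm install istio-ingress istio/gateway -n istio-ingress
```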

  4. Run the below script to deploy the Bookinfo app (provided by Istio for demo purposes):

setup-biapp.sh
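setup-biapp.sh is likewise not reproduced. Assuming it creates the appbi namespace with automatic sidecar injection enabled and applies the stock Bookinfo manifest from the Istio repository (the release branch in the URL is an assumption), it might look like:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Namespace for the app, with automatic Envoy sidecar injection enabled
kubectl create namespace appbi --dry-run=client -o yaml | kubectl apply -f -
kubectl label namespace appbi istio-injection=enabled --overwrite

# Deploy the Bookinfo sample application shipped with Istio
kubectl apply -n appbi \
  -f https://raw.githubusercontent.com/istio/istio/release-1.14/samples/bookinfo/platform/kube/bookinfo.yaml
```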

Review the config using the below commands:

alias kgar="kubectl api-resources -o name --namespaced=true --verbs=list | xargs -n 1 kubectl get --ignore-not-found=true --show-kind"
kgar -n appbi
kgar -n istio-ingress
kgar -n istio-system

  5. Configure the Gateway and Virtual Service by applying the below:

bookinfo-gateway.yaml
kubectl apply -f bookinfo-gateway.yaml -n appbi

At this point, what you have configured looks like the below (this is a snapshot from my Kiali dashboard).

Fig 1 Bookinfo App graph

As you can see, the istio-ingress pod directs traffic to the productpage-v1 deployment, and productpage-v1 connects to the reviews app, which has 3 deployments, each running a different version of the app. productpage-v1 also connects to the details-v1 deployment. reviews v2 and v3 connect to the ratings-v1 app.

Gateways

You will use a gateway to manage inbound and outbound traffic for your mesh, letting you specify which traffic you want to enter or leave the mesh. Gateway configurations are applied to standalone Envoy proxies running at the edge of the mesh, i.e. istio-ingress (in the istio-ingress namespace), rather than to the sidecar Envoy proxies running alongside your service workloads. Gateway configuration resources allow external traffic to enter the Istio service mesh and make the traffic management and policy features of Istio available for edge services.

Istio’s Gateway resource just lets you configure layer 4–6 load balancing properties such as ports to expose, TLS settings, and so on. Then instead of adding application-layer traffic routing (L7) to the same API resource, you bind a regular Istio virtual service to the gateway. This lets you manage gateway traffic like any other data plane traffic in an Istio mesh.

Let's examine the Gateway CRD definition.

kubectl get gateway -n appbi -o yaml
  • The selector istio: ingress should match the labels on the istio-ingress deployment in the istio-ingress namespace (the standalone Envoy proxy).
  • This gateway configuration lets HTTP traffic from bookinfo.app.io (from host list) into the mesh on port 80, but doesn’t specify any routing for the traffic.
  • If you are using subdomains, there can be any number of hosts in the hosts list from which you allow ingress traffic into the service mesh.
  • To specify routing and for the gateway to work as intended, you must also bind the gateway to a virtual service. You do this using the virtual service’s gateways field (you will see that later).
  • You can configure TLS at the gateway as well (not covered in this article); you can check the options using the below command:
kubectl explain gateway.spec.servers.tls
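Putting the bullet points above together, the Gateway portion of bookinfo-gateway.yaml would look roughly like this (the resource name bookinfo matches the gateway referenced later; everything else follows the description above):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: bookinfo
spec:
  # Must match the labels on the standalone istio-ingress Envoy deployment
  selector:
    istio: ingress
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    # Hosts allowed to enter the mesh through this gateway
    hosts:
    - "bookinfo.app.io"
```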

NOTE: For Gateway resources to be utilized, the Istio ingress standalone Envoy proxy (the istio-ingress deployment in the istio-ingress namespace) must exist.

To study and review all options, please refer to the official guide.

Virtual Services

Virtual services, along with destination rules, are the key building blocks of Istio’s traffic routing functionality. A virtual service lets you configure how requests are routed to a service within an Istio service mesh. Each virtual service consists of a set of routing rules that are evaluated in order, letting Istio match each given request to the virtual service to a specific real destination within the mesh. Routing rules are evaluated in sequential order from top to bottom, with the first rule in the virtual service definition being given the highest priority. Your mesh can require multiple virtual services or none depending on your use case.

Virtual Services are very useful where you might want to configure traffic routes based on percentages across different service versions or to direct traffic from your internal users to a particular set of instances. You can read more about how virtual services help with canary deployments in Canary Deployments using Istio.

With a virtual service, you can specify traffic behavior for one or more hostnames. You use routing rules in the virtual service that tell Envoy how to send the virtual service’s traffic to appropriate destinations. Route destinations can be versions of the same service or entirely different services.

Let's examine the virtual service CRD definition that you deployed.

kubectl get virtualservice -n appbi -o yaml
  • The hosts field lists the virtual service’s hosts - this is the address or addresses the client uses when sending requests to the service. It is “bookinfo.app.io” in this case.
  • The gateways field lists the gateway objects associated with the virtual service. You associated this virtual service with the gateway you defined earlier, “bookinfo”.
  • The http section contains the virtual service’s routing rules, describing match conditions and actions for routing HTTP/1.1, HTTP2, and gRPC traffic sent to the destination(s) specified in the hosts field.

Routing rules

  • There are 2 routing rules, as you can see.
  • The first one has a match condition: if the request URI is “/productpage”, “/static”, “/login”, etc., route the request to the destination service “productpage.appbi.svc.cluster.local” at port 9080. The Envoy proxy will then take over and load-balance the requests across the pods backing the service “productpage.appbi.svc.cluster.local”. This rule applies when a client tries to reach http://bookinfo.app.io/productpage.
  • The second one doesn't have a match condition, so it acts as the default route for all requests without a matching URI context path. The route destination is the same as in the first rule, but there is an additional “rewrite” map. “rewrite” rewrites the URI before forwarding to the destination. So if a client tries to reach http://bookinfo.app.io/, the Envoy proxy sends the request to the pods backing the service “productpage.appbi.svc.cluster.local” at port 9080, with the URI context path rewritten to “/productpage”.
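Based on the two routing rules described above, the VirtualService in bookinfo-gateway.yaml would look roughly like the following (the exact list of matched URIs is an assumption drawn from the stock Bookinfo sample):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: bookinfo
spec:
  hosts:
  - "bookinfo.app.io"
  # Bind this virtual service to the "bookinfo" gateway defined earlier
  gateways:
  - bookinfo
  http:
  # Rule 1: match well-known URIs and route them to productpage
  - match:
    - uri:
        exact: /productpage
    - uri:
        prefix: /static
    - uri:
        exact: /login
    - uri:
        exact: /logout
    route:
    - destination:
        host: productpage.appbi.svc.cluster.local
        port:
          number: 9080
  # Rule 2: default route; rewrite the URI to /productpage before forwarding
  - rewrite:
      uri: /productpage
    route:
    - destination:
        host: productpage.appbi.svc.cluster.local
        port:
          number: 9080
```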

We earlier talked about using virtual services to configure routes based on percentages across different service versions, but you have not done anything like that so far. Since the “reviews” app has 3 deployments, each running a different version of the app, you can use a new virtual service to define percentage-based routes.

vs-reviews.yaml
kubectl apply -f vs-reviews.yaml -n appbi

This will create a new virtual service “reviews”.

  • There is no gateway associated with it because this is for traffic within the service mesh.
  • There are no conditions to match, so it applies to all the traffic directed to the reviews app.
  • There is one routing rule which defines 3 different destination routes. Each destination has a different weight (in terms of percentage). Although the host field (different from the one under virtualservice.spec.hosts) is the same for all destinations, observe that the “subset” is different for each and is associated with a different version of the same app. Subset definitions are covered in the DestinationRule CRD (covered later). This rule basically says: 50% of traffic to the “reviews” service should be directed to subset v1, 40% to subset v2, and 10% to subset v3.
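vs-reviews.yaml itself is not reproduced above. Reconstructing it from the weights described here, plus the 10-second timeout and 3-retry/2-second per-try settings the resilience section mentions it contains, a sketch could be:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  # No gateways list: this applies to in-mesh traffic only
  hosts:
  - reviews
  http:
  - timeout: 10s          # overall timeout for each request
    retries:
      attempts: 3         # up to 3 retries after an initial failure
      perTryTimeout: 2s   # each attempt times out after 2 seconds
    # One routing rule, three weighted destinations
    route:
    - destination:
        host: reviews
        subset: v1
      weight: 50
    - destination:
        host: reviews
        subset: v2
      weight: 40
    - destination:
        host: reviews
        subset: v3
      weight: 10
```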

For the above to work, you need to define a destination rule where you will define what subset v1, v2, and v3 mean.

As you saw above, routing rules are a powerful tool for routing particular subsets of traffic to particular destinations. You can set match conditions on traffic ports, header fields, URIs, and more. A complete reference for all options can be found in the official documentation.

Destination Rules

Along with virtual services, destination rules are a key part of Istio’s traffic routing functionality. You can think of virtual services as how you route your traffic to a given destination, and then you use destination rules to configure what happens to traffic for that destination. Destination rules are applied after virtual service routing rules are evaluated, so they apply to the traffic’s “real” destination.

In particular, you use destination rules to specify named service subsets, such as grouping all a given service’s instances by version. You can then use these service subsets in the routing rules of virtual services to control the traffic to different instances of your services.

Let's do it for the reviews app.

dr-reviews.yaml
kubectl apply -f dr-reviews.yaml -n appbi

Let’s examine the destination rule CRD definition that you deployed.

  • You have defined the subsets list. Each list item is a map where you define the labels for that subset.
  • Each subset is defined based on one or more labels, which in Kubernetes are key/value pairs attached to objects such as Pods. These labels are applied in the Kubernetes deployment as metadata to identify different versions. The labels effectively act as selectors for the destination pods, so you can check which pods traffic will be directed to for a particular subset. For example, to find the pods for subset v1, you can use the below command:
kubectl get pod -n appbi --selector=version=v1,app=reviews
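Reconstructed from the description above (the subset labels, plus the 100-connection limit discussed later under circuit breakers), dr-reviews.yaml would look roughly like:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100   # circuit breaker: cap concurrent connections
  # Named subsets; the labels select the destination pods by version
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
  - name: v3
    labels:
      version: v3
```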

Now that you have deployed the VirtualService and DestinationRule for the reviews app, you can test it.

kubectl get svc istio-ingress -n istio-ingress -o jsonpath='{.status.loadBalancer.ingress[0].ip}';echo ""

Use the above command to get the public IP of the load balancer and add it to the hosts file on your client machine as “bookinfo.app.io”. Observe that the Book Reviews section displays a different UI depending on the version: v1 without stars, v2 with black stars, and v3 with red stars. Notice which UI is displayed each time you refresh the app page.

This feature can be leveraged for canary deployments as shown in the official Istio blog.

You can see a complete list of destination rule options in the Destination Rule reference.

Ingress request flow

The below diagram shows how the traffic flows from the client to the pods via the LoadBalancer, Istio Ingress gateway, and service mesh serving the web application.

IstioIngressGateway

Traffic enters the mesh as soon as it reaches the standalone envoy proxy running as istio-ingress.

Service entries

You use a service entry to add an entry to the service registry that Istio maintains internally. After you add the service entry, the Envoy proxies can send traffic to the service as if it was a service in your mesh.

Configuring service entries allows you to manage traffic for services running outside of the mesh, including the following tasks:

  • Redirect and forward traffic for external destinations, such as APIs consumed from the web, or traffic to services in legacy infrastructure.
  • Define retry, timeout, and fault injection policies for external destinations.
  • Run a mesh service in a Virtual Machine (VM) by adding VMs to your mesh.

serviceEntry.yaml
kubectl apply -f serviceEntry.yaml -n appbi

In this service entry example, you are adding a service entry for googleapis.com and then using the external service endpoint in the virtual service and destination rule definitions as if the service were part of the service mesh.
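serviceEntry.yaml is not reproduced above; a minimal sketch for an external HTTPS API such as googleapis.com (the hostname, port, and resolution settings here are assumptions) could be:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: googleapis
spec:
  hosts:
  - "www.googleapis.com"
  location: MESH_EXTERNAL   # the workloads live outside the mesh
  resolution: DNS
  ports:
  - number: 443
    name: https
    protocol: TLS
```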

Sidecars

By default, Istio configures every Envoy proxy to accept traffic on all the ports of its associated workload, and to reach every workload in the mesh when forwarding traffic. You can use a sidecar configuration to do the following:

  • Fine-tune the set of ports and protocols that an Envoy proxy accepts.
  • Limit the set of services that the Envoy proxy can reach.

You might want to limit sidecar reachability like this in larger applications, where having every proxy configured to reach every other service in the mesh can potentially affect mesh performance due to high memory usage.

Sidecar.yaml
kubectl apply -f Sidecar.yaml -n appbi

You can specify that you want a sidecar configuration to apply to all workloads in a particular namespace, or choose specific workloads using a workloadSelector. In the above sidecar configuration, all services in the appbi namespace are configured to only reach services running in the same namespace and the Istio control plane (needed by Istio’s egress and telemetry features).
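A Sidecar resource matching that description (namespace-wide, since no workloadSelector is set) might look like:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: default
  namespace: appbi
spec:
  egress:
  - hosts:
    - "./*"            # services in the same (appbi) namespace
    - "istio-system/*" # the Istio control plane
```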

NOTE: Resources such as Virtual Service, Destination Rule, Service Entry, and Sidecar don’t need Istio ingress standalone Envoy proxy (istio-ingress deployment in istio-ingress namespace).

Network resilience and testing

As well as helping you direct traffic around your mesh, Istio provides opt-in failure recovery and fault injection features that you can configure dynamically at runtime.

Timeouts

A timeout is the amount of time that an Envoy proxy should wait for replies from a given service, ensuring that services don’t hang around waiting for replies indefinitely and that calls succeed or fail within a predictable timeframe. The Envoy timeout for HTTP requests is disabled in Istio by default.

For some applications and services, Istio’s default timeout might not be appropriate. To find and use your optimal timeout settings, Istio lets you easily adjust timeouts dynamically on a per-service basis using virtual services without having to edit your service code.

In vs-reviews.yaml, you specified a 10-second timeout for all the destination routes.

Retries

A retry setting specifies the maximum number of times an Envoy proxy attempts to connect to a service if the initial call fails. Retries can enhance service availability and application performance by making sure that calls don’t fail permanently because of transient problems such as a temporarily overloaded service or network. The interval between retries (25ms+) is variable and determined automatically by Istio, preventing the called service from being overwhelmed with requests. The default retry behavior for HTTP requests is to retry twice before returning the error.

Like timeouts, Istio’s default retry behavior might not suit your application needs in terms of latency (too many retries to a failed service can slow things down) or availability. Also like timeouts, you can adjust your retry settings on a per-service basis in virtual services without having to touch your service code.

In vs-reviews.yaml, you configured a maximum of 3 retries to connect to this service subset after an initial call failure, each with a 2-second timeout.

Circuit breakers

Circuit breakers are another useful mechanism Istio provides for creating resilient microservice-based applications. In a circuit breaker, you set limits for calls to individual hosts within a service, such as the number of concurrent connections or how many times calls to this host have failed. Once that limit has been reached the circuit breaker “trips” and stops further connections to that host. Using a circuit breaker pattern enables fast failure rather than clients trying to connect to an overloaded or failing host.

In dr-reviews.yaml, you limited the number of concurrent connections for the reviews service workloads of all subsets to 100.

Fault injection

You can use Istio’s fault injection mechanisms to test the failure recovery capacity of the application as a whole. Fault injection is a testing method that introduces errors into a system to ensure that it can withstand and recover from error conditions. Using fault injection can be particularly useful to ensure that your failure recovery policies aren’t incompatible or too restrictive, potentially resulting in critical services being unavailable.

Currently, the fault injection configuration cannot be combined with retry or timeout configuration on the same virtual service; see Traffic Management Problems.

faultInjection.yaml
kubectl apply -f faultInjection.yaml -n appbi

In this example, you deployed a virtual service for the details-v1 app and a destination rule for the same. In the virtual service definition, you introduced a fault map under spec. In that fault map, you introduced an error with HTTP code 555 for 100% of the incoming requests.
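The virtual service portion of faultInjection.yaml, reconstructed from that description (the subset name v1 assumes the accompanying destination rule defines it), would look roughly like:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: details
spec:
  hosts:
  - details
  http:
  - fault:
      abort:
        percentage:
          value: 100      # inject the fault for 100% of requests
        httpStatus: 555   # respond with HTTP 555
    route:
    - destination:
        host: details
        subset: v1
```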

You can see the error when you access the URL http://bookinfo.app.io/ or if you watch logs for productpage-v1 pod using the below command.

kubectl logs -n appbi --selector=app=productpage -f

Fault injected

Using Nginx Ingress Controller and Istio Traffic management together

My question would be: why would you do it? Well, I tested it out of curiosity, and it works. I was able to access the Bookinfo app using the Nginx ingress as well. The request enters the mesh once the productpage pod tries to access the other services. From that point, the Envoy proxy still takes over and applies all the configuration you defined in virtual services (those not associated with the ingress gateway), destination rules, service entries, and sidecars. It's just that the request didn't enter the mesh via the Istio ingress gateway, so you can't enjoy the features associated with the Istio Ingress Gateway.

The End

Well, not really! This is actually a very basic, introductory 101-type article, something to get started with if you are interested in Istio's traffic management capabilities. Istio's official documentation is very comprehensive, with a lot of examples. As I said at the beginning of this article, the purpose is to touch on the basics of Istio's traffic management capabilities, which are indeed useful but complex as well.

Also, Istio has other valuable capabilities which you might be interested in. You can also consider a managed solution like Anthos service mesh which is powered by Istio.

Note: Suppose you have Istio sidecar injection enabled for a namespace and you want to create a pod that should not have the Istio sidecar injected. You can add the below annotation to the pod manifest to disable sidecar injection for that pod.

annotations:
  sidecar.istio.io/inject: "false"

How can we use WAF with Istio? Here is my article about Using the Application gateway WAF with Istio.

Please read my other articles as well and share your feedback. If you like the content shared please like, comment, and subscribe for new articles.


Technical Solutions Developer (GCP). Writes about significant learnings and experiences at work.