devops
OpenShift Service Mesh 2.x/3.x with F5 BIG-IP
Overview

OpenShift Service Mesh (OSSM) is Red Hat’s packaged version of Istio Service Mesh. Istio has the Ingress Gateway component to handle incoming traffic from outside of the cluster. Like other ingress controllers, it requires an external load balancer to get the traffic into the ingress PODs. This follows the canonical Kubernetes 2-tier arrangement for getting the traffic inside the cluster. This is depicted in the next figure:

This article covers how to configure OpenShift Service Mesh 2.x/3.x and expose it to the BIG-IP, and how to properly monitor its health, either with or without BIG-IP’s Container Ingress Services (CIS).

Exposing OSSM in BIG-IP - VIP configuration

It is the customer’s choice how to publish OSSM in the BIG-IP:

- A Layer 4 (L4) Virtual Server is simpler, and certificate management is done in OpenShift. The advantages of this mode are potentially higher performance and scalability, including connection mirroring, although mirroring is not usually used for HTTP traffic because HTTP applications typically retry. Connection persistence is limited to the source IP. When using CIS, this is done with a TransportServer CR, which creates a fastL4 type virtual server in the BIG-IP.
- A Layer 7 (L7) Virtual Server requires additional configuration because TLS termination is required. In this mode, OpenShift can take advantage of BIG-IP’s TLS off-loading capabilities and Hardware/Network/SaaS/Cloud HSM integrations, which store private keys securely, including FIPS level support. Working at L7 also allows per-application traffic management, including header and payload rewrites, cookie persistence, etc. It also allows per-application multi-cluster. The above features are provided by the LTM (load balancing) module in BIG-IP. The possibilities are further expanded when using modules such as ASM (Advanced WAF) and Access (authentication).
  When using CIS, this is done with a VirtualServer CR, which creates a standard-type virtual server in the BIG-IP.

Exposing OSSM to BIG-IP - pool configuration

There are two options to expose Istio Ingress Gateways to BIG-IP:

- Using ClusterIP addresses: these are POD IPs, which are dynamic. This requires CIS to discover the IP addresses of the Ingress Gateway PODs.
- Using NodePort addresses: these are reachable from the outside network. With these, CIS is not strictly necessary, but it is recommended.

Exposing OpenShift Service Mesh using ClusterIP

This requires the use of CIS with the following parameters:

  --orchestration-cni=ovn
  --static-routing-mode=true

These make CIS create IP routes in the BIG-IP for reaching the POD IPs inside the OpenShift cluster. Please note that this only works if all the OpenShift nodes are directly connected in the same subnet as the BIG-IP.

Additionally, the following parameter is required. It is the one that actually makes CIS populate pool members with Cluster (POD) IPs:

  --pool-member-type=cluster

No configuration change is needed in OSSM because ClusterIP mode is the default mode in Istio Ingress Gateways.

Exposing OpenShift Service Mesh using NodePort

Using NodePort provides known IP addresses for the Ingress Gateways, reachable from outside the cluster. Note that when using NodePort, only one Ingress Gateway replica will run per node.

The behavior of NodePort varies with the externalTrafficPolicy field:

- Using the Cluster value, any OpenShift node will accept traffic and will redirect it to any node that has an Ingress Gateway POD, in a load balancing fashion. This is the easiest to set up, but because each request might go to a different node, health checking is not reliable (it is not known which POD goes down).
- Using the Local value, only the OpenShift nodes that have Ingress Gateway PODs will accept traffic.
  The traffic will be delivered to the local Ingress Gateway PODs, without further indirection. This is the recommended way when using NodePort because of its deterministic behaviour and therefore reliable health checking.

The following describes how to set up a NodePort using the Local externalTrafficPolicy. There are two options for configuring OSSM:

- Using the ServiceMeshControlPlane CR method: this is the default method in OSSM 2.x for backwards compatibility, but it doesn’t allow fine-tuning the configuration of the proxy. See this OSSM 2.x link for further details. This is deprecated and not available in OSSM 3.x.
- Using the Gateway injection method: this is the only method possible in OSSM 3.x and the current recommendation from Red Hat for OSSM 2.x. This method allows you to tune the proxy settings. This tuning is of special interest because at present the Ingress Gateway doesn’t have good default values for reliable health checking. These will be discussed in the Health Checking section.

When using the ServiceMeshControlPlane CR method, the above will be configured as follows:

  apiVersion: maistra.io/v2
  kind: ServiceMeshControlPlane
  [...]
  spec:
    gateways:
      ingress:
        enabled: false
        runtime:
          deployment:
            replicas: 2
        service:
          externalTrafficPolicy: Local
          ports:
          - name: status-port
            nodePort: 30021
            port: 15021
            targetPort: 15021
          - name: http2
            nodePort: 30080
            port: 80
            targetPort: 8080
          - name: https
            nodePort: 30443
            port: 443
            targetPort: 8443
          type: NodePort

When using the Gateway injection method (recommended), the Service definition is manually created analogously to the ServiceMeshControlPlane CR:

  apiVersion: v1
  kind: Service
  [...]
  spec:
    externalTrafficPolicy: Local
    type: NodePort
    ports:
    - name: status-port
      nodePort: 30021
      port: 15021
      protocol: TCP
      targetPort: 15021
    - name: http2
      nodePort: 30080
      port: 80
      protocol: TCP
      targetPort: 8080
    - name: https
      nodePort: 30443
      port: 443
      protocol: TCP
      targetPort: 8443

Here the ports section is optional but recommended in order to have deterministic ports, and required when not using CIS (because manual pool configuration requires static ports). The nodePort values can be customised. When not using CIS, the pool members must be configured manually in the BIG-IP.

It is typical in OpenShift to have the Ingress components (OpenShift Router or Istio) in dedicated infra nodes. See this Red Hat solution for details. When using the ServiceMeshControlPlane method, the configuration is as follows:

  apiVersion: maistra.io/v2
  kind: ServiceMeshControlPlane
  [...]
  spec:
    runtime:
      defaults:
        pod:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
          tolerations:
          - effect: NoSchedule
            key: node-role.kubernetes.io/infra
            value: reserved
          - effect: NoExecute
            key: node-role.kubernetes.io/infra
            value: reserved

When using the Gateway injection method, the configuration is added to the Deployment file directly:

  apiVersion: apps/v1
  kind: Deployment
  [...]
  spec:
    template:
      metadata:
      spec:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          value: reserved
        - effect: NoExecute
          key: node-role.kubernetes.io/infra
          value: reserved

The configuration above is also a good practice when using CIS. Additionally, CIS by default adds all node IPs to the Service pool regardless of whether the externalTrafficPolicy is set to Cluster or Local. The health check will discard nodes where there are no Ingress Gateways.
The set of nodes discovered by CIS can be limited with the following parameter:

  --node-label-selector

Health Checking and retries for the Ingress Gateway

Ingress Gateway Readiness

The Ingress Gateway has the following readinessProbe for Kubernetes’ own health checking:

  readinessProbe:
    failureThreshold: 30
    httpGet:
      path: /healthz/ready
      port: 15021
      scheme: HTTP
    initialDelaySeconds: 1
    periodSeconds: 2
    successThreshold: 1
    timeoutSeconds: 3

The failureThreshold value of 30 is far too large: the Ingress Gateway is only marked as not Ready after 90 seconds (tested to be failureThreshold * timeoutSeconds). This article recommends marking down an Ingress Gateway no later than 16 seconds. When using CIS, Kubernetes signals whenever a POD is not Ready and CIS automatically removes its associated pool member from the pool. To mark down the Ingress Gateway within 16 seconds, the default failureThreshold value must be changed in the Deployment file by adding the following snippet:

  apiVersion: apps/v1
  kind: Deployment
  [...]
  spec:
    template:
      metadata:
      spec:
        containers:
        - name: istio-proxy
          image: auto
          readinessProbe:
            failureThreshold: 5
            httpGet:
              path: /healthz/ready
              port: 15021
              scheme: HTTP
            initialDelaySeconds: 1
            periodSeconds: 2
            successThreshold: 1
            timeoutSeconds: 3

This keeps all other values equal and sets failureThreshold to 5, therefore marking down the Ingress Gateway after 15 seconds.

When not using CIS, an HTTP health check has to be configured manually in the BIG-IP. An example health check monitor is shown next:

Connection draining

When an Ingress Gateway POD is deleted (because of an upgrade, scale-down event, etc.), it immediately returns HTTP 503 in the /healthz/ready endpoint and keeps serving connections until it is effectively deleted. This is called the drain period and by default it is extremely short (3 seconds) for any external load balancer.
This value has to be increased so that the Ingress Gateway PODs being deleted continue serving connections until the POD is removed from the external load balancer (the BIG-IP) and the outstanding connections are finalised. This setting can only be tuned using the Gateway injection method, and it is applied by adding the following snippet in the Deployment file:

  apiVersion: apps/v1
  kind: Deployment
  [...]
  spec:
    template:
      metadata:
        annotations:
          proxy.istio.io/config: |
            terminationDrainDuration: 45s

The example above uses the default drain period of the OpenShift Router (45 seconds). The value can be customised, keeping in mind that:

- When using CIS, it should allow CIS to update the configuration in the BIG-IP and drain the connections.
- When not using CIS, it should allow the health check to detect the condition of the POD and drain the connections.

Additional recommendations

The following recommendations apply to any ingress controller or API manager and have been previously suggested when using OpenShift Router.

Handle non-graceful errors with the pool’s reselect tries

To deal better with non-graceful shutdowns or transient errors, this mechanism will reselect a new Ingress Gateway POD when a request fails. The recommendation is to set the number of tries to the number of Ingress Gateway PODs minus 1. When using CIS, this can be set in the VirtualServer or TransportServer CRs with the reselectTries parameter.

Set an additional TCP monitor for the Ingress Gateway’s application traffic sockets

This complementary TCP monitor (for both HTTP and HTTPS listeners) validates that Ready instances can actually receive traffic in the application’s traffic sockets. Although this is handled with the reselect tries mechanism, this monitor provides visibility that such errors are happening.

Conclusion and closing remarks

We hope this article highlights the most important aspects of integrating OpenShift Service Mesh with BIG-IP.
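The sizing rules from the Health Checking and Additional recommendations sections can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the class and function names are hypothetical, the mark-down formula is the one this article reports from testing (failureThreshold * timeoutSeconds) rather than a general Kubernetes guarantee, and the drain check with its 20-second connection allowance is an assumed rule of thumb, not something defined by OSSM, CIS or BIG-IP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadinessProbe:
    """Subset of the readinessProbe fields discussed in the article."""
    failure_threshold: int
    period_seconds: int
    timeout_seconds: int


def mark_down_seconds(probe: ReadinessProbe) -> int:
    # Observed behaviour reported in the article:
    # mark-down time = failureThreshold * timeoutSeconds.
    return probe.failure_threshold * probe.timeout_seconds


def reselect_tries(gateway_pods: int) -> int:
    # Article recommendation: number of Ingress Gateway PODs minus 1.
    if gateway_pods < 1:
        raise ValueError("at least one Ingress Gateway POD is required")
    return gateway_pods - 1


def drain_is_sufficient(drain_seconds: int,
                        detection_seconds: int,
                        connection_finish_seconds: int) -> bool:
    # Hypothetical sanity check: the drain period should cover the time the
    # load balancer needs to stop sending traffic plus the time allowed for
    # outstanding connections to finish.
    return drain_seconds >= detection_seconds + connection_finish_seconds


if __name__ == "__main__":
    default_probe = ReadinessProbe(failure_threshold=30, period_seconds=2, timeout_seconds=3)
    tuned_probe = ReadinessProbe(failure_threshold=5, period_seconds=2, timeout_seconds=3)

    print(mark_down_seconds(default_probe))  # 90
    print(mark_down_seconds(tuned_probe))    # 15
    print(reselect_tries(3))                 # 2
    # 20 seconds for connections to finish is an assumed, illustrative figure.
    print(drain_is_sufficient(45, mark_down_seconds(tuned_probe), 20))  # True
    print(drain_is_sufficient(3, mark_down_seconds(tuned_probe), 20))   # False
```

With the tuned probe and a 45-second drain period the check passes, while the 3-second default drain period does not leave room for detection, which matches the reasoning behind the two defaults this article recommends changing.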
A key aspect of a reliable Ingress Gateway integration is to modify OpenShift Service Mesh’s terminationDrainDuration and readinessProbe.failureThreshold defaults. F5 has submitted RFE 04270713 to Red Hat to improve these. This article will be updated accordingly.

Whether CIS integration is used or not, BIG-IP allows you to expose OpenShift Service Mesh reliably with extensive L4-L7 security and traffic management capabilities. It also allows fine-grained access control, scalable SNAT or keeping the original source IP, among others. Overall, BIG-IP is able to fulfill any requirement. We look forward to hearing your experience and feedback on this article.

How I did it - "High-Performance S3 Load Balancing of Dell ObjectScale with F5 BIG-IP"
As AI and data-driven workloads grow, enterprises need scalable, high-performance, and resilient storage. Dell ObjectScale delivers with its cloud-native, S3-compatible design, ideal for AI/ML and analytics. F5 BIG-IP LTM and DNS enhance ObjectScale by providing intelligent traffic management and global load balancing, ensuring consistent performance and availability across distributed environments. This article introduces Dell ObjectScale and its integration with F5 solutions for advanced use cases.

Breaking Down the Quantum Challenge: Why Post-Quantum Cryptography Can't Wait
The Quantum Challenge is Now

Post-quantum cryptography represents the next step of our digital security evolution. Sure, quantum systems capable of breaking current encryption may still be a few years away, but those beginning their transition now will be well-positioned for when the crypto hits the fan. Nation-state adversaries and sophisticated private entities may be collecting data today hoping to decrypt it tomorrow, so it is never too early to start solving the problem. It's an excellent time to get ahead of the curve with quantum-resistant cryptography.

What does this mean for your organization? Any sensitive data encrypted today using standard methods (RSA, ECDSA) could potentially become readable to future quantum-powered attackers.

F5 Community Evangelist Chase Abbott discusses the real-world implications of quantum computing, and how you can prepare and migrate to NIST-approved hybrid PQC standards.

The transition to post-quantum cryptography represents a perfect opportunity to modernize enterprise PKI practices. Those of you who begin planning today have ample time to implement these changes thoughtfully and strategically, positioning yourselves as leaders in the next generation of cybersecurity; high fives all around.
The Business Impact: Beyond Technical Considerations

Regulatory and Compliance Pressure

Government regulations across the globe are creating concrete deadlines for migration strategies:

- NSA CNSA 2.0 mandates quantum-resistant algorithms for classified systems by 2030
- NIST has standardized post-quantum cryptography algorithms (FIPS 203, 204, 205)
- Industry regulations in finance, healthcare, and defense are beginning to incorporate quantum-safety requirements adhering to the updated FIPS governance

Your Quantum-Ready Roadmap: A Manageable Transition

Phase 1: Assessment and Inventory

Action items for leadership:

- Conduct cryptographic inventory across all systems and applications
- Identify critical data requiring long-term protection
- Assess vendor and third-party quantum readiness
- Establish quantum cryptography governance and budget allocation

Phase 2: Pilot Implementation

Strategic focus areas:

- Deploy quantum-resistant algorithms in non-critical environments
- Train IT and security teams on post-quantum cryptography
- Establish partnerships with quantum-ready technology vendors
- Begin updating security policies and procedures

Phase 3: Production Migration

Enterprise-wide deployment:

- Implement hybrid classical/quantum-resistant systems and software
- Migrate critical applications and PKI aggregation points to quantum-safe algorithms
- Update business continuity and disaster recovery plans
- Achieve full compliance with regulatory requirements as a priority over other systems

Key Takeaways for Business Leaders

- Start planning now: The quantum threat timeline is uncertain, but the need for preparation is immediate
- Prioritize critical assets: Focus initial efforts on protecting your most sensitive and long-lived data
- Invest in capabilities: Quantum cryptography expertise will become as essential as any other IT security skill
- Engage stakeholders: Quantum security requires coordination across IT, compliance, procurement, and business units
- Monitor developments: Stay informed about quantum computing advances and regulatory updates

Mahalo!

Further Reading:

- Post Quantum Cryptography Coalition: PQC Migration Roadmap
- Post Quantum Cryptography Coalition: International PQC Requirements
- Post Quantum Cryptography Coalition: Inventory Workbook
- Essence of Linear Algebra
- Quantum Computing for the Very Curious
- Looking Glass Universe: Why I Left Quantum Computing Research
- US National Quantum Initiative

A Simple One-way Generic MRF Implementation to load balance syslog message
The BIG-IP Generic Message Protocol implements a protocol filter compatible with MRF (Message Routing Framework). MRF is designed to implement the most complex use cases, but it can be daunting if you need to create a simple configuration. This article provides a simple baseline to understand the relationships of the MRF components and how they can be combined for a simple one-way implementation. A production implementation will in most cases be more complex.

The following virtual, profiles and iRules load balance a one-way stream of newline-delimited messages (in this case syslog) to a pool of message consumers. The messages will be parsed and distributed with a simple MLB protocol. Response traffic will not be returned to the client with this configuration.

To implement this we will need these configuration objects:

- Virtual Server - Accepts incoming traffic and configures the Generic Protocol
- Generic Protocol - Defines message parsing
- Generic Router - Configures message routing and points to the Generic Route
- Generic Route - Points to a Generic Peer
- Generic Peer - Defines the LTM pool members and points to the Generic Transport Config
- Generic Transport Config - Defines the server-side protocol and server-side iRule
- iRule - Defines the message peers (connections in the message streams)

In this case we have a single client sending messages to a virtual server; the messages are then distributed to 3 pool members. Each message will be sent to one pool member only.

This can only be configured from the CLI, and the official F5 recommendation is to not make any changes to the virtual server in the web GUI. This was tested with BIG-IP 12.1.3.5 and 14.1.2.6.

Here is the virtual with a tcp profile and the required protocol and routing profiles, along with an iRule to set up the connection peer on the client side.
ltm virtual /Common/mrftest_simple {
    destination /Common/10.10.20.201:515
    ip-protocol tcp
    mask 255.255.255.255
    profiles {
        /Common/simple_syslog_protocol { }
        /Common/simple_syslog_router { }
        /Common/tcp { }
    }
    rules {
        /Common/mrf_simple
    }
    source 0.0.0.0/0
    source-address-translation {
        type automap
    }
    translate-address enabled
    translate-port enabled
}

The first profile is the protocol. The only difference from the default protocol (genericmsg) is that the field no-response must be set to yes if this is a one-way stream. Otherwise the server side will allocate buffers for return traffic, which will cause severe free memory depletion.

ltm message-routing generic protocol simple_syslog_protocol {
    app-service none
    defaults-from genericmsg
    description none
    disable-parser no
    max-egress-buffer 32768
    max-message-size 32768
    message-terminator %0a
    no-response yes
}

The Generic Router profile points to a Generic Route:

ltm message-routing generic router simple_syslog_router {
    app-service none
    defaults-from messagerouter
    description none
    ignore-client-port no
    max-pending-bytes 23768
    max-pending-messages 64
    mirror disabled
    mirrored-message-sweeper-interval 1000
    routes {
        simple_syslog_route
    }
    traffic-group traffic-group-1
    use-local-connection yes
}

The Generic Route points to the Generic Peer:

ltm message-routing generic route simple_syslog_route {
    peers {
        simple_syslog_peer
    }
}

The Generic Peer configures the server pool and points to the Generic Transport Config. Note that the pool is configured here instead of the more common configuration in the virtual server.

ltm message-routing generic peer simple_syslog_peer {
    pool mrfpool
    transport-config simple_syslog_tcp_tc
}

The Generic Transport Config also has the Generic Protocol configured, along with the iRule to set up the server-side peers.
ltm message-routing generic transport-config simple_syslog_tcp_tc {
    ip-protocol tcp
    profiles {
        simple_syslog_protocol { }
        tcp { }
    }
    rules {
        mrf_simple
    }
}

An iRule must be configured on both the Virtual Server and the Generic Transport Config. The same iRule is referenced in the rules section of both objects.

ltm rule /Common/mrf_simple {
    when CLIENT_ACCEPTED {
        GENERICMESSAGE::peer name "[IP::local_addr]:[TCP::local_port]_[IP::remote_addr]:[TCP::remote_port]"
    }
    when SERVER_CONNECTED {
        GENERICMESSAGE::peer name "[IP::local_addr]:[TCP::local_port]_[IP::remote_addr]:[TCP::remote_port]"
    }
}

This example is from a use case where a single syslog client was load balanced to multiple syslog server pool members. Messages are parsed with the newline (0x0a) character as configured in the generic protocol, but this can easily be adapted to other message types.

BIG-IP Next for Kubernetes Nvidia DPU deployment walkthrough
Introduction

Modern AI factories, hyperscale environments powering everything from generative AI to autonomous systems, are pushing the limits of traditional infrastructure. As these facilities process exabytes of data and demand near-real-time communication between thousands of GPUs, legacy CPUs struggle to balance application logic with infrastructure tasks like networking, encryption, and storage management. Data Processing Units (DPUs) are purpose-built accelerators that offload these housekeeping tasks, freeing CPUs and GPUs to focus on what they do best.

DPUs are specialized system-on-chip (SoC) devices designed to handle data-centric operations such as network virtualization, storage processing, and security enforcement. By decoupling infrastructure management from computational workloads, DPUs reduce latency, lower operational costs, and enable AI factories to scale horizontally.

BIG-IP Next for Kubernetes and Nvidia DPU

For F5 to deliver and secure every app, it needs to be deployed at multiple levels, a crucial one being the edge and the DPU. Installing F5 BIG-IP Next for Kubernetes on an Nvidia DPU requires Nvidia’s DOCA framework to be installed.

What’s DOCA?

NVIDIA DOCA is a software development kit for NVIDIA BlueField DPUs. BlueField provides data center infrastructure-on-a-chip, optimized for high-performance enterprise and cloud computing. DOCA is the key to unlocking the potential of the NVIDIA BlueField data processing unit (DPU) to offload, accelerate, and isolate data center workloads. With DOCA, developers can program the data center infrastructure of tomorrow by creating software-defined, cloud-native, GPU-accelerated services with zero-trust protection.

Now, let's explore the BIG-IP Next for Kubernetes components. The BIG-IP Next for Kubernetes solution has two main parts: the Data Plane - Traffic Management Micro-kernel (TMM) and the Control Plane.
The Control Plane watches over the Kubernetes cluster and updates the TMM’s configurations. The BIG-IP Next for Kubernetes Data Plane (TMM) manages the network traffic both entering and leaving the Kubernetes cluster. It also proxies the traffic to applications running in the Kubernetes cluster.

The Data Plane (TMM) runs on the BlueField-3 Data Processing Unit (DPU) node. It uses all the DPU resources to handle the traffic and frees up the Host (CPU) for applications. The Control Plane can run on the CPU or on other nodes in the Kubernetes cluster. This ensures that the DPU remains dedicated to processing traffic.

Use-case examples:

F5’s team has recently released several use cases based on conversations and work from the field. Let’s explore them:

- Protecting MCP servers with F5 BIG-IP Next for Kubernetes deployed on NVIDIA BlueField-3 DPUs
- LLM routing with dynamic load balancing with F5 BIG-IP Next for Kubernetes deployed on NVIDIA BlueField-3 DPUs
- F5 optimizes GPUs for distributed AI inferencing with NVIDIA Dynamo and KV cache integration.

Deployment walk-through

In our demo, we go through the configurations from BIG-IP Next for Kubernetes:

- Main BIG-IP Next for Kubernetes features
- L4 ingress flow
- HTTP/HTTPs ingress flow
- Egress flow
- BGP integration
- Logging and troubleshooting (Qkview, iHealth)

You can find a quick walk-through via BIG-IP Next for Kubernetes - walk-through

Related Content

- BIG-IP Next for Kubernetes - walk-through
- BIG-IP Next for Kubernetes
- BIG-IP Next for Kubernetes and Nvidia DPU-3 walkthrough
- BIG-IP Next for Kubernetes
- F5 BIG-IP Next for Kubernetes deployed on NVIDIA BlueField-3 DPUs

Using F5 NGINX Plus as the Ingress Controller within Nutanix Kubernetes Platform (NKP)
Managing incoming traffic is a critical component of running applications efficiently within Kubernetes clusters. As organizations continue to deploy a growing number of microservices, the need for robust, flexible, and intelligent traffic management solutions becomes more apparent. In this article, we provide an overview of how F5 NGINX Plus, when used as the ingress controller in the Nutanix Kubernetes Platform (NKP), offers a comprehensive approach to traffic optimization, application reliability, and security.

Automating F5 Application Delivery and Security Platform Deployments
The F5 ADSP Architecture Automation Project

The F5 ADSP reduces the complexity of modern applications by integrating operations, traffic management, performance optimization, and security controls into a single platform with multiple deployment options. This series outlines practical steps anyone can take to put these ideas into practice using the F5 ADSP Architectures GitHub repo. Each article highlights different deployment examples, which can be run locally or integrated into CI/CD pipelines following DevSecOps practices.

The repository is community-supported and provides reference code that can be used for demos, workshops, or as a stepping stone for your own F5 ADSP deployments. If you find any bugs or have any enhancement requests, open an issue, or better yet, contribute.

The F5 Application Delivery and Security Platform (F5 ADSP)

The F5 ADSP addresses four core areas: how you operate day to day, how you deploy at scale, how you secure against evolving threats, and how you deliver reliably across environments. Each comes with its own challenges, but together they define the foundation for keeping systems fast, stable, and safe.

Each architecture deployment example is designed to cover at least two of the four core areas: xOps, Deployment, Delivery, and Security. This ensures the examples demonstrate how multiple components of the platform work together in practice.

DevSecOps: Integrating security into the software delivery lifecycle is a necessary part of building and maintaining secure applications. This project incorporates DevSecOps practices by using supported APIs and tooling, with each use case including a GitHub repository containing IaC code, CI/CD integration examples, and telemetry options.
Resources:

- F5 Application Delivery and Security Platform GitHub Repo and Guide

ADSP Architecture Article Series:

- Automating F5 Application Delivery and Security Platform Deployments (Intro)
- F5 Hybrid Security Architectures (Part 1 - F5's Distributed Cloud WAF and BIG-IP Advanced WAF)
- F5 Hybrid Security Architectures (Part 2 - F5's Distributed Cloud WAF and NGINX App Protect WAF)
- F5 Hybrid Security Architectures (Part 3 - F5 XC API Protection and NGINX Ingress Controller)
- F5 Hybrid Security Architectures (Part 4 - F5 XC BOT and DDoS Defense and BIG-IP Advanced WAF)
- F5 Hybrid Security Architectures (Part 5 - F5 XC, BIG-IP APM, CIS, and NGINX Ingress Controller)
- Minimizing Security Complexity: Managing Distributed WAF Policies

Deploying F5 Distributed Cloud Customer Edge on AWS in a scalable way with full automation
Scaling infrastructure efficiently while maintaining operational simplicity is a critical challenge for modern enterprises. This comprehensive guide presents the foundation for a fully automated Terraform solution for deploying F5 Distributed Cloud (F5XC) Customer Edge (CE) nodes on AWS that scales seamlessly from single-node proof-of-concepts to multi-node production deployments.