"AI Gateway" Working Group Proposal


Shane Utt

Jun 12, 2025, 8:34:45 PM
to kubernetes-sig-network
Hello SIG Network,

Hope everyone is having a good summer!

With the explosive growth of AI/ML over the last few years, and the subsequent success of the Gateway API Inference Extension (GIE), the intersection of networking and AI/ML continues to be a frontier for us in SIG Network. We all want to make sure Kubernetes becomes and remains the platform for running your AI workloads, and there's tons of work all over the SIGs pushing for that. To that end, it's always good to keep stepping back and taking a look at what's next there, and how we continue to achieve that as a SIG.

There is a new proposal that aims to harness this growth and look into an area that might be the next good focus point, which (for better or worse) we've named "AI Gateway". If you're interested in the future standards of AI/ML networking on Kubernetes, inference, and "AI Gateways", check out the proposal here, and please provide your thoughts and feedback.

Cheers,

Shane

Rob Scott

Jun 12, 2025, 9:13:16 PM
to Shane Utt, kubernetes-sig-network
Hey Shane,

Thanks for sharing this proposal! The "AI Gateway" space is really exciting and I've definitely seen some interest in standardization here. My question would be if we need a new working group for this purpose. As your proposal mentions, we already have a group working on extending Gateway API for Inference. I'd be worried that creating another very similar group could spread us too thin as a SIG.

Thanks,

Rob


--
You received this message because you are subscribed to the Google Groups "kubernetes-sig-network" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-sig-ne...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/kubernetes-sig-network/6737afa8-e84e-4180-bf19-246796297111n%40googlegroups.com.

Flynn -

Jun 13, 2025, 2:36:06 PM
to Rob Scott, Shane Utt, 'Rob Scott' via kubernetes-sig-network
I would support creating this group. To Rob's point, GIE is very much focused on only inference, rather than the broader AI world, and I'd like to see some coordinated work looking at the whole space.
-- Flynn

Dan Sun

Jun 14, 2025, 1:36:36 AM
to kubernetes-sig-network
+1 on creating this group. AI gateway has a much larger scope, as documented in the proposal. We need an API that's flexible enough to support the variety of options between self-hosted model routing and cloud model providers. LLM traffic has unique characteristics (e.g. token-based billing, diverse provider APIs, intelligent routing and fallback) that go beyond standard HTTP routing, and we need centralized management of the fleet of LLM endpoints to monitor, secure, and control the cost of LLM traffic.

Antonio Ojea

Jun 14, 2025, 3:24:20 PM
to Dan Sun, steering, kubernetes-sig-network

Hi all,

Thanks Shane for putting this proposal together and to everyone for the enthusiasm. It's clear that the intersection of AI and networking on Kubernetes is an important space that is rapidly evolving, and it's great to see this Gateway API interest. 

I want to share some thoughts on this proposal, since I think there is a mismatch with the formal definition and purpose of a Kubernetes Working Group (WG). According to the official WG governance documentation, a Working Group is intended to facilitate communication and coordination across multiple SIGs to address a specific, time-limited problem that spans those SIGs.

The mission outlined in the proposal, describing the group as "effectively a 'branch' of the Gateway API and Gateway API Inference Extension (GIE) projects," suggests the work is tied to the existing Gateway API subproject within SIG Network. It sounds much more like a focused effort or a new workstream within the Gateway API community rather than a cross-SIG initiative.

The kind of collaboration being described here is something that the current community structure fully supports and encourages. You can absolutely form a focused group to drive the "AI Gateway" effort within the Gateway API subproject today.

In addition, for cross-SIG work related to serving AI/ML workloads, we already have wg-serving. If the scope of this AI Gateway workstream requires formal collaboration with other SIGs (like SIG Apps or SIG Node), wg-serving seems the best place to coordinate.

To be clear, I think this is a good idea, but I want to ensure we use the right community structures to keep efforts organized and avoid overlap or redundancy with a formal Working Group.

Perhaps the best path forward would be to establish this as a dedicated focus area within the Gateway API project?

Thanks

(cc @steering because WGs need steering approval)

Xunzhuo

Jun 16, 2025, 4:13:46 AM
to kubernetes-sig-network
+1 on creating this WG, but as Antonio mentioned, the scope and goal of WG AI GW should be clear, since that is what will drive the WG forward.

From my perspective, this should be a separate WG sitting between GWAPI and WG-Serving. It fills the gap between them and brings LLM researchers and networking experts together to focus on AI/ML workloads and Kubernetes networking for LLM routing.

This holds whether the solution is the GIE or one of the many other LLM routing solutions, such as Envoy AI Gateway, the inference gateway introduced in AIBrix, or the semantic router introduced by Red Hat.

This WG would bring standards and a place to gather the expertise needed to put AI gateways into real-world production and solve the challenges around LLMs.

Thanks, Xunzhuo

Sandor Szuecs

Jun 16, 2025, 7:33:09 AM
to Xunzhuo, kubernetes-sig-network
Hi!

I have only one wish:
Try to get rid of HTTP body-based routing.
For example, add a header that is optional for now, and publish it so that vendors can opt in to sending the HTTP header from their clients (prompts, agents, whatever comes next) for routing decisions on proxy servers.
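
A rough sketch of the header-first idea, not a proposed standard: the header name `x-example-model`, the backend pool names, and the JSON `model` field used for the fallback are all assumptions made up for illustration. The proxy reads the header when a client sends it and only parses the body when the header is absent.

```python
import json
from typing import Mapping, Tuple

# Hypothetical header name, for illustration only (not an existing standard).
MODEL_HEADER = "x-example-model"

# Hypothetical routing table: model name -> backend pool.
BACKENDS = {"small-model": "pool-a", "large-model": "pool-b"}
DEFAULT_BACKEND = "pool-default"


def pick_backend(headers: Mapping[str, str], body: bytes) -> Tuple[str, bool]:
    """Return (backend, body_was_parsed) for one request."""
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    model = lowered.get(MODEL_HEADER)
    body_was_parsed = False

    if model is None:
        # Fallback: the client did not opt in, so the body has to be parsed.
        body_was_parsed = True
        try:
            payload = json.loads(body)
        except ValueError:
            payload = None
        model = payload.get("model") if isinstance(payload, dict) else None

    if not isinstance(model, str):
        model = None
    return BACKENDS.get(model, DEFAULT_BACKEND), body_was_parsed
```

With the header present, `pick_backend({"X-Example-Model": "large-model"}, b"")` never touches the body; without it, the proxy pays for a full JSON parse before it can route.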

Thanks for helping http proxies to run efficiently.
Sandor Szücs | 418 I'm a teapot



Shane Utt

Jun 16, 2025, 12:47:43 PM
to kubernetes-sig-network
Antonio: It's noteworthy that you raised this point, as the proposal was intentionally designed to suggest a traditional working group. The document consistently uses the term "proposal" to emphasize that the group's main objective is to generate proposals. To clarify, this working group is not a sub-project, but it may suggest sub-projects after conducting time-limited research in collaboration with industry experts. The feedback is appreciated, because it means more clarity is needed about this in the doc. I've made significant updates to the language in the doc, lmkwyt

Regarding your earlier point, Rob: I want to ensure that my colleagues involved in the GIE understand that this does not preclude the GIE. Adapting the GIE to support "AI Gateway" features is definitely an option, but we should take a moment to reflect on it. In brief, my goal for this proposal within our SIG is to extend an arm to more industry specialists to collaborate with us, helping us grow and identify common themes in these features. I want us to strengthen the narrative of Kubernetes as the premier platform for AI/ML workloads. "Industry Specialists" in this case certainly means everyone currently working on the GIE, but creating a WG presents an opportunity for even more growth. Ultimately, the WG can determine whether these efforts should be integrated into the GIE or pursued separately based on this collaboration. Many doors open.

Sandor: Yes, this has been a source of frustration for many and I agree we should try to influence it in a better direction. Kubernetes having a stronger overall narrative on AI inference could give us a stronger voice in the wider community to influence change here. I'm with you on this.

David Martin

Jun 16, 2025, 3:07:06 PM
to kubernetes-sig-network
Thanks for getting this proposal together Shane.
It resonates with me as an area I could definitely get involved with, as someone working on gateway level features, aka policies.
While wg-serving and GIE seem to be focused more on running of inference workloads on kubernetes, I find myself being a passive participant there at most. There's awesome work being done there, but it's not quite the right fit for where I think I can add value. (I also regularly find myself out of my depth on AI terminology and features in those meetings).

The AI Gateway WG proposal feels more approachable for someone who works at the HTTP and application layer, but who would also like to explore and standardise around inference features at the gateway, like prompt guarding, token metrics and rate limiting, and AI-specific protocols like A2A.

Evan Jones

Jun 16, 2025, 5:14:31 PM
to kubernetes-sig-network
All,

Shane -- I appreciate the effort pulling this together.

I want to second David's comment. He couldn't have said it better. I fell off the wg-serving meetings because, although I was personally interested, there wasn't enough alignment with my organization's needs as self-hosted models constitute a small fraction of our workloads. I couldn't justify the time commitment.

We have clearer needs for standardizing at the network and application layer, for reasons of security, monitoring, reliability, protocols, and so on.

Best

Evan Jones


Craig Brookes

Jun 17, 2025, 1:30:43 PM
to kubernetes-sig-network
I also want to echo the comments of others here. I see this as a great and valuable way to approach the needs and use cases in this area that go beyond routing and scheduling.
Thanks for putting it together, Shane!

Craig



--
Craig Brookes
Kuadrant 
@maleck13 Github

Nir Rozenbaum

Jun 17, 2025, 4:59:37 PM
to kubernetes-sig-network

+1 from me as well. As others have said in previous comments, the GIE work is great, but the AI gateway has a larger scope and this WG can open the door to more collaboration.
On Tuesday, June 17, 2025 at 16:30:43 UTC+3, Craig Brookes wrote:

Daniel Grimm

Jun 17, 2025, 6:49:42 PM
to kubernetes-sig-network
+1 from my side as well. If meetings happen in an EU-friendly time slot, I'll definitely join. If not, I'll follow along asynchronously.

Best,
Daniel

Hamzah Qudsi

Jun 17, 2025, 7:11:28 PM
to kubernetes-sig-network
+1 from me as well. This actually comes at an opportune time, since at my organization we have been exploring use cases similar to those proposed here. So I definitely want to contribute to this WG.
