[dupe] Upcoming new HTTP QUERY method (ietf.org)
158 points by tercio on Jan 31, 2022 | hide | past | favorite | 73 comments



Also this ongoing thread:

Upcoming new HTTP QUERY method - https://news.ycombinator.com/item?id=30153995 - Jan 2022 (51 comments)


Two things that come to my mind:

- why not just extend GET to make a payload not "undefined" anymore? Instead now people have to wonder whether to use GET or QUERY. The non-idempotent methods have at least a difference in semantics, while this here seems mostly another way to provide parameters for essentially the same action.

- QUERY is a somewhat bad name choice, given that URL parameters are also referred to as the query string


I think this was a situation where either choice (extend GET, create QUERY) was less than ideal, but at least if you create a new namespace with QUERY you can avoid arcane business logic around GET by various caches, load balancers, proxies, and other software that made who knows what terrible assumptions around GET payload shape.


> you can avoid arcane business logic around GET by various caches, load balancers, proxies, and other software that made who knows what terrible assumptions around GET payload shape.

I wish standards bodies weren't afraid of angering the feet-dragging vendors who haven't had to update their shitty middleboxes in 20 years despite charging their customers through the nose for them. Maybe it would be better for a healthier web if these dinosaur machines got broken once in a while.


There is enormous spend and investment that occurs under low volatility assumptions. People don't want their gear to break for features they may not want or need (see: IPv6 uptake, Python 2->3 migration timelines if at all, etc). Standards bodies exist to serve users, not the other way around. Balance backwards compatibility with forward velocity.


If standards bodies ignored compatibility, it's not the middle boxes that would become irrelevant. It'd be the standards that became irrelevant.


It's all fun and games until it turns out your home ISP is one of those dinosaur machines and now you can't log onto HN because none of the routers you can connect to understand GET with a body.


In that case you'd be using https which doesn't get proxied. But yeah, it's all somebody else's problem until it's yours...


> Maybe it would be better for a healthier web if these dinosaur machines got broken once in a while.

I would expect that the majority of all GET requests are non public-facing.

So needlessly breaking all of these machines would have little to no user benefit.


> - QUERY is a somewhat bad name choice, given that URL parameters are also referred to as the query string

Slightly confusing, I agree. But given that neither are called just "query", and HTTP methods being usually in all caps, I think it won't be as bad. "Query component" (from URIs as specced) VS "QUERY method" should be clear enough.


I wondered the same thing. Perhaps it is easier to roll out a new request method than to increase the scope of an existing request method. For example, a lot of extant middleware might silently drop a GET request's body, whereas it would just error out upon seeing "QUERY ..."


> Perhaps it is easier to roll out a new request method, rather than increase the scope of an existing request method.

It’s easier to define semantics for a new method than to expect everyone to change how an existing method is handled by all existing software.


And it is often easier to argue that software should add support for a new method than that it should change semantics that other users/customers may depend on even if those semantics aren't ideal.


> QUERY is a somewhat bad name choice, given that URL parameters are also referred to as the query string

SEARCH and REPORT, which are probably better names, were already taken by WebDAV, a pre-REST style protocol layered over HTTP that loved (loves?) registering new HTTP methods.


GET with a "Query" header plus "Vary: Query" might be a way to escape from the cache dungeon
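To make the idea concrete, here is a minimal sketch (all names hypothetical, not from any real cache implementation) of how a cache honoring "Vary: Query" would key responses on the Query header's value rather than on the URL alone:

```python
# Hypothetical sketch: a cache respecting "Vary: Query" builds its cache
# key from the URL plus the value of every request header named in Vary,
# so two GETs to the same URL with different Query headers don't collide.
def cache_key(url, request_headers, vary):
    # Combine the URL with each varied header's value into one key.
    parts = [url]
    for name in vary:
        parts.append(f"{name.lower()}={request_headers.get(name, '')}")
    return "|".join(parts)

vary = ["Query"]
k1 = cache_key("/contacts", {"Query": "name=alice"}, vary)
k2 = cache_key("/contacts", {"Query": "name=bob"}, vary)
print(k1)  # /contacts|query=name=alice
print(k1 != k2)  # different queries map to different cache entries
```

The design choice mirrors how real shared caches already treat headers listed in Vary (e.g. Accept-Encoding); the open question is whether deployed caches would actually vary on a novel header.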


> QUERY is a somewhat bad name choice, given that URL parameters are also referred to as the query string

That has always driven me crazy


I've used VIEW for this kind of thing. I don't even know what VIEW is actually for.


either extension method seems a lost cause, given that most browsers/web servers don't fully support the 7 existing http verbs, even with decades to do so.


> don't fully support the 7 existing http verbs

9 to be precise (RFC 7231 defines GET, HEAD, POST, PUT, DELETE, CONNECT, OPTIONS, TRACE, RFC 5789 defines PATCH).

Some of those are not commonly used because of various security vulnerabilities that have been discovered over the years (see https://www.kb.cert.org/vuls/id/867593, https://www.kb.cert.org/vuls/id/288308, https://www.kb.cert.org/vuls/id/150227 for some examples).

Some tooling goes as far as defining some of those methods as "forbidden", as the Fetch standard does for JavaScript.


> > the 7 existing http verbs

> 9 to be precise

You are off by 30:

https://www.iana.org/assignments/http-methods/http-methods.x...


Oh, I did not know about that! You learn something new every day at HN :) Thanks!

Granted, I guess you could argue that because you can use whatever method you want in practice, there is an unlimited set of methods available (limited only by how long a string the server you're using will accept, whether it parses it, and so on)


A lot of these seem to be specific to e.g. WebDAV though, hardly general purpose HTTP methods. God knows why they had to muddle this up like that.


yah, i should have specified 'common' or 'web', or something like that. i had a feeling someone would raise a more technically precise but also more esoteric rebuttal.


Seems nice to me.


The spec doesn't say how browsers should represent such a request in the browser address bar. The nice thing with GET is that you can copy a search URL, bookmark it or share it with someone, and they get back the exact same results. How would that work with a QUERY method?


The browser address bar performs GET requests, so I think the answer is "you can't bookmark or share QUERY requests," same as with POST requests.

However, just as with POST requests, it might make sense for the server to respond with a redirect to a URL that could be shared.
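That redirect pattern can be sketched in a few lines. This is a hypothetical helper (the function name and the `/search-results` path are made up for illustration): the server answers a QUERY with 303 See Other pointing at a GET URL that encodes the query, so the results become bookmarkable:

```python
# Hedged sketch of the POST/QUERY-then-redirect pattern: turn the query
# carried in a QUERY body into a shareable GET URL and answer 303,
# so the client lands on a URL it can bookmark or share.
from urllib.parse import urlencode

def respond_to_query(query_params):
    # Hypothetical helper: build a redirect target from the query fields.
    shareable = "/search-results?" + urlencode(sorted(query_params.items()))
    return 303, {"Location": shareable}

status, headers = respond_to_query({"q": "http query method", "page": "2"})
print(status, headers["Location"])
# 303 /search-results?page=2&q=http+query+method
```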


In HTTP, clients are user agents and a web browser is only one kind of user agent that is specialized in fetching and displaying web pages. Like PUT and DELETE, this is clearly meant for APIs, not web pages. (Although sufficiently "advanced" web pages may make API calls.)


It's more for the SPA age of web apps, where we want to GET a resource from an API, but have a body in the request. For example fetching the user who is currently logged in and getting a JSON response back; it's not something you would do in the address bar.
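A rough, self-contained sketch of that API-call use case (the `/me` path, the JSON shape, and the handler are all invented for illustration): Python's stdlib `http.client` will send any method name you give it, and `http.server` dispatches to a `do_<METHOD>` handler if one is defined, so a QUERY round trip can be demonstrated locally:

```python
# Hypothetical sketch: a QUERY request carrying a JSON body, using only
# the Python standard library. http.client accepts arbitrary methods,
# and BaseHTTPRequestHandler dispatches QUERY to do_QUERY if defined.
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_QUERY(self):  # called for "QUERY /... HTTP/1.1"
        length = int(self.headers.get("Content-Length", 0))
        query = json.loads(self.rfile.read(length))
        # Pretend to project the current user onto the requested fields.
        result = {"user": "alice", "fields": query.get("fields", [])}
        body = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("QUERY", "/me",
             body=json.dumps({"fields": ["name", "email"]}),
             headers={"Content-Type": "application/json"})
resp = conn.getresponse()
data = json.loads(resp.read())
print(resp.status, data)
server.shutdown()
```

This only shows that the wire format poses no obstacle; whether intermediaries between a real client and server pass QUERY through unmolested is a separate question.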


I'm not sure it makes sense for an HTTP protocol method standard to describe how browsers should accept its input in the URL bar - two very different standards realms.

That said, you should be able to trigger any custom HTTP request your browser supports via pasting/bookmarking a `data:` or `javascript:` construction if you really want. As of now the URL bar is GET-only when you enter a URL.


> I'm not sure it makes sense for an HTTP protocol standard to describe how browsers should accept its input in the URL bar - two very different standards realms.

Just as a small remark, HTTP protocol standards do consider - or at least have considered - URL input in a wider, client-side meaning. Some details are left to the clients, e.g. it differs across browsers how certain URL characters are interpreted when entered into the URL bar (or on the command line, e.g. with curl(1)).

Apart from that, HTTP URLs are part of the same HTTP standard, aren't they?


HTTP URLs and HTTP methods are two very different things, somewhat conflated in that the browser treats a URL in the URL bar as shorthand for generating a GET. While URL encodings are well standardized for general use in RFCs, the parsing of a URL-bar entry into an HTTP GET is not; it's just good UI. The same could be said about the QUERY method standard: it's not about describing how the browser UI should allow making such a request, just what a valid one is.


same as with POST: not really intended for that case.


Similarly to as if you were using POST requests I'd say. So if you want the users to be able to navigate to a URL and get the same results for a POST request, you'd make the client-side read URL parameters and sculpt the POST request and send it.


probably same as with POST, that is not at all


This is great. Note it is co-authored by James Snell who I know from Node.js core. He has been a very prolific member working on things like QUIC in Node and web APIs.

I can definitely see why this could benefit his work (at Cloudflare) a lot since it would enable caching of results currently done through POST queries.


Is GET with a request body impractical at this point? It has historically been frowned upon [1] but is it really too late to change that convention?

[1]: https://stackoverflow.com/questions/978061/http-get-with-req...
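For what it's worth, nothing in the wire format prevents a GET from carrying a body, and stdlib tooling will happily send one; the sketch below (handler and echo behavior invented for illustration) demonstrates this locally. The frowning is about intermediaries and semantics, not syntax:

```python
# Hedged sketch: Python's stdlib will send a GET with a body, and a
# server can read it. The server here just echoes back how many body
# bytes it received, to prove the payload survived the trip.
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class EchoLen(BaseHTTPRequestHandler):
    def do_GET(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = self.rfile.read(length)
        body = str(len(payload)).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), EchoLen)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/search", body=b'{"q": "hello"}')
echoed = conn.getresponse().read()
print(echoed)  # b'14' -- the server saw all 14 body bytes
server.shutdown()
```

Of course, this proves nothing about the caches and proxies in between on the real web, which is exactly the objection raised above.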


I've seen this done a few times and in most of those cases I agreed it made sense. I've thought about this myself a few times too.

But it does lead to unexpected behavior. Users are trained to copy/paste links and send them. If the body only included stuff like auth tokens then it would be ok, but if relevant query stuff was in there (like page size, for example) that would lead to different results that would ultimately be deleterious IMHO.


The copy/paste URL use case is a fair consideration. The places where I’ve seen request bodies on GET requests are APIs [2] that are not seen by the end user.

[2]: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/...



Interesting that they use an SQL-like query in the examples.

I hope this does not mean that this will be taken as the default method of querying. A JSON query object could've been nice as an example.

How should you handle conflicting parameters (form-encoded + body)?

Also, on the HTML side.. If it's extended to <form method=QUERY>.. how will you be able to distinguish between the query part and the uri part?

How can/should you copy/share links? Will browsers simply base64 encode this somewhere?


From the paragraph above:

> The non-normative examples in this section make use of a simple, hypothetical plain-text based query syntax based on SQL with results returned as comma-separated values. This is done for illustration purposes only. Implementations are free to use any format they wish


I guess that they purposefully did not use JSON or any other widely used format to avoid giving the appearance that that format is required or encouraged.

As for copying/sharing links, I think that was covered in another discussion above: Just like with POST, it's likely just not intended to do QUERY requests via links or through the URL bar.


> This specification defines the HTTP QUERY request method as a means of making a safe, idempotent request that contains content.

I’m already seeing implementors failing to follow the spec here – they will equate the QUERY method with querying a mutable database, which won’t give reproducible results; to give reproducible results, the server would need to save the underlying data. Making it idempotent will then be an ad-hoc decision of each server. It seems contradictory. At this point it’s indistinguishable from a PUT on a resource that represents the query itself.

I’m not seeing the point of the new method.


"Idempotent" does not mean successive identical requests must return identical content. For example, two successive GET requests to the same URL will not return identical content if in between them the owner of the website changes the page at that URL. Similarly, if the source of the request content is a mutable database, two successive QUERY requests with the same request content will not return identical content if the database is mutated in between by someone else.

All "idempotent" actually means is that successive identical requests must induce the same change in the server's state due to the request itself. In the case of verbs like GET and QUERY, that is trivially true since they induce no change in the server's state at all. But the content of the response can of course change due to other events happening on the server. If "idempotent" required that not to happen, no request could ever be guaranteed to be idempotent.


I mean: I don't see the point to have a new method that still gives the same (weak and misnamed) idempotency guarantees of GET. We already have GET, and a new method isn't necessary just to pass request data.

I would rather have a method that _may_ have side-effect on the server-side, but it's actually idempotent from the client-side (as in, truly reproducible results). That already exists in the form of a PUT + some content addressing scheme, for example, but it's open to each implementation.


> that is trivially true since they induce no change in the server's state at all.

Have you heard of visit counters? ;)


If visit counters are considered to be changes in the server state, then servers that implement them are violating the requirement that GET requests must be idempotent.


> I’m already seeing implementors failing at following the spec here – they will equate the QUERY method to querying a mutable database, which won’t give reproducible results,

Safe requests (all of which are necessarily also idempotent) are not reliably reproducible; GET of the same resource changes if there are server-state changes induced by other requests between GETs.

So, what you suggest is not failing to follow the spec.


Yeah it's already created confusion for me. I read it as essentially a side effect free request with a request body. Not that the data behind it was necessarily immutable but that the query didn't do any mutations.


Idempotent does not mean "side effect free", but rather that any side effects are safe to duplicate or deduplicate.

As in: "increase counter by 1" is not idempotent, while "set counter to 2" and "set (monotonic) counter to 2 if its current value is 0" are idempotent.

It is quite a weak requirement; for example, an API whose responses functionally must not be cached can still be idempotent.
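The counter distinction above can be checked in a few lines (a toy model, not any particular server's state handling): replaying an idempotent operation leaves the state the same as applying it once, while replaying a non-idempotent one keeps changing it:

```python
# Toy illustration of idempotency: "set to 2" can be replayed harmlessly,
# "increase by 1" cannot.
counter = {"value": 0}

def increment(state):    # NOT idempotent: every replay changes the state
    state["value"] += 1

def set_to_two(state):   # idempotent: replays leave the state unchanged
    state["value"] = 2

for _ in range(3):
    set_to_two(counter)
print(counter["value"])  # 2 -- same as applying it once

for _ in range(3):
    increment(counter)
print(counter["value"])  # 5 -- every replay mattered
```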


Edit: never mind, my response was redundant by the time I finished it.


We should just create a revision for RFC 2616 to extend the GET method to allow request bodies. Most HTTP frameworks already support this behaviour anyway.


> We should just create a revision for RFC 2616

RFC 2616 has already been obsoleted, and there is a draft (I think on its 19th draft or more) RFC obsoleting the set that obsoleted RFC 2616.

> to extend the GET method to allow request bodies.

Adding a new method is safer than changing the semantics of an existing one, “has a body” is a pretty major distinction for an HTTP message, whether a request method or response status.


Many proxies and load balancers do not.


I see two problems: redundancy and consistency.

After spending pages telling the reader that the query should not affect the server's state, the only example given appears to do exactly that: initiate a query and return a GET URL to retrieve the results.

Either the GET request actually triggers the database request, which makes the QUERY request totally redundant, or the QUERY request relays a call to a database interpreter and the server responds with a location to retrieve the results.

Second, how would the HTTP server know that the query is both idempotent and safe? Providing an example with SQL-like code makes it look even more like a bad joke...

I see this creates (at least) two problems and solves none. Do the authors know which problem they hope to solve?

If I had more time I'd look into their bios. Something tells me this is a typical case of "I put my name in an RFC", unless they work under the umbrella of some GAFAM who needs a new HTTP verb without disclosing why.


If I get it right this would serve the same purpose as GET requests with data in the HTTP request body. [1]

1: https://stackoverflow.com/questions/978061/http-get-with-req...


Is it really OK to add a new method to HTTP 1.1? I think would be quite an unfortunate decision, because it would make the version numbers absolutely meaningless if all of a sudden new methods can pop up without them changing.


HTTP 1.1 makes it clear that the method mechanism is extensible [1].

    Method           = "OPTIONS"
                     | "GET"
                     | "HEAD"
                     | "POST"
                     | "PUT"
                     | "DELETE"
                     | "TRACE"
                     | "CONNECT"
                     | extension-method
    extension-method = token
"The list of methods allowed by a resource can be specified in an Allow header field (section 14.7). The return code of the response always notifies the client whether a method is currently allowed on a resource, since the set of allowed methods can change dynamically. An origin server SHOULD return the status code 405 (Method Not Allowed) if the method is known by the origin server but not allowed for the requested resource, and 501 (Not Implemented) if the method is unrecognized or not implemented by the origin server. The methods GET and HEAD MUST be supported by all general-purpose servers. All other methods are OPTIONAL."

[1] https://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html#sec5....
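The 501 behavior quoted above can be observed directly with the Python stdlib: `BaseHTTPRequestHandler` answers 501 (Not Implemented) for any method it has no `do_<METHOD>` handler for. A minimal sketch (the handler and paths are invented for the demo):

```python
# Quick check of the quoted spec text: a server that only implements GET
# answers 501 Not Implemented to an unrecognized method such as QUERY.
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class GetOnly(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", "2")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):  # silence request/error logging
        pass

server = HTTPServer(("127.0.0.1", 0), GetOnly)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("QUERY", "/")  # a method this server does not implement
status = conn.getresponse().status
print(status)  # 501, matching the quoted RFC 2616 language
server.shutdown()
```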


And, indeed, this has been done many times, by e.g. WebDAV (methods COPY, LOCK, MKCOL, MOVE, PROPFIND, PROPPATCH, UNLOCK).


Thanks, it makes sense now. I had the feeling I was missing something, I haven't really ever read the HTTP spec TBH so I was not aware of this.


The short version is that it's perfectly fine. This doesn't modify the HTTP spec, and it uses established extension mechanisms.

The long version:

RFC 7231 defines the standard HTTP methods in section 4.1. [1] Looking at it, the only methods that are required are GET and HEAD. All others are optional, and the spec explicitly calls out that additional methods may be created and registered with the IANA.

Looking closely at this RFC reveals two things:

1. It doesn't modify any of the HTTP RFCs. (There's no "Obsoletes" or "Updates" header.)

2. In section 6 of the QUERY RFC, it requests that the IANA add the new method to its registry. That follows the guidelines in RFC 7231 section 4.1.

[1] https://datatracker.ietf.org/doc/html/rfc7231#section-4.1


Sometimes reality beats fiction: http://rupy.se/doc/

I chose Query.java as the class name for my combined GET/POST container back in 2008:

https://github.com/tinspin/rupy/blob/master/src/se/rupy/http...

Back then it was hosted on google code.


btw, why is it called "rupy" ?


Finding a domain name is hard.

I like short names and I make computer games; Zelda (or maybe just me) misspelled "rupee" as "rupy".

http://rupy.se

I did not know about Ruby Python back then... now I feel it's too late... eventually I'll try and change the name to binarytask since I bought binarytask.com...


Any reasonable system requires an offset. Also, it would be nice to have a search id that allows the client to fetch further records in a linear way without risking data shifts when new records get added. The last "must have" would be some form of ordering.

    QUERY /contacts HTTP/1.1
    Host: example.org
    Content-Type: example/query
    Accept: text/csv

    select surname, givenname, email limit 10 offset 40 key 2eQ8m5


Am I right in thinking that this is basically an alternative to URI query parameters? Presumably to make parsing the query and caching the results easier.


The introduction paints it exactly as such by example, along with avoiding some other limitations like parameter size and encoding overhead.


I think the biggest motivation I've seen every time GET bodies are brought up is that the size limit on request bodies is normally much larger than the limit on URIs. Query-builder type interfaces especially run up against this, which is why e.g. Elasticsearch does this.


I’ve always used POST to do exactly what this specifies. Good show.

It may take a while before it is supported widely enough in many APIs.


Sounds like a good fit for helping CDNs and caches deal with GraphQL.


It is really surprising that we've come this far without it.


How would this interact with HTTP/3, which uses a binary header?



