[PATCH] Core: support for reading PROXY protocol v2 TLVs
Maxim Dounin
mdounin at mdounin.ru
Mon Sep 5 15:58:38 UTC 2022
Hello!
On Mon, Sep 05, 2022 at 05:23:18PM +0400, Roman Arutyunyan wrote:
> Hi,
>
> On Mon, Sep 05, 2022 at 03:52:49AM +0300, Maxim Dounin wrote:
> > Hello!
> >
> > On Wed, Aug 31, 2022 at 07:52:15PM +0400, Roman Arutyunyan wrote:
> >
> > > # HG changeset patch
> > > # User Roman Arutyunyan <arut at nginx.com>
> > > # Date 1661436099 -14400
> > > # Thu Aug 25 18:01:39 2022 +0400
> > > # Node ID 4b856f1dff939e4eb9c131e17af061cf2c38cfac
> > > # Parent 069a4813e8d6d7ec662d282a10f5f7062ebd817f
> > > Core: support for reading PROXY protocol v2 TLVs.
> >
> > First of all, could you please provide details on the use case?
> > I've seen requests for writing proxy protocol TLVs to upstream
> > servers (see ticket #1639), but not yet seen any meaningful
> > reading requests.
>
> The known cases are these:
>
> - https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-target-groups.html#proxy-protocol
> - https://docs.microsoft.com/en-us/azure/private-link/private-link-service-overview#getting-connection-information-using-tcp-proxy-v2
> - https://cloud.google.com/vpc/docs/configure-private-service-connect-producer#proxy-protocol
>
> The data may need further parsing, but it can be done in njs or perl.
Thanks for the details. So, basically, it's about vendor-specific
endpoint IDs.
> > > The TLV values are available in HTTP and Stream variables
> > > $proxy_protocol_tlv_0xN, where N is a hexadecimal TLV type number with no
> > > leading zeroes.
> >
> > I can't say I like the "hexadecimal TLV type number with no
> > leading zeroes" approach, especially given that the specification
> > uses leading zeroes in TLV types. With leading zeros might be
> > better, to match specification.
> >
> > Also, it might worth the effort to actually add names for known
> > types instead or in addition to numbers.
>
> This is indeed a good idea and we have such plans as a further extenion of this
> work. One of the problems is however that the abovementioned TLV variables
> are specified in internal documents of AWS/Azure/GCP which are not standards.
> They can be changed anytime, while we have to maintain those variables in
> nginx. Also, raw variables give more flexibility in supporting less known TLVs.
Of course I'm not suggesting to ditch raw variables, at least not
for unknown/non-standard values. But for known/standard values it
should be easy enough to provide alternative names for easier use,
probably with type-specific parsing.
With on-demand parsing it would be trivial to support both
$proxy_protocol_tlv_alpn and $proxy_protocol_tlv_0x01. Further,
it will be trivial to support $proxy_protocol_tlv_aws_vpc_id while
still providing $proxy_protocol_tlv_0xea for raw data.
> > Another question is PP2_TYPE_SSL, which is itself a complex
> > structure and a list of multiple subtypes.
>
> This is an obvious one. However we had exactly zero requests for this.
See the ticket mentioned above, it seems to be the main reason why
people want to see proxy protocol v2 to backends.
> > Provided
> > Given the above, not sure if the approach with early parsing and
> > header-like list as in the patch is the good idea. Just
> > preserving TLVs as is and parsing them all during variable
> > evaluation might be easier and more efficient.
>
> In this case, if we have two variables, say $proxy_protocol_tlv_ssl_{sni, alpn},
> we'll parse the entire TLV block twice - once per variable evaluation.
Assuming you mean $proxy_protocol_{authority, alpn} (as these
aren't SSL subtypes), I actually see no difference in on-demand
parsing of the TLV block and looking for a header in the
pre-created list of headers. Further, parsing the block for each
variable evaluation might be actually faster due to better
locality, and should simplify adding alternative names.
And a single TLV block certainly will be more optimal in terms of
memory usage due to no additional allocations. Not to mention the
typical case when TLV variables aren't used at all.
> > Also, the idea of merging TLV values with identical types looks
> > wrong to me, especially given that many TLSs are binary.
> > Specification does not seem to define the behaviour here,
> > unfortunately. As far as I understand, HAProxy itself still
> > doesn't implement PPv2 parsing, so there is not reference
> > implementation either. On the other hand, it should be easy
> > enough to check all TLVs for duplicate by using a 256-bit bitmask
> > and reject connections if there are any duplicates.
>
> This can be added, thanks.
Not sure it's actually needed though, especially given that proxy
protocol is only expected to be accepted from trusted sources
anyway. It might be good enough to just assume there is only one
value with a given type.
--
Maxim Dounin
http://mdounin.ru/
More information about the nginx-devel
mailing list