[PATCH] Core: support for reading PROXY protocol v2 TLVs

Maxim Dounin mdounin at mdounin.ru
Mon Sep 5 15:58:38 UTC 2022


On Mon, Sep 05, 2022 at 05:23:18PM +0400, Roman Arutyunyan wrote:

> Hi,
> On Mon, Sep 05, 2022 at 03:52:49AM +0300, Maxim Dounin wrote:
> > Hello!
> > 
> > On Wed, Aug 31, 2022 at 07:52:15PM +0400, Roman Arutyunyan wrote:
> > 
> > > # HG changeset patch
> > > # User Roman Arutyunyan <arut at nginx.com>
> > > # Date 1661436099 -14400
> > > #      Thu Aug 25 18:01:39 2022 +0400
> > > # Node ID 4b856f1dff939e4eb9c131e17af061cf2c38cfac
> > > # Parent  069a4813e8d6d7ec662d282a10f5f7062ebd817f
> > > Core: support for reading PROXY protocol v2 TLVs.
> > 
> > First of all, could you please provide details on the use case?  
> > I've seen requests for writing proxy protocol TLVs to upstream 
> > servers (see ticket #1639), but not yet seen any meaningful 
> > reading requests.
> The known cases are these:
> - https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-target-groups.html#proxy-protocol
> - https://docs.microsoft.com/en-us/azure/private-link/private-link-service-overview#getting-connection-information-using-tcp-proxy-v2
> - https://cloud.google.com/vpc/docs/configure-private-service-connect-producer#proxy-protocol
> The data may need further parsing, but it can be done in njs or perl.

Thanks for the details.  So, basically, it's about vendor-specific 
endpoint IDs.

> > > The TLV values are available in HTTP and Stream variables
> > > $proxy_protocol_tlv_0xN, where N is a hexadecimal TLV type number with no
> > > leading zeroes.
> > 
> > I can't say I like the "hexadecimal TLV type number with no 
> > leading zeroes" approach, especially given that the specification 
> > uses leading zeroes in TLV types.  Using leading zeroes might be 
> > better, to match the specification.
> > 
> > Also, it might be worth the effort to actually add names for known 
> > types instead of or in addition to numbers.
> This is indeed a good idea and we have such plans as a further extension of this
> work.  One of the problems, however, is that the above-mentioned TLV variables
> are specified in internal documents of AWS/Azure/GCP which are not standards.
> They can be changed anytime, while we have to maintain those variables in
> nginx.  Also, raw variables give more flexibility in supporting less known TLVs.

Of course I'm not suggesting to ditch raw variables, at least not 
for unknown/non-standard values.  But for known/standard values it 
should be easy enough to provide alternative names for easier use, 
probably with type-specific parsing.

With on-demand parsing it would be trivial to support both 
$proxy_protocol_tlv_alpn and $proxy_protocol_tlv_0x01.  Further, 
it will be trivial to support $proxy_protocol_tlv_aws_vpc_id while 
still providing $proxy_protocol_tlv_0xea for raw data.
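
To illustrate how both naming schemes could coexist, here is a minimal 
sketch that maps a variable name suffix to a TLV type.  The table, the 
function name pp_tlv_type_from_suffix() and the accepted suffix forms 
are hypothetical and not taken from the patch.  Only the type numbers 
0x01 (ALPN) and 0x02 (AUTHORITY) come from the PROXY protocol 
specification.  Vendor-specific names such as aws_vpc_id would 
additionally need subtype parsing and are left out here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alias table: friendly names for well-known TLV types. */
typedef struct {
    const char     *name;
    unsigned char   type;
} pp_tlv_alias_t;

static const pp_tlv_alias_t  pp_tlv_aliases[] = {
    { "alpn",      0x01 },    /* PP2_TYPE_ALPN */
    { "authority", 0x02 }     /* PP2_TYPE_AUTHORITY */
};

/*
 * Returns the TLV type (0..255) for a suffix such as "alpn", "0x01"
 * or "0xea", or -1 if the suffix is not recognized.  Only lowercase
 * hex digits are accepted (an assumption of this sketch).
 */
int
pp_tlv_type_from_suffix(const char *suffix)
{
    int     value, digit;
    char    c;
    size_t  i, len;

    if (suffix[0] == '0' && suffix[1] == 'x') {
        len = strlen(suffix + 2);

        if (len < 1 || len > 2) {
            return -1;
        }

        value = 0;

        for (i = 2; suffix[i] != '\0'; i++) {
            c = suffix[i];

            if (c >= '0' && c <= '9') {
                digit = c - '0';

            } else if (c >= 'a' && c <= 'f') {
                digit = c - 'a' + 10;

            } else {
                return -1;
            }

            value = value * 16 + digit;
        }

        return value;
    }

    for (i = 0; i < sizeof(pp_tlv_aliases) / sizeof(pp_tlv_aliases[0]); i++) {
        if (strcmp(suffix, pp_tlv_aliases[i].name) == 0) {
            return pp_tlv_aliases[i].type;
        }
    }

    return -1;
}
```

With this, "alpn" and "0x01" resolve to the same type, so both variable 
names can share a single lookup path.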

> > Another question is PP2_TYPE_SSL, which is itself a complex 
> > structure and a list of multiple subtypes.
> This is an obvious one.  However we had exactly zero requests for this.

See the ticket mentioned above, it seems to be the main reason why 
people want proxy protocol v2 sent to backends.

> > Given the above, I'm not sure the approach with early parsing and a 
> > header-like list, as in the patch, is a good idea.  Just 
> > preserving TLVs as is and parsing them all during variable 
> > evaluation might be easier and more efficient.
> In this case, if we have two variables, say $proxy_protocol_tlv_ssl_{sni, alpn},
> we'll parse the entire TLV block twice - once per variable evaluation.

Assuming you mean $proxy_protocol_{authority, alpn} (as these 
aren't SSL subtypes), I actually see no difference in on-demand 
parsing of the TLV block and looking for a header in the 
pre-created list of headers.  Further, parsing the block for each 
variable evaluation might actually be faster due to better 
locality, and should simplify adding alternative names.

And a single TLV block certainly will be more optimal in terms of 
memory usage due to no additional allocations.  Not to mention the 
typical case when TLV variables aren't used at all.
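
A minimal sketch of such on-demand parsing, assuming the raw TLV block 
is preserved as a contiguous buffer.  The function name pp_tlv_find() 
and its signature are hypothetical.  The layout (one byte of type, two 
bytes of big-endian length, then the value) follows the PROXY protocol 
v2 specification.

```c
#include <stddef.h>

/*
 * Scans a raw TLV block for the first TLV of the given type.
 * Returns a pointer to the value inside the block and stores its
 * length in *len, or returns NULL if the type is not present or
 * the block is truncated.  No memory is allocated.
 */
const unsigned char *
pp_tlv_find(const unsigned char *buf, size_t size, unsigned char type,
    size_t *len)
{
    size_t         pos, n;
    unsigned char  t;

    pos = 0;

    while (size - pos >= 3) {
        t = buf[pos];
        n = ((size_t) buf[pos + 1] << 8) | buf[pos + 2];
        pos += 3;

        if (n > size - pos) {
            /* the value runs past the end of the block */
            return NULL;
        }

        if (t == type) {
            *len = n;
            return buf + pos;
        }

        pos += n;
    }

    return NULL;
}
```

Each variable evaluation walks the block once and returns a pointer 
into it, so nothing is parsed or allocated when no TLV variable is used.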

> > Also, the idea of merging TLV values with identical types looks 
> > wrong to me, especially given that many TLVs are binary.  
> > The specification does not seem to define the behaviour here, 
> > unfortunately.  As far as I understand, HAProxy itself still 
> > doesn't implement PPv2 parsing, so there is no reference 
> > implementation either.  On the other hand, it should be easy 
> > enough to check all TLVs for duplicates by using a 256-bit bitmask 
> > and reject connections if there are any duplicates.
> This can be added, thanks.

Not sure it's actually needed though, especially given that proxy 
protocol is only expected to be accepted from trusted sources 
anyway.  It might be good enough to just assume there is only one 
value with a given type.
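
For reference, the 256-bit bitmask check mentioned in the quoted text 
could look roughly like the sketch below, should it turn out to be 
wanted.  The function name pp_tlv_has_duplicates() and the return 
convention are hypothetical.  Only top-level TLV types are checked, 
not subtypes nested inside a value.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walks a raw TLV block and tracks seen types in a 256-bit mask.
 * Returns 1 if some type occurs more than once, 0 if all types are
 * unique, and -1 if the block is malformed (truncated TLV or
 * trailing bytes).
 */
int
pp_tlv_has_duplicates(const unsigned char *buf, size_t size)
{
    size_t    pos, n;
    uint32_t  seen[8] = { 0 };    /* 8 * 32 = 256 bits, one per type */
    uint32_t  bit;
    unsigned  type;

    pos = 0;

    while (size - pos >= 3) {
        type = buf[pos];
        n = ((size_t) buf[pos + 1] << 8) | buf[pos + 2];
        pos += 3;

        if (n > size - pos) {
            return -1;
        }

        bit = (uint32_t) 1 << (type & 31);

        if (seen[type >> 5] & bit) {
            return 1;
        }

        seen[type >> 5] |= bit;
        pos += n;
    }

    return (pos == size) ? 0 : -1;
}
```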

Maxim Dounin
