Sunday 19 March 2017

WCF and the infamous identity check failure


I've been planning to write this article for months. The primary reason I didn't do it earlier is the complexity of the topic. When you can't see the light at the end of the tunnel, you can find tons of excuses not to do something. :-) However, waiting another couple of months would surely have erased everything about this issue from my mind. So it became kinda like a "now or never" situation.

I've been using X.509 certificates for code-signing and as SSL/TLS certificates for years. The former was necessary to achieve in-browser elevated trust for Silverlight 5 applications and the latter for HTTP over SSL (HTTPS) and secure WCF communication overall. I bumped into all kinds of errors related to certs, but one of them was especially painful to understand / accept. Consider the following error message:
Identity check failed for outgoing message. The expected DNS identity of the remote endpoint was 'foo.bar.com' but the remote endpoint provided DNS claim 'foo'. If this is a legitimate remote endpoint, you can fix the problem by explicitly specifying DNS identity 'foo' as the Identity property of EndpointAddress when creating channel proxy.

Based on the error message, you can fix the problem on the client side by either
1) specifying an EndpointAddress whose Identity property is set to an instance of DnsEndpointIdentity with the claim value 'foo', like:

var endpointAddress = new EndpointAddress(new Uri("net.tcp://foo.bar.com/DummySvc"), new DnsEndpointIdentity("foo"));

2) using the DNS name that the certificate actually provides ('foo') in the EndpointAddress, if the server-side hosting exposes your endpoint via both addresses:

var endpointAddress = new EndpointAddress("net.tcp://foo/DummySvc");

Both approaches would work and you could achieve the same in your app.config too.
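For the app.config route, a minimal sketch of approach 1) could look like the fragment below. The endpoint name, the contract name IDummySvc and the choice of netTcpBinding are assumptions made up for illustration (the binding's security settings are left out); only the identity element with its dns child is the part that matters here.

```xml
<system.serviceModel>
  <client>
    <!-- name, binding and contract are hypothetical placeholders -->
    <endpoint name="DummySvcEndpoint"
              address="net.tcp://foo.bar.com/DummySvc"
              binding="netTcpBinding"
              contract="IDummySvc">
      <identity>
        <!-- the DNS claim the certificate is expected to provide -->
        <dns value="foo" />
      </identity>
    </endpoint>
  </client>
</system.serviceModel>
```

Approach 2) needs no identity element at all; you would simply change the address attribute to net.tcp://foo/DummySvc.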

But let’s dig a little bit deeper here. WCF (before .NET 4.6.1, but more on that later) complains about an identity mismatch depending on the certificate being used for server authentication / encryption / data integrity. To be more specific, if you have a certificate whose Subject Alternative Name (OID 2.5.29.7 or 2.5.29.17) contains more than a single DNS entry, then WCF will ignore all but the last one in the list.
Figure 1. An X.509 certificate with two DNS Name entries in its SAN field
It was not easy to track down the erroneous behavior in the framework due to the large number of components working together, but in the end I found it in the method X509Certificate2.GetNameInfo, which makes heavy use of the CryptoAPI. The code basically iterates through all the DNS entries in the SAN field of the certificate, but stores them in the same local variable, so each iteration of the loop overwrites the value stored by the previous iteration. Now that seems pretty odd from MS...
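To make the effect of that loop tangible, here is a small illustration. It is not the framework's source code; the class and method names are made up, and the SAN entries are passed in as plain strings instead of being read from a real certificate. It only contrasts "keep the last entry" with "accept any entry" (exact, case-insensitive comparison; wildcard matching is deliberately left out).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SanLoopSketch
{
    // Hypothetical stand-in for the old behavior: every DNS entry is written
    // to the same local variable, so only the last one survives the loop.
    public static string LastDnsNameOnly(IEnumerable<string> sanDnsNames)
    {
        string dnsName = null;
        foreach (string entry in sanDnsNames)
        {
            dnsName = entry; // overwrites the value of the previous iteration
        }
        return dnsName;
    }

    // Alternative: the expected name matches if any of the entries equals it.
    public static bool MatchesAnyDnsName(IEnumerable<string> sanDnsNames, string expected)
    {
        return sanDnsNames.Any(entry =>
            string.Equals(entry, expected, StringComparison.OrdinalIgnoreCase));
    }

    public static void Main()
    {
        // Same order as in the error message scenario: 'foo' comes last.
        var san = new[] { "foo.bar.com", "foo" };
        const string expected = "foo.bar.com";

        // Only 'foo' is left to compare against, so the check fails.
        bool lastOnly = string.Equals(
            LastDnsNameOnly(san), expected, StringComparison.OrdinalIgnoreCase);
        Console.WriteLine(lastOnly);                         // False

        // Looking at every entry finds 'foo.bar.com'.
        Console.WriteLine(MatchesAnyDnsName(san, expected)); // True
    }
}
```

With the entries in this order, the first check is the one that corresponds to the identity check failure quoted at the top of this post.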

In contrast to this behavior, all major browsers respect all DNS entries. So can you write code to support all kinds of DNS entry permutations? Well, you can't... at least not if you're running on anything older than .NET 4.6.1. All you can do is document your software well and highlight this behavior, so that an admin trying to configure e.g. HTTPS for your services requests and uses a properly created certificate.

However, there is hope for those running at least .NET 4.6.1, as this problem has been addressed by MS as described in Retargeting Changes in the .NET Framework 4.6.1. Keep in mind, however, that if you target an older framework version but run under at least 4.6.1, you have to opt in by adding the compatibility setting described there to your app.config. For more information on how the target framework can differ from the one actually used to run your application, see an older post of mine.
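As a sketch, the opt-in for an app that targets an older framework version but runs on 4.6.1 or later would look like the fragment below. The switch name is written down as I remember it from Microsoft's mitigation notes for this change, so treat it as an assumption and verify it against the retargeting document before relying on it.

```xml
<configuration>
  <runtime>
    <!-- "false" means: do NOT disable the handling of multiple DNS entries,
         i.e. opt in to the fixed behavior (switch name to be verified) -->
    <AppContextSwitchOverrides
      value="Switch.System.IdentityModel.DisableMultipleDNSEntriesInSANCertificate=false" />
  </runtime>
</configuration>
```

Apps that already target 4.6.1 or later should get the new behavior without this entry.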

Establishing a secure channel between a WCF client and service is a very(!) complex process, including but not limited to phases like SSL / TLS initialization, server / client authentication and certificate validation. Each of these topics deserves a book of its own, and of course there are great resources out there. The point I'm trying to make here is that all this complexity turns into a real pain for the developer when it comes to debugging such problems. Building up a real understanding of the underlying security concepts is a great way to make investigations faster, and a good start might be MSDN and WCF Security Guidance. Otherwise, you might end up digging through code based on call stacks like this:
 
0:000> !clrstack
OS Thread Id: 0x49dc (0)
<some columns were removed for brevity>
System.ServiceModel.Security.IdentityVerifier.EnsureIdentity(System.ServiceModel.EndpointAddress, System.IdentityModel.Policy.AuthorizationContext, System.String)
System.ServiceModel.Security.IdentityVerifier.EnsureOutgoingIdentity(System.ServiceModel.EndpointAddress, System.Uri, System.IdentityModel.Policy.AuthorizationContext)
System.ServiceModel.Channels.SslStreamSecurityUpgradeInitiator.ValidateRemoteCertificate(System.Object, System.Security.Cryptography.X509Certificates.X509Certificate, System.Security.Cryptography.X509Certificates.X509Chain, System.Net.Security.SslPolicyErrors)
System.Net.Security.SecureChannel.VerifyRemoteCertificate(System.Net.Security.RemoteCertValidationCallback)
System.Net.Security.SslState.CompleteHandshake()
System.Net.Security.SslState.CheckCompletionBeforeNextReceive(System.Net.Security.ProtocolToken, System.Net.AsyncProtocolRequest)
System.Net.Security.SslState.ProcessReceivedBlob(Byte[], Int32, System.Net.AsyncProtocolRequest)
System.Net.Security.SslState.StartReceiveBlob(Byte[], System.Net.AsyncProtocolRequest)
System.Net.Security.SslState.ProcessReceivedBlob(Byte[], Int32, System.Net.AsyncProtocolRequest)
System.Net.Security.SslState.StartReceiveBlob(Byte[], System.Net.AsyncProtocolRequest)
System.Net.Security.SslState.ProcessReceivedBlob(Byte[], Int32, System.Net.AsyncProtocolRequest)
System.Net.Security.SslState.StartReceiveBlob(Byte[], System.Net.AsyncProtocolRequest)
System.Net.Security.SslState.ForceAuthentication(Boolean, Byte[], System.Net.AsyncProtocolRequest)
System.Net.Security.SslState.ProcessAuthentication(System.Net.LazyAsyncResult)
System.ServiceModel.Channels.SslStreamSecurityUpgradeInitiator.OnInitiateUpgrade(System.IO.Stream, System.ServiceModel.Security.SecurityMessageProperty ByRef)
System.ServiceModel.Channels.StreamSecurityUpgradeInitiatorBase.InitiateUpgrade(System.IO.Stream)
System.ServiceModel.Channels.ConnectionUpgradeHelper.InitiateUpgrade(System.ServiceModel.Channels.StreamUpgradeInitiator, System.ServiceModel.Channels.IConnection ByRef, System.ServiceModel.Channels.ClientFramingDecoder, System.ServiceModel.IDefaultCommunicationTimeouts, System.Runtime.TimeoutHelper ByRef)
System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.SendPreamble(System.ServiceModel.Channels.IConnection, System.ArraySegment`1, System.Runtime.TimeoutHelper ByRef)
System.ServiceModel.Channels.ClientFramingDuplexSessionChannel+DuplexConnectionPoolHelper.AcceptPooledConnection(System.ServiceModel.Channels.IConnection, System.Runtime.TimeoutHelper ByRef)
System.ServiceModel.Channels.ConnectionPoolHelper.EstablishConnection(System.TimeSpan)
System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.OnOpen(System.TimeSpan)
System.ServiceModel.Channels.CommunicationObject.Open(System.TimeSpan)
System.ServiceModel.Channels.ServiceChannel.OnOpen(System.TimeSpan)
System.ServiceModel.Channels.CommunicationObject.Open(System.TimeSpan)
System.ServiceModel.Channels.ServiceChannel+CallOpenOnce.System.ServiceModel.Channels.ServiceChannel.ICallOnce.Call(System.ServiceModel.Channels.ServiceChannel, System.TimeSpan)
System.ServiceModel.Channels.ServiceChannel+CallOnceManager.CallOnce(System.TimeSpan, CallOnceManager)
System.ServiceModel.Channels.ServiceChannel.EnsureOpened(System.TimeSpan)
System.ServiceModel.Channels.ServiceChannel.Call(System.String, Boolean, System.ServiceModel.Dispatcher.ProxyOperationRuntime, System.Object[], System.Object[], System.TimeSpan)
System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(System.Runtime.Remoting.Messaging.IMethodCallMessage, System.ServiceModel.Dispatcher.ProxyOperationRuntime)
System.ServiceModel.Channels.ServiceChannelProxy.Invoke(System.Runtime.Remoting.Messaging.IMessage)
System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(System.Runtime.Remoting.Proxies.MessageData ByRef, Int32)

This might help you understand the smaller pieces, but you won't see the big picture for quite some time. BTW, the topmost frame of the call stack is interesting. The EndpointAddress parameter of EnsureIdentity() comes from the client proxy and is used to get the expected DNS claim, which has to match one of the DNS claims obtained from the X.509 certificate through the AuthorizationContext parameter.
 
To be honest, I'm quite happy to see this fixed. Even though I have some scripts that use OpenSSL to generate test certificates with custom SAN entries very quickly, it's much better to get this issue off my black list.