> I'd be amazed if Microsoft held any important keys outside of HSMs. I think the incident you're referring to is where they appear to have been hacked by China (maybe the Chinese govt) and they blamed a "validation error in Microsoft code". HSMs help a bit but if the code that accesses them is insecure then you can still sign bad things. The HSM just means you don't have to do a key rotation to recover.
One of the few things that seem unambiguously worded in Microsoft's disclosure on this topic is that the hackers did actually have a copy of the key itself:
> was forging Azure AD tokens using an acquired Microsoft account (MSA) consumer signing key
> Storm-0558 acquired an inactive MSA consumer signing key and used it to forge authentication tokens for Azure AD enterprise and MSA consumer to access OWA and Outlook.com. All MSA keys active prior to the incident – including the actor-acquired MSA signing key – have been invalidated.
> The method by which the actor acquired the key is a matter of ongoing investigation.
https://www.microsoft.com/en-us/security/blog/2023/07/14/ana...
That sure sounds like something that wouldn't just happen if these keys lived in an HSM. (Presumably Microsoft data centers have access controls...?)
My guess is that Microsoft chose not to use HSMs for these keys for cost reasons. They see a high volume of operations, and rolling out new keys is probably annoying because tons of different pieces validate against them. Without an HSM you can use all the compute power in a server to perform operations with the key (much lower $/sign/s), and if you need to scale up you can simply copy the private key to more instances.
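To make the trade-off concrete, here's a rough Python sketch. It's purely my illustration and not anything from Microsoft's write-up. HMAC from the standard library stands in for the real asymmetric token signature, and `InMemorySigner` / `OpaqueSigner` are hypothetical names.

```python
import hashlib
import hmac
import secrets


class InMemorySigner:
    """Software signing: the key bytes sit in ordinary process memory."""

    def __init__(self, key: bytes):
        self.key = key  # readable by anything that can read this process

    def sign(self, message: bytes) -> bytes:
        return hmac.new(self.key, message, hashlib.sha256).digest()


class OpaqueSigner:
    """Toy stand-in for an HSM: exposes sign(), never the key.

    Python cannot truly hide the key; a real HSM enforces this in hardware.
    """

    def __init__(self):
        key = secrets.token_bytes(32)
        # The closure keeps the key out of the object's attributes.
        self._sign = lambda msg: hmac.new(key, msg, hashlib.sha256).digest()

    def sign(self, message: bytes) -> bytes:
        # Still signs whatever the calling code asks for.
        return self._sign(message)


# Scaling up software signing means copying the same key to every instance.
key = secrets.token_bytes(32)
fleet = [InMemorySigner(key) for _ in range(3)]

# Whoever obtains one instance's memory has the key and can forge offline.
stolen = fleet[0].key
forged = hmac.new(stolen, b"token for attacker", hashlib.sha256).digest()
forged_ok = hmac.compare_digest(forged, fleet[1].sign(b"token for attacker"))

hsm = OpaqueSigner()
has_key_attr = hasattr(hsm, "key")
```

Every copy in `fleet` is another place the key can leak from. The opaque version has nothing to copy out, but as the parent comment notes, it will still sign bad things if the code calling it is insecure.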