This blog post assumes that you are familiar with the following concepts, at least at a general level:
- public key cryptography
- ASN.1
- base64 encoding
If you are not, use your favorite search engine to find some introductory articles.
While no longer leading edge, RSA keys continue to be widely used for ssh. Let’s look into what the key files actually contain and how the fingerprints are calculated. I’ll use standard Linux tools to do that.
Before doing that we need to recall a couple of storage formats that predate XML, JSON, YAML and the other formats most programmers are more likely to see these days.
- BER: Basic Encoding Rules describe how to encode and store ASN.1 data in binary form
- DER: Distinguished Encoding Rules are a special variant of BER. While BER does not generally guarantee a unique binary encoding for given ASN.1 data, DER does. Think of leading zeros as the most obvious difference. A unique binary encoding is important if we want to compare fingerprints.
- PEM: Privacy-Enhanced Mail is a format to store and send binary data (particularly cryptographic keys and certificates) in ASCII only. It encodes the binary data using base64 and adds a header and a footer line. Look into any PEM-encoded file and you will immediately understand the principle.
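As a rough illustration of the PEM principle, the sketch below wraps five hand-picked DER bytes (a SEQUENCE containing the single INTEGER 0) in PEM armor and unwraps them again. The label EXAMPLE and the variable names are made up for this illustration; real key files use labels such as RSA PRIVATE KEY.

```shell
# DER bytes 30 03 02 01 00: a SEQUENCE holding the single INTEGER 0.
# Octal escapes are used because not every printf understands \x.
b64=$(printf '\060\003\002\001\000' | base64)

# PEM is just that base64 text between a header and a footer line.
# The label "EXAMPLE" is hypothetical.
pem=$(printf '%s\n%s\n%s\n' '-----BEGIN EXAMPLE-----' "$b64" '-----END EXAMPLE-----')
echo "$pem"

# Dropping the armor lines and decoding returns the original bytes (as hex).
roundtrip=$(echo "$pem" | grep -v -- '-----' | base64 -d | od -An -v -tx1 | tr -d ' \n')
echo "$roundtrip"
```

The middle line of the output is MAMCAQA=, and the last line shows the bytes 3003020100 again.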
So let’s start with openssh private key files. While their purpose is different, host key pairs and user key pairs are encoded in exactly the same way. So for the purpose of this blog post we can just talk about the private key (file) and the public key (file). Just don’t do that when you ask someone else to send a public key or verify a fingerprint: then you should always be clear on whether you mean the host key or the user key.
The private key file is a PEM-encoded file. Note that since openssh 7.8, released in August 2018, a new storage format is the default. Because most of my keys are older, I just handle the old format for now. For simplicity I only look at private key files which are not protected by a passphrase, i.e. they are not encrypted. While openssh does not use any file extension, if you create a user key pair using the Amazon EC2 console, the file gets the extension .pem. As usual, Linux tools don’t take the extension into account.
$ cat id_rsa_test1
-----BEGIN RSA PRIVATE KEY-----
MIIEpgIBAAKCAQEAyPRkF0N/BdSRFy/+YBDfPd4NbrdBhRJUDAqizMZLcNnxHKVY
30j/b1b87ZpHVEmI8MFCUQiTaN5AdgPP1m2MI4cz3lDsKsuxDPVpzuxOWUMQpH/i
AwX4Eh51J8YT+scrg5KoDvrwsk6FF86tnuqsxyFwgP23WDEUQPlOr2DFehJlpzgw
vf5WzhKPd4VmZU6tIEL4+QBTmvnYqvCa2IWUt354h/znByer8RmGhdIvnppSSLLa
...
3fSVevllbKfe9z5Wd82Yvb33aqxi6Hpjm+3UySoHTY7/295yWQCzRtJFAum9WxAf
ZxXu9wD1AoGBAMM/ts1U5jUFxvuTOR3TdrzZDIIHW1yjxAFUrP5ML2nxGMYvbA9S
ctZOJ3OUKL/wdsbRD8a3xeCWWOgkNPl4HOSUePjHAaji0K/0lBKaLHPler874ELj
aBOQgTrLS6/A4uYgXB2iFUKskecjDUOrI+/BRw/SolxIbkwM0wPk7KNB
-----END RSA PRIVATE KEY-----
Let’s decode that:
$ grep -v "RSA PRIVATE" id_rsa_test1 | base64 -d | hexdump -C
00000000 30 82 04 a6 02 01 00 02 82 01 01 00 c8 f4 64 17 |0.............d.|
00000010 43 7f 05 d4 91 17 2f fe 60 10 df 3d de 0d 6e b7 |C...../.`..=..n.|
00000020 41 85 12 54 0c 0a a2 cc c6 4b 70 d9 f1 1c a5 58 |A..T.....Kp....X|
00000030 df 48 ff 6f 56 fc ed 9a 47 54 49 88 f0 c1 42 51 |.H.oV...GTI...BQ|
00000040 08 93 68 de 40 76 03 cf d6 6d 8c 23 87 33 de 50 |..h.@v...m.#.3.P|
00000050 ec 2a cb b1 0c f5 69 ce ec 4e 59 43 10 a4 7f e2 |.*....i..NYC....|
00000060 03 05 f8 12 1e 75 27 c6 13 fa c7 2b 83 92 a8 0e |.....u'....+....|
00000070 fa f0 b2 4e 85 17 ce ad 9e ea ac c7 21 70 80 fd |...N........!p..|
...
OK, that doesn’t look very helpful.
But if we know that the data is ASN.1 in DER encoding…
$ grep -v "RSA PRIVATE" id_rsa_test1 | base64 -d | openssl asn1parse -inform DER
0:d=0 hl=4 l=1190 cons: SEQUENCE
4:d=1 hl=2 l= 1 prim: INTEGER :00
7:d=1 hl=4 l= 257 prim: INTEGER :C8F46417437F05D491172FFE6010DF3DDE0D6EB7418512540C0AA2CCC64B70D9F11CA558DF48FF6F56FCED9A47544988F0C14251089368DE407603CFD66D8C238733DE50EC2ACBB10CF569CEEC4E594310A47FE20305F8121E7527C613FAC72B8392A80EFAF0B24E8517CEAD9EEAACC7217080FDB758311440F94EAF60C57A1265A73830BDFE56CE128F778566654EAD2042F8F900539AF9D8AAF09AD88594B77E7887FCE70727ABF1198685D22F9E9A5248B2DA98870BB65448BBE113406770CFAC6FF7AD7691A08CB74FCD37125ECBA93A64639B764CE362572E442613A4F52CD73DC898BD0476EF29383983D5D71EBEFE20C37153D225A118F5778E23363B
268:d=1 hl=2 l= 3 prim: INTEGER :010001
273:d=1 hl=4 l= 257 prim: INTEGER :B34064C3E79E37BA2C0E0DEE8F0C9DC5D6FC0E6583194B871312ADB391B21BA11C951BDCD19790DA6532CA53130B5968C0B0F311CB5F8A6CADFD575B1E7FAB40C9CA77A5A1746F2BA84E852A61D9658010295C3563D3003C5154D9BCB377C00FCD9695387F9912C43DAE45DAD365ABE718A9FD35D444E3CA98468A6CA01AA6CE8AAEC239051E8662995B2BBE9B4F2ED14289ABD642041F28573414E67C1F2A9F7A1C084492A084FF8C488D5053AAD2EF869A5039F8BFE534B589027912BB79DD282ADA78028B9853D534DB82E55AFBDD6575A7B30A3755C42DD03E16281BE15D2962A695607C7846C5DA2773A1D6D9FE39FE78CDFBACD70B6D7ED0D8755072E9
534:d=1 hl=3 l= 129 prim: INTEGER :F2D7124D7ACAAC2E430C6492D946750EE66D508E5714FBD417D5E98F28328F749E76236166D1540F5DDB391B93653A3BB5326E5165B8BFCEDB5D62301895FA8A98CF0572A996838C237323C911FAF6E8843F3E37B20677A4C19728C6AB2F56D12B917F22CDD434DAABA597F16655F4B1EE4BB6160C59DF3CF4337C2BDB86E5DF
666:d=1 hl=3 l= 129 prim: INTEGER :D3D83DB6129BC85CC8DEAF64A403FBC71CC424E17C3AD41D912AF21C6404F46CCD54812A5EE2766D7B0E94903AE11DFDADD67FDCFD2DF45F103723DA959E7D5834003B591E5CA9F646A8C9C9F13E331A977810350555AE4C78E3AB0631051DB50B08CBC7D176165E10F752534244E0571A7FF72E11A463F7E1BAC7B51399A325
798:d=1 hl=3 l= 129 prim: INTEGER :A4C664A7E81ADF8C2078A741B1668A854ABB7FFEA57E1A86468A2289BDD7D8D963B07BBF5A99CD3504157D81859919536C56C4DE3C6C88D1DEAD55B396EB256EA7D3493A0D7290DE252BBA6B73E4DB66D85D65653B4A0222EC2D1A40FBE50A3EB2166EB2FA00F4C02FDA13E87BECF5354AA15AF348FC2E6AD8B49A9BD3C08BF7
930:d=1 hl=3 l= 129 prim: INTEGER :A704645AE8BEE32FABBA4D43A63FF1BFE0810FA6AA7FE2FDD096B03D0BEA101EBB9F751A47A679C204F3D0D30968B4716D1DA0BF44E877327FA149662AF1C256C8E0A9E9B013547872EADDF4957AF9656CA7DEF73E5677CD98BDBDF76AAC62E87A639BEDD4C92A074D8EFFDBDE725900B346D24502E9BD5B101F6715EEF700F5
1062:d=1 hl=3 l= 129 prim: INTEGER :C33FB6CD54E63505C6FB93391DD376BCD90C82075B5CA3C40154ACFE4C2F69F118C62F6C0F5272D64E27739428BFF076C6D10FC6B7C5E09658E82434F9781CE49478F8C701A8E2D0AFF494129A2C73E57ABF3BE042E3681390813ACB4BAFC0E2E6205C1DA21542AC91E7230D43AB23EFC1470FD2A25C486E4C0CD303E4ECA341
… we see that the file contains a sequence of 9 integers.
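The hl (header length) and l (content length) columns can be reproduced by hand from the first bytes of the hex dump. The following is a minimal sketch of the DER length rule only, not a full parser; der_header is a hypothetical helper name, and it expects the leading bytes of one element as separate hex arguments.

```shell
# der_header TAG LEN [MORE...]: print header length (hl) and content
# length (l) of one DER element, given its first bytes as hex strings.
der_header() {
    tag=$1
    first=$((0x$2))
    shift 2
    if [ "$first" -lt 128 ]; then
        # Short form: the length byte is the content length itself.
        echo "tag=$tag hl=2 l=$first"
    else
        # Long form: the low 7 bits say how many length bytes follow,
        # and those bytes hold the length as a big-endian number.
        n=$((first & 127))
        len=0
        i=0
        while [ "$i" -lt "$n" ]; do
            len=$((len * 256 + 0x$1))
            shift
            i=$((i + 1))
        done
        echo "tag=$tag hl=$((2 + n)) l=$len"
    fi
}

der_header 30 82 04 a6   # outer SEQUENCE -> tag=30 hl=4 l=1190
der_header 02 01 00      # first INTEGER  -> tag=02 hl=2 l=1
der_header 02 82 01 01   # modulus        -> tag=02 hl=4 l=257
```

Tag 30 is a SEQUENCE and tag 02 an INTEGER, so the three results line up with the first three lines of the asn1parse output.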
And openssl even has support for showing us the semantics of those integers:
$ grep -v "RSA PRIVATE" id_rsa_test1 | base64 -d | openssl rsa -inform DER -text
Private-Key: (2048 bit)
modulus:
00:c8:f4:64:17:43:7f:05:d4:91:17:2f:fe:60:10:
...
1e:be:fe:20:c3:71:53:d2:25:a1:18:f5:77:8e:23:
36:3b
publicExponent: 65537 (0x10001)
privateExponent:
00:b3:40:64:c3:e7:9e:37:ba:2c:0e:0d:ee:8f:0c:
...
fe:39:fe:78:cd:fb:ac:d7:0b:6d:7e:d0:d8:75:50:
72:e9
prime1:
00:f2:d7:12:4d:7a:ca:ac:2e:43:0c:64:92:d9:46:
...
ab:a5:97:f1:66:55:f4:b1:ee:4b:b6:16:0c:59:df:
3c:f4:33:7c:2b:db:86:e5:df
prime2:
00:d3:d8:3d:b6:12:9b:c8:5c:c8:de:af:64:a4:03:
...
10:f7:52:53:42:44:e0:57:1a:7f:f7:2e:11:a4:63:
f7:e1:ba:c7:b5:13:99:a3:25
exponent1:
00:a4:c6:64:a7:e8:1a:df:8c:20:78:a7:41:b1:66:
...
2f:da:13:e8:7b:ec:f5:35:4a:a1:5a:f3:48:fc:2e:
6a:d8:b4:9a:9b:d3:c0:8b:f7
exponent2:
00:a7:04:64:5a:e8:be:e3:2f:ab:ba:4d:43:a6:3f:
...
4d:8e:ff:db:de:72:59:00:b3:46:d2:45:02:e9:bd:
5b:10:1f:67:15:ee:f7:00:f5
coefficient:
00:c3:3f:b6:cd:54:e6:35:05:c6:fb:93:39:1d:d3:
...
91:e7:23:0d:43:ab:23:ef:c1:47:0f:d2:a2:5c:48:
6e:4c:0c:d3:03:e4:ec:a3:41
...
writing RSA key
-----BEGIN RSA PRIVATE KEY-----
...
These are the last 8 numbers we have seen in the ASN.1 decoding above. The first one was zero. It is the version number of the PKCS#1 RSAPrivateKey structure (0 for a regular two-prime key), not an indicator that there is no encryption.
And of course we could have had the same result a bit more easily, because openssl assumes PEM by default:
$ openssl rsa -text <id_rsa_test1
For efficiency reasons openssh does not store just the private exponent as required by the original RSA algorithm, but also a couple of precalculated values used by the implementation. Without going into the cryptography here, it is enough to know that the first two of these numbers make up the public key, while the remaining six need to be kept private.
Yes, that means that the whole key pair is contained in the private key file. You can lose/ignore/delete your public key file and always recreate it from the private key file:
$ ssh-keygen -y -f id_rsa_test1
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDI9GQXQ38F1JEXL/5gEN893g1ut0GFElQMCqLMxktw2fEcpVjfSP9vVvztmkdUSYjwwUJRCJNo3kB2A8/WbYwjhzPeUOwqy7EM9WnO7E5ZQxCkf+IDBfgSHnUnxhP6xyuDkqgO+vCyToUXzq2e6qzHIXCA/bdYMRRA+U6vYMV6EmWnODC9/lbOEo93hWZlTq0gQvj5AFOa+diq8JrYhZS3fniH/OcHJ6vxGYaF0i+emlJIstqYhwu2VEi74RNAZ3DPrG/3rXaRoIy3T803El7LqTpkY5t2TONiVy5EJhOk9SzXPciYvQR27yk4OYPV1x6+/iDDcVPSJaEY9XeOIzY7
If you compare it with the original public key file id_rsa_test1.pub, the only difference is an additional comment at the end. The key works the same regardless of the comment. This also explains why the AWS EC2 console only knows about .pem files and never mentions a public key file. It just derives the public key when needed.
This brings us to the next question: what is the format of an openssh public key file? Obviously it is ASCII, and it contains 3 fields:
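Because the comment is not part of the key, comparing two public key lines comes down to comparing their first two space-separated fields. A small sketch, where same_key is a hypothetical helper and the sample lines use a shortened, non-functional base64 field:

```shell
# same_key LINE1 LINE2: succeed if both public key lines carry the same
# key type and base64 data, ignoring any trailing comment.
same_key() {
    a=$(printf '%s\n' "$1" | cut -d' ' -f1,2)
    b=$(printf '%s\n' "$2" | cut -d' ' -f1,2)
    [ "$a" = "$b" ]
}

# Shortened sample lines, for illustration only (not usable keys).
with_comment='ssh-rsa AAAAB3NzaC1yc2EAAAADAQAB foo@bar'
without_comment='ssh-rsa AAAAB3NzaC1yc2EAAAADAQAB'

if same_key "$with_comment" "$without_comment"; then
    echo "same key"
else
    echo "different keys"
fi
```

With real files, the two arguments would be the output of ssh-keygen -y -f id_rsa_test1 and the content of id_rsa_test1.pub.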
- the type of key
- some base64 encoded data
- an optional comment
So what does the base64 part contain?
$ awk '{print $2}' id_rsa_test1.pub | base64 -d | hexdump -C
00000000 00 00 00 07 73 73 68 2d 72 73 61 00 00 00 03 01 |....ssh-rsa.....|
00000010 00 01 00 00 01 01 00 c8 f4 64 17 43 7f 05 d4 91 |.........d.C....|
00000020 17 2f fe 60 10 df 3d de 0d 6e b7 41 85 12 54 0c |./.`..=..n.A..T.|
00000030 0a a2 cc c6 4b 70 d9 f1 1c a5 58 df 48 ff 6f 56 |....Kp....X.H.oV|
00000040 fc ed 9a 47 54 49 88 f0 c1 42 51 08 93 68 de 40 |...GTI...BQ..h.@|
00000050 76 03 cf d6 6d 8c 23 87 33 de 50 ec 2a cb b1 0c |v...m.#.3.P.*...|
00000060 f5 69 ce ec 4e 59 43 10 a4 7f e2 03 05 f8 12 1e |.i..NYC.........|
00000070 75 27 c6 13 fa c7 2b 83 92 a8 0e fa f0 b2 4e 85 |u'....+.......N.|
00000080 17 ce ad 9e ea ac c7 21 70 80 fd b7 58 31 14 40 |.......!p...X1.@|
00000090 f9 4e af 60 c5 7a 12 65 a7 38 30 bd fe 56 ce 12 |.N.`.z.e.80..V..|
000000a0 8f 77 85 66 65 4e ad 20 42 f8 f9 00 53 9a f9 d8 |.w.feN. B...S...|
000000b0 aa f0 9a d8 85 94 b7 7e 78 87 fc e7 07 27 ab f1 |.......~x....'..|
000000c0 19 86 85 d2 2f 9e 9a 52 48 b2 da 98 87 0b b6 54 |..../..RH......T|
000000d0 48 bb e1 13 40 67 70 cf ac 6f f7 ad 76 91 a0 8c |H...@gp..o..v...|
000000e0 b7 4f cd 37 12 5e cb a9 3a 64 63 9b 76 4c e3 62 |.O.7.^..:dc.vL.b|
000000f0 57 2e 44 26 13 a4 f5 2c d7 3d c8 98 bd 04 76 ef |W.D&...,.=....v.|
00000100 29 38 39 83 d5 d7 1e be fe 20 c3 71 53 d2 25 a1 |)89...... .qS.%.|
00000110 18 f5 77 8e 23 36 3b |..w.#6;|
This is not ASN.1 but the SSH wire format for public keys (the ssh-rsa key format from RFC 4253, as used by openssh), so this time we cannot use the ASN.1 decoder. Still, it’s pretty easy to read, so I skip writing a decoder:
- a four byte big-endian length: 7 Bytes
- 7 ASCII Bytes reading “ssh-rsa”
- a four byte big-endian length: 3 Bytes
- 01 00 01: This is the public exponent we have seen above in ASN.1 of the private key
- a four byte big-endian length: 257 Bytes
- the 257 bytes of public modulus we have seen above in the ASN.1 of the private key
This all matches nicely. So why does it take 257 bytes to store a 2048-bit modulus? Both ASN.1 INTEGERs and the SSH multi-precision integer type store integers as signed two’s complement values. If the most significant bit of a positive integer is set, a leading zero byte has to be inserted to mark it as positive.
The last question is how the fingerprints are calculated. The openssh fingerprint always refers to the public key. (After all, private keys should not be shared, so comparing their fingerprints makes little sense.) The fingerprint is simply a hash of the base64-decoded data field of the public key, i.e. the binary data shown in the hex dump above:
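For readers who do want a decoder, the length-prefixed layout can be walked with a few lines of shell. This is only a sketch under some assumptions: decode_fields is a hypothetical helper that works on a hex string, every field is assumed to be non-empty, and the toy input uses a made-up 16-bit "modulus" 0xc8f4 instead of a real 2048-bit one.

```shell
# Toy blob in the layout described above: "ssh-rsa", exponent 01 00 01,
# and the made-up modulus c8 f4. Its high bit is set (0xc8 >= 0x80),
# so it is stored with a leading zero byte, just like the real modulus.
blob_hex=$(printf '\000\000\000\007ssh-rsa\000\000\000\003\001\000\001\000\000\000\003\000\310\364' \
    | od -An -v -tx1 | tr -d ' \n')

# decode_fields HEX: print "<length> <field bytes as hex>" for each field.
decode_fields() {
    hex=$1
    while [ -n "$hex" ]; do
        # First 8 hex digits: four-byte big-endian length.
        lenhex=$(printf '%s\n' "$hex" | cut -c1-8)
        len=$((0x$lenhex))
        # Next 2*len hex digits: the field content.
        field=$(printf '%s\n' "$hex" | cut -c9-$((8 + 2 * len)))
        printf '%s %s\n' "$len" "$field"
        # Whatever is left belongs to the following fields.
        hex=$(printf '%s\n' "$hex" | cut -c$((9 + 2 * len))-)
    done
}

decode_fields "$blob_hex"
```

For the toy blob this prints three lines: 7 7373682d727361 (the ASCII bytes of ssh-rsa), 3 010001 and 3 00c8f4. Feeding it the output of awk '{print $2}' id_rsa_test1.pub | base64 -d | od -An -v -tx1 | tr -d ' \n' should list the same three fields for the real key, with lengths 7, 3 and 257.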
$ awk '{print $2}' id_rsa_test1.pub | base64 -d | md5sum
c5cada07a97ff5b50a9ca3d0e76266b2 -
$ ssh-keygen -l -E md5 -f id_rsa_test1.pub
2048 MD5:c5:ca:da:07:a9:7f:f5:b5:0a:9c:a3:d0:e7:62:66:b2 foo@bar (RSA)
To get the same format using colon separators:
$ awk '{print $2}' id_rsa_test1.pub | base64 -d | openssl md5 -c
(stdin)= c5:ca:da:07:a9:7f:f5:b5:0a:9c:a3:d0:e7:62:66:b2
Of course MD5 is considered obsolete, so let’s do it using SHA256 (which has been the default in openssh for a while). Here we need to know that openssh does not show a hex string, but the base64-encoded digest with the trailing = padding removed:
$ awk '{print $2}' id_rsa_test1.pub | base64 -d | openssl sha256 -binary | base64
nIw6/G+YUBiao6JPufR1JX+Mq/+kh44eYcMnmSrtSGw=
$ ssh-keygen -l -E sha256 -f id_rsa_test1.pub
2048 SHA256:nIw6/G+YUBiao6JPufR1JX+Mq/+kh44eYcMnmSrtSGw foo@bar (RSA)
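The remaining differences between the raw tool output and what ssh-keygen prints are pure formatting. As a small sketch, the two digests from the commands above can be turned into the ssh-keygen style like this (the variable names are arbitrary):

```shell
# Digest values copied from the command output above.
md5_hex='c5cada07a97ff5b50a9ca3d0e76266b2'
sha256_b64='nIw6/G+YUBiao6JPufR1JX+Mq/+kh44eYcMnmSrtSGw='

# MD5 style: a colon after every byte (two hex digits), except the last.
md5_fp="MD5:$(printf '%s\n' "$md5_hex" | sed 's/../&:/g; s/:$//')"

# SHA256 style: the same base64 text without the trailing '=' padding.
sha256_fp="SHA256:$(printf '%s\n' "$sha256_b64" | tr -d '=')"

echo "$md5_fp"
echo "$sha256_fp"
```

Both lines then match the fingerprint column printed by ssh-keygen -l.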
AWS EC2 users might have wondered why the fingerprints shown on the EC2 console never match anything shown by openssh. The reason is that, for a key pair created in the EC2 console, AWS does not calculate the fingerprint from the public key data used by openssh, but from the private key in PKCS#8 format. Well, that’s a bit of a special case: you are indeed sharing your private key with AWS.
$ openssl pkcs8 -in AWS1.pem -nocrypt -topk8 -outform DER | openssl sha1 -c
(stdin)= 08:44:96:40:4a:35:65:5a:81:a2:fd:b1:cf:11:fa:82:46:77:98:2a
This post has already gotten too long, so I’ll leave it to the reader to dig deeper into what this PKCS#8 structure contains. The following 2 commands should be enough of a hint for the curious reader:
$ openssl pkcs8 -in AWS1.pem -nocrypt -topk8 | openssl asn1parse
$ openssl rsa -outform DER