Proposal Details

Right now, mime/type.go includes what seems to be a somewhat arbitrary list of built-in types:

var builtinTypesLower = map[string]string{
    ".avif": "image/avif",
    ".css":  "text/css; charset=utf-8",
    ".gif":  "image/gif",
    ".htm":  "text/html; charset=utf-8",
    ".html": "text/html; charset=utf-8",
    ".jpeg": "image/jpeg",
    ".jpg":  "image/jpeg",
    ".js":   "text/javascript; charset=utf-8",
    ".json": "application/json",
    ".mjs":  "text/javascript; charset=utf-8",
    ".pdf":  "application/pdf",
    ".png":  "image/png",
    ".svg":  "image/svg+xml",
    ".wasm": "application/wasm",
    ".webp": "image/webp",
    ".xml":  "text/xml; charset=utf-8",
}

I think some guidance on what should be included in this table would be good, so that consumers of the package are not caught out by arbitrary gaps. In the meantime I will submit a PR that incorporates all of the MDN-defined "Common Types" (which, I have to admit, is also arbitrary, but at least covers more common use cases).

Comment From: seankhliao

What's included is based on WHATWG MIME sniffing (https://mimesniff.spec.whatwg.org/). This gives us a clear spec to adhere to, rather than an arbitrary list.

Comment From: AidanWelch

@seankhliao Wow, thanks for the quick response, but I'm confused about where that spec lists specifically the MIME types in builtinTypes. From my understanding, it would be more relevant to net/http's DetectContentType, which actually sniffs. For mime's ExtensionsByType and TypeByExtension, don't we assume that the file extension/type is truthful and try to determine the most likely type from it, whereas sniffing wouldn't even care about the given type or extension? (And so sniffing would give most (all?) plaintext types, for example, the same extension/type.)

Comment From: gopherbot

Change https://go.dev/cl/614376 mentions this issue: mime: extend "builtinTypes" to include a more complete list of common types

Comment From: neild

Quoting seankhliao: "What's included is based on WHATWG MIME sniffing (https://mimesniff.spec.whatwg.org/). This gives us a clear spec to adhere to, rather than an arbitrary list."

net/http.DetectContentType is based on WHATWG's spec; this proposal is for the type/extension mapping used by mime.TypeByExtension and other functions in the mime package when the system MIME database (/etc/mime.types or similar) isn't present.
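To make the distinction concrete, here is a minimal sketch contrasting the two lookups. The helper names sniff and byExt are made up for illustration; the result of the extension lookup can vary with the system MIME database.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
)

// sniff reports the type net/http infers from the content alone.
func sniff(content []byte) string {
	return http.DetectContentType(content)
}

// byExt reports the type the mime package associates with an extension.
// It returns "" when neither the system database nor the built-in table
// knows the extension.
func byExt(ext string) string {
	return mime.TypeByExtension(ext)
}

func main() {
	css := []byte("body { color: red; }")
	fmt.Println(sniff(css))    // content sniffing only sees plain text here
	fmt.Println(byExt(".css")) // the extension lookup is what identifies CSS
}
```

Sniffing cannot tell a stylesheet from any other text file, which is why the extension table matters for serving files.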

Comment From: milhoan

Per conversation here https://github.com/whatwg/mimesniff/issues/51#issuecomment-2415555310, the intent of the Mimesniff spec is

"Based on the recent trajectory of changes to this spec, it seems to me that the scope of the spec is client-side sniffing for cross-browser compatibility and protection for the user against malicious files"

The Mimesniff spec is not an appropriate spec for an HTTP server use case. It would be better to adopt a different spec for this.

Alternatively, a new server-side-appropriate function that implements a different spec is needed. (EDIT: This comment was regarding DetectContentType, not TypeByExtension.)

Comment From: AidanWelch

@milhoan But as of now, this doesn't mimesniff. It just maps file extensions to mime types

Comment From: milhoan

Quoting AidanWelch: "@milhoan But as of now, this doesn't mimesniff. It just maps file extensions to mime types"

Sorry, I saw the discussion above about DetectContentType being based on that spec (IMO it should not be). Disregard my comment, as this issue is not about that function. I'm 100% in favor of more MIME type coverage for TypeByExtension.

Comment From: seankhliao

Looking at what the browsers do for matching file extensions to mime type:

Chromium: https://chromium.googlesource.com/chromium/src/+/master/net/base/mime_util.cc#129
Maintains a primary and secondary mapping, with the preference order being: primary, platform, secondary.

Firefox: https://searchfox.org/mozilla-central/source/uriloader/exthandler/nsExternalHelperAppService.cpp#2968 (list at https://searchfox.org/mozilla-central/source/uriloader/exthandler/nsExternalHelperAppService.cpp#455, const defs at https://searchfox.org/mozilla-central/source/netwerk/mime/nsMimeTypes.h)
Maintains a default and extra mapping, with the preference order being: default, platform, extras.
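As a rough sketch of that preference order (not the browsers' actual code), a layered lookup can be written as a walk over tables in priority order. The table type, the lookup function, and the platform entries are hypothetical; the other entries are taken from the browser lists for illustration.

```go
package main

import "fmt"

// table maps a lowercase extension (without the dot) to a MIME type.
type table map[string]string

// lookup consults each table in order and returns the first match.
func lookup(ext string, layers ...table) (string, bool) {
	for _, t := range layers {
		if typ, ok := t[ext]; ok {
			return typ, true
		}
	}
	return "", false
}

func main() {
	primary := table{"webp": "image/webp"}
	// Hypothetical platform database, e.g. an OS registry or mime.types file.
	platform := table{"js": "text/plain", "webp": "image/x-webp"}
	secondary := table{"js": "application/javascript", "zip": "application/zip"}

	for _, ext := range []string{"webp", "js", "zip", "xyz"} {
		typ, ok := lookup(ext, primary, platform, secondary)
		fmt.Println(ext, typ, ok)
	}
}
```

With this ordering, an extension that appears only in the secondary list takes the platform's value whenever the platform defines one, while primary entries always win.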

Below is a table mapping file extensions to Go MIME types, along with Chromium / Firefox inclusion in their primary (1) or secondary (2) lists and the browser's MIME type where it differs from what Go has.

extension go mime type chrome firefox
3g2 2 (video/3gpp2)
3gp 2 (video/3gpp)
3gpp 2 (video/3gpp)
aac 2 (audio/aac)
ai 2 (application/postscript) 2 (application/postscript)
apk 2 (application/vnd.android.package-archive) 2 (application/vnd.android.package-archive)
apng 1 (image/apng) 2 (image/apng)
appcache 2 (text/cache-manifest)
arj 2 (application/x-arj)
art 2 (image/x-jg)
avif image/avif 1 2
bin 2 (application/octet-stream) 2 (application/octet-stream)
bmp 2 (image/bmp) 2 (image/bmp)
cer 2 (application/x-x509-ca-cert)
com 2 (application/octet-stream) 2 (application/octet-stream)
crt 2 (application/x-x509-ca-cert)
crx 1 (application/x-chrome-extension)
css text/css 1 2
csv 1 (text/csv) 2 (text/csv)
cur 2 (image/x-icon)
doc 2 (application/msword) 2 (application/msword)
docx 2 (application/vnd.openxmlformats-officedocument.wordprocessingml.document) 2 (application/vnd.openxmlformats-officedocument.wordprocessingml.document)
dot 2 (application/msword)
ehtml 2 (text/html) 2 (text/html)
eml 2 (message/rfc822) 2 (message/rfc822)
eps 2 (application/postscript) 2 (application/postscript)
epub 2 (application/epub+zip)
exe 2 (application/octet-stream) 2 (application/octet-stream)
flac 1 (audio/flac) 2 (audio/flac)
ftl 1 (text/plain)
gif image/gif 1 2
gz 2 (application/x-gzip) 2 (application/gzip)
htm text/html 1 2
html text/html 1 2
ical 2 (text/calendar)
icalendar 2 (text/calendar)
ico 2 (image/vnd.microsoft.icon) 2 (image/x-icon)
ics 2 (text/calendar) 2 (text/calendar)
ifb 2 (text/calendar)
jfif 2 (image/jpeg) 2 (image/jpeg)
jpeg image/jpeg 1 2
jpg image/jpeg 1 2
js text/javascript 2 (application/javascript) 2 (application/x-javascript)
jsm 2 (application/x-javascript)
json application/json 2 2
jxl 2 (image/jxl)
locale 1 (text/plain)
m3u8 2 (application/x-mpegurl)
m4a 1 (audio/x-m4a) 2 (audio/mp4)
m4b 2 (audio/mp4)
m4v 1 (video/mp4)
mht 1 (multipart/related)
mhtml 1 (multipart/related)
mid 2 (audio/x-midi)
mjs text/javascript 1 2 (application/x-javascript)
mml 2 (application/mathml+xml)
mp2 2 (audio/mpeg)
mp3 1 (audio/mp3) 2 (audio/mpeg)
mp4 1 (video/mp4) 2 (video/mp4)
mpeg 2 (video/mpeg)
mpega 2 (audio/mpeg)
mpg 2 (video/mpeg)
odg 2 (application/vnd.oasis.opendocument.graphics)
odp 2 (application/vnd.oasis.opendocument.presentation)
ods 2 (application/vnd.oasis.opendocument.spreadsheet)
odt 2 (application/vnd.oasis.opendocument.text)
oga 1 (audio/ogg) 2 (audio/ogg)
ogg 1 (audio/ogg) 2 (application/ogg)
ogm 1 (video/ogg)
ogv 1 (video/ogg) 2 (video/ogg)
opus 1 (audio/ogg) 2 (audio/ogg)
p7c 2 (application/pkcs7-mime)
p7m 2 (application/pkcs7-mime)
p7s 2 (application/pkcs7-signature)
p7z 2 (application/pkcs7-mime)
pdf application/pdf 2 2
pjp 2 (image/jpeg) 2 (image/jpeg)
pjpeg 2 (image/jpeg) 2 (image/jpeg)
png image/png 2 (image/x-png) 2
ppt 2 (application/vnd.ms-powerpoint) 2 (application/vnd.ms-powerpoint)
pptx 2 (application/vnd.openxmlformats-officedocument.presentationml.presentation) 2 (application/vnd.openxmlformats-officedocument.presentationml.presentation)
properties 1 (text/plain)
ps 2 (application/postscript) 2 (application/postscript)
rdf 2 (application/rdf+xml) 2 (application/rdf+xml)
rss 2 (application/rss+xml)
rtf 2 (application/rtf) 2 (application/rtf)
sh 2 (text/x-sh)
shtm 1 (text/html)
shtml 1 (text/html) 2 (text/html)
svg image/svg+xml 1 2
svgz 1 (image/svg+xml)
swf 2 (application/x-shockwave-flash)
swl 2 (application/x-shockwave-flash)
tar 2 (application/x-tar)
text 2 (text/plain) 2 (text/plain)
tgz 2 (application/x-gzip)
tif 2 (image/tiff) 2 (image/tiff)
tiff 2 (image/tiff) 2 (image/tiff)
txt 2 (text/plain) 2 (text/plain)
vcard 2 (text/vcard)
vcf 2 (text/vcard)
vtt 2 (text/vtt) 2 (text/vtt)
wasm application/wasm 1 2
wav 1 (audio/wav) 2 (audio/x-wav)
weba 2 (audio/webm)
webm 1 (audio/webm) 2 (audio/webm)
webp image/webp 1 2
woff 2 (application/font-woff)
xbl 2 (text/xml) 2 (text/xml)
xbm 2 (image/x-xbitmap) 2 (image/x-xbitmap)
xht 1 (application/xhtml+xml) 2 (application/xhtml+xml)
xhtm 1 (application/xhtml+xml)
xhtml 1 (application/xhtml+xml) 2 (application/xhtml+xml)
xls 2 (application/vnd.ms-excel) 2 (application/vnd.ms-excel)
xlsx 2 (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet) 2 (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
xml text/xml 1 2
xpi 2 (application/x-xpinstall)
xsl 2 (text/xml) 2 (text/xml)
xslt 2 (text/xml)
xul 2 (application/vnd.mozilla.xul+xml)
yuv 2 (video/x-raw-yuv)
zip 2 (application/zip) 2 (application/zip)

Comment From: seankhliao

If we are to add more, I propose we limit it to what both browsers have decided to include in their built in lists.

Comment From: AidanWelch

That sounds good to me, I can update the PR if that is what's decided on

Comment From: neild

Interestingly, the one case where we override the platform value (on Windows, we ignore a registry entry mapping .js to text/plain) is one where Chrome and Firefox apparently prefer the platform setting.

Limiting our list of builtin mappings to what both Chrome and Firefox include seems reasonably principled. I'd support that.

Comment From: AidanWelch

Okay, my new commit implements the types supported by both. When there was disagreement I went with what IANA lists.

In the case of .wav, IANA lists nothing. However, RFC 2361 describes a standard for a MIME type using audio/vnd.wave. Chrome uses audio/wav, which is supported by the long-expired draft-ema-vpim-wav-00, which states:

RFC 2361, "WAVE and AVI Codec Registries," is an informational draft describing IANA namespaces for codecs registered in Microsoft's WAVE and AVI registries. Such codecs may be described in the following format: audio/vnd.wave; codec = [codec ID]. This format is not suited to the description of a wave file as defined in this document, as it does not indicate the format standard that audio/wav must adhere to for interoperability between messaging systems. On desktop-oriented messaging systems, audio/wav (rather than audio/vnd.wave) is the defacto standard.

Firefox uses audio/x-wav, which (similar to audio/wav) was used as an example in a few RFCs but never actually described as a standard. So I decided to go with audio/wav, even though it does not appear to be formally standardized.
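For illustration, the selection rule described above (keep what both browsers list, and settle disagreements by preferring a registered type) could be sketched like this. This is not the code in the CL; intersect, preferRegistered, and the tiny registered set are hypothetical stand-ins, and the sample entries are a small excerpt.

```go
package main

import (
	"fmt"
	"sort"
)

// intersect keeps the extensions present in both a and b. When the two
// tables disagree on the type, resolve picks the winner.
func intersect(a, b map[string]string, resolve func(x, y string) string) map[string]string {
	out := make(map[string]string)
	for ext, x := range a {
		y, ok := b[ext]
		if !ok {
			continue // present in only one table
		}
		if x != y {
			x = resolve(x, y)
		}
		out[ext] = x
	}
	return out
}

// registered is a hypothetical stand-in for a check against the IANA registry.
var registered = map[string]bool{"application/gzip": true, "text/csv": true}

// preferRegistered returns whichever type is registered, falling back to
// the first argument when neither is.
func preferRegistered(x, y string) string {
	if !registered[x] && registered[y] {
		return y
	}
	return x
}

func main() {
	chrome := map[string]string{"csv": "text/csv", "gz": "application/x-gzip",
		"wav": "audio/wav", "crx": "application/x-chrome-extension"}
	firefox := map[string]string{"csv": "text/csv", "gz": "application/gzip",
		"wav": "audio/x-wav", "xpi": "application/x-xpinstall"}

	merged := intersect(chrome, firefox, preferRegistered)
	exts := make([]string, 0, len(merged))
	for ext := range merged {
		exts = append(exts, ext)
	}
	sort.Strings(exts)
	for _, ext := range exts {
		fmt.Println(ext, merged[ext])
	}
}
```

Passing the Chrome table first means an unregistered tie such as .wav falls back to Chrome's value, matching the choice of audio/wav described above.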

Comment From: neild

To recap the proposal:

The mime package contains a built-in table mapping file extensions to MIME types. For example: ".png" maps to "image/png". This table is only used when the system MIME database is not present. The table currently contains 16 entries: https://cs.opensource.google/go/go/+/refs/tags/go1.23.4:src/mime/type.go;l=53

Chrome and Firefox both include similar built-in tables. The proposal is to add all entries present in both the Chrome and Firefox tables to the mime package. This expands the table to 64 entries. CL: https://go.dev/cl/614376

In the future, if Chrome and Firefox add new entries, we would follow suit.
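Until the built-in table grows, a program that cannot rely on a system MIME database can fill gaps itself with mime.AddExtensionType. A minimal sketch, where ensureType is a hypothetical helper name:

```go
package main

import (
	"fmt"
	"log"
	"mime"
)

// ensureType registers typ for ext only when no mapping exists yet, so
// entries from the system database or the built-in table are left alone.
// It returns the type that is in effect afterwards.
func ensureType(ext, typ string) (string, error) {
	if got := mime.TypeByExtension(ext); got != "" {
		return got, nil
	}
	if err := mime.AddExtensionType(ext, typ); err != nil {
		return "", err
	}
	return mime.TypeByExtension(ext), nil
}

func main() {
	typ, err := ensureType(".csv", "text/csv")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(typ)
}
```

The exact string printed depends on whether the host already maps .csv, which is the inconsistency across systems that a larger built-in table would reduce.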

Comment From: aclements

Do the Chrome or Firefox tables ever change in a way that isn't purely additive?

Do Chrome and/or Firefox have sources we could mechanically pull this from? There are other tables (e.g., tzdata) that we update as part of the release process and if we go this route for the MIME types, it would certainly be preferable to automate updating this table as well.
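If the table were regenerated mechanically, the first question could be checked at update time. A small sketch, assuming both snapshots have already been extracted into maps (the extraction step and the nonAdditive name are hypothetical):

```go
package main

import (
	"fmt"
	"sort"
)

// nonAdditive lists, in sorted order, the extensions whose mapping was
// removed or changed between old and next. An empty result means the
// update only added entries.
func nonAdditive(old, next map[string]string) []string {
	var changed []string
	for ext, typ := range old {
		if got, ok := next[ext]; !ok || got != typ {
			changed = append(changed, ext)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	// Illustrative snapshots only, not real upstream history.
	old := map[string]string{".gif": "image/gif", ".gz": "application/x-gzip"}
	next := map[string]string{".gif": "image/gif", ".gz": "application/gzip", ".csv": "text/csv"}
	fmt.Println(nonAdditive(old, next))
}
```

A release-time script could fail, or at least flag the change for review, whenever this list is non-empty.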