When creating a pandas Series/Index/DataFrame, I think we generally differentiate between passing a pandas object with object dtype and a numpy array with object dtype:

>>> pd.options.future.infer_string = True
>>> pd.Index(pd.Series(["foo", "bar", "baz"], dtype="object"))
Index(['foo', 'bar', 'baz'], dtype='object')
>>> pd.Index(np.array(["foo", "bar", "baz"], dtype="object"))
Index(['foo', 'bar', 'baz'], dtype='str')

So for pandas objects we preserve the dtype, while a numpy array of object dtype is essentially treated as a sequence of Python objects for which we infer the dtype (@jbrockmendel that's also your understanding?)

But for Categorical, the object dtype of a pandas input doesn't seem to be preserved:

>>> pd.options.future.infer_string = True
>>> pd.Categorical(pd.Series(["foo", "bar", "baz"], dtype="object"))
['foo', 'bar', 'baz']
Categories (3, str): [bar, baz, foo]   # <--- categories inferred as str

So we want to preserve the dtype for the categories here as well?
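
For anyone who wants to compare the construction paths side by side, here is a rough sketch. The helper name `constructor_dtypes` and the dictionary keys are made up for illustration. It deliberately does not set `future.infer_string` itself, so apart from the first entry (the pandas-object case, which should stay `object`) the reported dtypes will depend on the pandas version and on whether that option is enabled.

```python
import numpy as np
import pandas as pd

values = ["foo", "bar", "baz"]


def constructor_dtypes():
    """Return the resulting dtype (as a string) for each construction path.

    Hypothetical helper for illustration only; not part of pandas.
    """
    ser = pd.Series(values, dtype="object")  # pandas object with object dtype
    arr = np.array(values, dtype="object")   # numpy array with object dtype
    return {
        "Index(Series[object])": str(pd.Index(ser).dtype),
        "Index(ndarray[object])": str(pd.Index(arr).dtype),
        "Categorical(Series[object]).categories": str(
            pd.Categorical(ser).categories.dtype
        ),
        "Categorical(ndarray[object]).categories": str(
            pd.Categorical(arr).categories.dtype
        ),
    }


# Print each path next to the dtype it produced under the current settings.
for path, dtype in constructor_dtypes().items():
    print(f"{path}: {dtype}")
```

If the categories were preserved the same way as for `Index`, the third entry would presumably report `object` as well, matching the first.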

Comment From: jbrockmendel

(@jbrockmendel that's also your understanding?)

Yes.

So we want to preserve the dtype for the categories here as well?

Makes sense.