pd.concat accepts iterables that may contain Series, DataFrame, and None, where None values are simply ignored.
pd.concat([None, series, None, series, None]) # works, same as pd.concat([series, series])
pd.concat([None]) # raises ValueError
It would be great to deprecate/remove accepting Nones. This will help to better annotate concat: https://github.com/pandas-dev/pandas-stubs/issues/888
Pandas-stubs currently has to choose between either false positives (accept concat([None])) or false negatives (reject some concat(Iterable[Any/Unknown]) that could succeed).
Comment From: phofl
I am -1 on simply deprecating this because of typing reasons. Let's think about use cases before we make a decision. We definitely can't do this without a deprecation.
Comment From: twoertwein
I agree, it would be good first to understand which use case supporting None enables!
@Dr-Irv found a way to get it to type check without having to decide between false negatives/positives :)
Comment From: Multihuntr
If the function should accept None's in objs, then it seems most natural to me for it to return an empty DataFrame if objs contains only None's.
This is backwards compatible and would also resolve the typing issue.
A hypothetical use case for Iterable[DataFrame | None]: you have a list of files to load and parse and you want to put them into a single DataFrame, but not all of the filenames you've been given exist, or maybe parsing is allowed to fail. I could see someone implementing this as a function that returns DataFrame | None, and putting that into a list comprehension. So a situation where you just want to put everything you have into one table, and if everything you have is None, then your table is empty.
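A rough sketch of that use case, written so that it does not depend on concat accepting None at all. The names try_parse and concat_or_empty are hypothetical helpers made up for illustration, not pandas APIs, and in-memory strings stand in for files. The None filtering and the empty-table fallback are done explicitly by the caller:

```python
from __future__ import annotations

import io

import pandas as pd


def try_parse(text: str) -> pd.DataFrame | None:
    """Hypothetical parser: returns None when the input has nothing to parse."""
    try:
        return pd.read_csv(io.StringIO(text))
    except pd.errors.EmptyDataError:
        # read_csv raises EmptyDataError for input with no columns to parse.
        return None


def concat_or_empty(frames: list[pd.DataFrame | None]) -> pd.DataFrame:
    """Concatenate the non-None frames; fall back to an empty DataFrame."""
    present = [frame for frame in frames if frame is not None]
    if not present:
        # Every input was None, so the resulting table is simply empty.
        return pd.DataFrame()
    return pd.concat(present, ignore_index=True)


# The second source is empty and parses to None.
sources = ["a,b\n1,2\n", "", "a,b\n3,4\n"]
combined = concat_or_empty([try_parse(source) for source in sources])

# All inputs fail to parse, so the result is an empty DataFrame.
nothing = concat_or_empty([try_parse("")])
```

Filtering in the caller keeps the list passed to pd.concat free of None, so this pattern would keep working whether or not None support is eventually deprecated.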
Comment From: jbrockmendel
I was surprised to find this works. Looks like Nones are filtered out in _clean_keys_and_objs, which has a comment pointing to #1649 back in 2012. I don't see much conversation there about whether this should be allowed.
+1 on deprecating.