- [ ] With free-threading, could _convert_column_data be called in parallel for each column?
- [ ] (free-threading) For large files, split into chunks and parse in parallel, then concat?
- [ ] In a pyarrow-always-available world, could `_string_box_utf8` allocate a buffer+mask rather than ndarray[object]?
- [ ] #17743
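The chunk-and-parse-in-parallel idea above could look something like the sketch below. This is a user-level approximation, not the parser internals: it splits the data rows into slices, parses each slice with `pd.read_csv` on a worker thread, and concatenates the results. The helper name `read_csv_chunked_parallel` is hypothetical; under a free-threaded build the chunks could parse truly in parallel, while with the GIL this mostly overlaps I/O.

```python
import pandas as pd
from concurrent.futures import ThreadPoolExecutor

def read_csv_chunked_parallel(path, n_chunks=4):
    # Hypothetical sketch: parse a CSV in n_chunks slices concurrently,
    # then concatenate. Assumes a single header row.
    with open(path) as f:
        total = sum(1 for _ in f) - 1  # number of data rows
    per = -(-total // n_chunks)  # ceiling division: rows per chunk

    def read_chunk(i):
        # skiprows keeps the header (row 0) and skips earlier data rows,
        # so chunk i covers data rows [i*per, (i+1)*per).
        return pd.read_csv(
            path,
            skiprows=range(1, 1 + i * per),
            nrows=per,
        )

    with ThreadPoolExecutor(max_workers=n_chunks) as ex:
        parts = list(ex.map(read_chunk, range(n_chunks)))
    return pd.concat(parts, ignore_index=True)
```

One caveat for a real implementation: per-chunk dtype inference can disagree across chunks (e.g. an all-int slice next to a slice containing NaN), so the parser would need a reconciliation pass before concatenating.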
Anyone else have more ideas?
Comment From: samukweku
Is it possible to prefilter? @jbrockmendel
Comment From: jbrockmendel
Can you describe what you have in mind?
Comment From: samukweku
@jbrockmendel I mean applying a filter to the rows while reading the CSV, instead of returning the entire file. I'm thinking of what happens with parquet, where a user can return a subset of the data, filtered based on partitioning. CSV doesn't have partitioning, but maybe there is a different way to make that happen? Polars has something similar with scan_csv, and I believe duckdb's CSV reader has comparable capabilities.
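For comparison, here is roughly what a user can do today to approximate "filter while reading" with pandas alone, using the existing `chunksize` streaming interface. The helper name `read_csv_filtered` is made up for illustration; unlike parquet partition pruning or polars' scan_csv predicate pushdown, the parser still has to scan every row, but only the matching rows are kept in memory.

```python
import pandas as pd

def read_csv_filtered(path, predicate, chunksize=100_000):
    # Hypothetical sketch: stream the CSV in chunks and keep only rows
    # where predicate(chunk) is True, so the full file never lives in
    # memory at once. predicate takes a DataFrame and returns a boolean mask.
    kept = [
        chunk[predicate(chunk)]
        for chunk in pd.read_csv(path, chunksize=chunksize)
    ]
    return pd.concat(kept, ignore_index=True)
```

Usage would look like `read_csv_filtered("data.csv", lambda df: df["x"] > 5)`. True pushdown as in duckdb would go further by skipping field conversion for non-matching rows inside the parser itself.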