Describe the bug
The Parser::parse implementations for numeric types fail to parse strings that contain leading or trailing whitespace.
This comes up often when reading data from CSV files or other text-based sources, where values may be padded with spaces, tabs, or newline characters. Such inputs currently return None instead of parsing successfully.
To Reproduce
use arrow_array::types::*;
use arrow_cast::parse::Parser;
// Each call currently returns None instead of the parsed number.
assert_eq!(Float32Type::parse(" 1.5 "), None); // expected Some(1.5)
assert_eq!(Int32Type::parse(" 42 "), None); // expected Some(42)
assert_eq!(Int64Type::parse("\t100\n"), None); // expected Some(100)
assert_eq!(UInt64Type::parse(" 7 "), None); // expected Some(7)
Expected behavior
Numeric parsers should ignore leading and trailing whitespace before parsing. For example, " 42 " should parse successfully to Some(42) rather than returning None.
This behavior is consistent with how many data ingestion systems handle text-to-number conversion.
Additional context
The issue originates in arrow-cast/src/parse.rs. The float parsers pass string.as_bytes() directly to lexical_core::parse, and the parser_primitive! macro (used for integers and durations) similarly operates on the input without trimming.
A simple fix would be to call .trim() on the input string before attempting to parse.
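To illustrate the idea, here is a minimal sketch of trimming before parsing. It is not the arrow-cast implementation: parse_trimmed is a hypothetical helper, and it uses the standard library's FromStr as a stand-in for lexical_core::parse and the parser_primitive! macro, so that the example has no external dependencies.

```rust
use std::str::FromStr;

/// Hypothetical helper (not part of arrow-cast): strips leading and
/// trailing whitespace, then defers to the standard library parser.
/// Returns None if the trimmed input is not a valid number.
fn parse_trimmed<T: FromStr>(input: &str) -> Option<T> {
    // str::trim removes Unicode whitespace, including spaces,
    // tabs, and newlines, from both ends of the string.
    input.trim().parse::<T>().ok()
}
```

With this helper, parse_trimmed::<i32>(" 42 ") yields Some(42) and parse_trimmed::<i64>("\t100\n") yields Some(100), while whitespace-only input such as "   " still yields None. Whitespace inside the value (for example "4 2") is not removed and continues to fail, which is probably the desired behavior.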