Add documentation for types: scalars, lists, and NumPy arrays #81

Draft
PiotrPich2024 wants to merge 2 commits into main from types_page

@PiotrPich2024

Added documentation about the types that can be passed and how they are converted into C variables.
Added a warning to the documentation about passing floating-point numbers of the wrong precision.


# Types: scalars, lists, NumPy arrays (and float32 precision)

Most `pyamtrack` numerical functions are backed by C/C++ code and exposed through **nanobind**.

Contributor

Info about nanobind is not relevant to physicists (the users).

pyamtrack.converters.beta_from_energy(150.0) # float OK
```

Many wrapped functions internally convert scalar inputs to C++ `double` using `nb::cast<double>(...)`.

Contributor

A description of how it works under the hood should go here: https://github.com/libamtrack/pyamtrack/tree/master/docs

In these docs we write in language suitable for users.


## 2. Lists (vectorized “element-wise” calls)

Many multi-argument functions are wrapped so that you can pass **Python lists** and get **vectorized** results.

Contributor

Single-argument functions as well. I'd just say "functions" here.
Users are not familiar with the word "vectorized".

Say it in simple words: you pass a list in, and you get a list out. There is no need to write a "for loop" manually.

You can start with a "for loop" example and then show how it could be replaced.

You can also use the analogy of a simple NumPy function, like np.sqrt, which calculates the square root of many numbers without a for loop.

You can also say that we were inspired by this design.
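
A minimal sketch of the loop-versus-list idea suggested above, using NumPy's own `np.sqrt` as the analogy. It is not pyamtrack code; the variable names are illustrative only.

```python
import math

import numpy as np

values = [1.0, 4.0, 9.0]

# Explicit "for loop": one call per number.
roots_loop = []
for v in values:
    roots_loop.append(math.sqrt(v))

# Same calculation without a loop: pass the whole list at once.
roots_all = np.sqrt(values)

print(roots_loop)  # [1.0, 2.0, 3.0]
print(roots_all)   # [1. 2. 3.]
```

The docs page could open with the loop version and then show the one-line replacement, which is the pattern the comment asks for.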

```

### Broadcasting scalars against lists/arrays
If at least one argument is list/array, scalar arguments are broadcast to match the vector length.

Contributor

Users don't know what broadcasting means. Instead, describe what happens when arguments are passed as lists or arrays of various lengths.
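
For reference, a small sketch of how plain NumPy treats a single number combined with a list, and two lists of different lengths. This shows NumPy's behaviour only; whether every pyamtrack function follows the same rule is an assumption that the docs page should confirm.

```python
import numpy as np

energies = [50.0, 100.0, 150.0]

# A single number is reused for every element of the list.
shifted = np.add(energies, 10.0)
print(shifted)  # [ 60. 110. 160.]

# Two lists of different lengths cannot be paired up element by element.
mismatch_error = None
try:
    np.add([1.0, 2.0, 3.0], [1.0, 2.0])
except ValueError as err:
    mismatch_error = err

print(mismatch_error is not None)  # True
```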

Comment on lines +99 to +107
Some wrappers accept **0-D** or **1-D** arrays (for vectorized evaluation).
For certain vectorized wrappers, NumPy array inputs are required to be **1-D** (otherwise a `ValueError` is raised).

If you have higher-dimensional arrays, flatten them explicitly if that matches your intent:

```python
x = np.asarray(x)
x1d = x.reshape(-1) # or x.ravel()
```

Contributor

Say that in simpler words. Mention "single numbers" or "scalars". A scalar is the opposite of a vector.
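
A short NumPy-only sketch of the terms involved (single number, 1-D array, 2-D array) and of the flattening step quoted above. The variable names are made up for illustration.

```python
import numpy as np

single = 150.0                                    # a scalar: one number
vector = np.array([50.0, 100.0, 150.0])           # 1-D: a row of numbers
grid = np.array([[50.0, 100.0], [150.0, 200.0]])  # 2-D: a table of numbers

print(np.ndim(single))  # 0
print(vector.ndim)      # 1
print(grid.ndim)        # 2

# Flatten the table into one row, reading it row by row.
flat = grid.ravel()
print(flat.shape)       # (4,)
```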

Contributor

@grzanka left a comment

see comments

## 2. Lists (vectorized “element-wise” calls)

Many multi-argument functions are wrapped so that you can pass **Python lists** and get **vectorized** results.
The wrapper checks whether an argument is a Python `list` and then applies the computation element-by-element.

Contributor

Users may expect that if they pass a list, they get a list back. Will they get back a list or a NumPy array?
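
To illustrate the question with NumPy itself (not pyamtrack, whose return type is exactly what is being asked here): `np.sqrt` returns an array even when given a list, and `.tolist()` converts the result back.

```python
import numpy as np

inputs = [1.0, 4.0, 9.0]

# NumPy accepts the list but returns an array, not a list.
result = np.sqrt(inputs)
print(type(result).__name__)  # ndarray

# Convert back to a plain Python list when one is needed.
as_list = result.tolist()
print(as_list)  # [1.0, 2.0, 3.0]
```

If pyamtrack behaves the same way, the docs could mention `.tolist()` for users who need a plain list.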


`pyamtrack` is a Python interface to the C/C++ **libamtrack** library. Many functions accept either a **single number** or a **set of numbers** (a Python list or a NumPy array). Internally, most continuous physical quantities are computed using C/C++ **double precision** (`double`).

This page explains what you can pass to `pyamtrack` functions and how to avoid the most common numerical pitfalls—especially when using `numpy.float32`.

Contributor

I wouldn't expose numpy.float32 so prominently; rather, put it at the end of the page under a separate level-1 heading.


# Input types (Python / NumPy) and numerical precision

`pyamtrack` is a Python interface to the C/C++ **libamtrack** library. Many functions accept either a **single number** or a **set of numbers** (a Python list or a NumPy array). Internally, most continuous physical quantities are computed using C/C++ **double precision** (`double`).

Contributor

"Set of numbers": a set is something that is not ordered, so I would rather say "vector" or "matrix".

Contributor

I would also avoid mentioning double precision here.

What I'd expect here is a few sentences like "The aim of this page is to familiarize the user with the input/output datatypes for functions available in libamtrack...."


If you only read one section, read this:

- For **single values** (energy, LET, etc.): use Python `float`

Contributor

Scientists like exponential notation, so mentioning a number like 1e-3 would be nice as well.
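
A tiny example that could accompany this point: numbers written in exponential notation are ordinary Python `float` values. The variable names are illustrative.

```python
small = 1e-3    # same value as 0.001
energy = 1.5e2  # same value as 150.0

print(small == 0.001)        # True
print(energy == 150.0)       # True
print(type(small).__name__)  # float
```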


- For **single values** (energy, LET, etc.): use Python `float`
(example: `150.0`, not `np.float32(150)`).
- For **arrays of values**: use NumPy arrays with `dtype=np.float64`.

Contributor

See:

>>> np.array([10.0, 15.]).dtype
dtype('float64')

By default we get float64.

It's safe to recommend np.array and give such an example.
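
A slightly longer version of the check above, as a sketch: it shows the default dtype, a list mixing an integer with a float, and how an existing `float32` array can be converted. Only standard NumPy calls are used.

```python
import numpy as np

# Floating-point literals give float64 by default.
default = np.array([10.0, 15.0])
print(default.dtype)  # float64

# Mixing an integer with a float still gives float64.
mixed = np.array([10, 15.5])
print(mixed.dtype)    # float64

# float32 only appears when requested; astype converts it to float64.
narrow = np.array([10.0, 15.0], dtype=np.float32)
widened = narrow.astype(np.float64)
print(narrow.dtype, widened.dtype)  # float32 float64
```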

(example: `150.0`, not `np.float32(150)`).
- For **arrays of values**: use NumPy arrays with `dtype=np.float64`.
- For **IDs** (material IDs, model IDs): use Python `int` or integer NumPy arrays (`np.int32` / `np.int64`).
- Avoid `numpy.float32` / `dtype=np.float32` for inputs to physics calculations unless you really know you can tolerate reduced precision.

Contributor

That should go at the bottom of the page.
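
For the float32 section at the bottom of the page, a NumPy-only sketch of the precision loss might help. It does not call pyamtrack; it only shows what happens to the number before any function receives it.

```python
import numpy as np

x32 = np.float32(0.1)

# Widening to a Python float (double precision) does not restore lost digits.
print(float(x32))         # 0.10000000149011612
print(float(x32) == 0.1)  # False

# Approximate number of reliable decimal digits for each type.
print(np.finfo(np.float32).precision)  # 6
print(np.finfo(np.float64).precision)  # 15
```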

import numpy as np
import pyamtrack

energies = np.asarray([50.0, 100.0, 150.0], dtype=np.float64)

Contributor

I would recommend a NumPy array when plotting is needed and you need hundreds of values, like:

energies_MeV = np.linspace(start=10, stop=1000, num=500)

Using a NumPy array for just a few numbers is weird.

You can mention plotting; it's a frequent use case.
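
A quick sketch of what the suggested `np.linspace` call produces, in case the docs page wants to show it. The plotting step itself is left out; the array would simply be passed to the function of interest and then to the plotting library.

```python
import numpy as np

# 500 evenly spaced energies from 10 to 1000 (both ends included).
energies_MeV = np.linspace(start=10, stop=1000, num=500)

print(energies_MeV.shape)  # (500,)
print(energies_MeV.dtype)  # float64
print(energies_MeV[0])     # 10.0
print(energies_MeV[-1])    # 1000.0
```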


---

## 2) Continuous values vs IDs (different “kinds” of inputs)

Contributor

Continuous vs. discrete?

import pyamtrack

energies = np.asarray([50.0, 100.0, 150.0], dtype=np.float64)
materials = np.asarray([1, 1, 1], dtype=np.int32)

Contributor

Here, for materials, I'd use a simple list, not an array.

I'd also use the opportunity to tell the user that types can be mixed, e.g. one argument being an np.array, another a list, and another a string.

energies = np.asarray([50.0, 100.0, 150.0], dtype=np.float64)
materials = np.asarray([1, 1, 1], dtype=np.int32)

pyamtrack.stopping.electron_range(energies, material=materials, model="tabata")

Contributor

Do we support strings as the model?

Many functions behave like this:

- If you pass a **single value**, you get a **single value** back (Python `float`).
- If you pass a **list** or **NumPy array**, you get a **vector of results** back (often a NumPy array).

Contributor

Always a NumPy array?

Contributor

@grzanka left a comment

see comments
