Merged
2 changes: 1 addition & 1 deletion episodes/files/snakeviz-worked-example/example.py
@@ -2,7 +2,7 @@

"""
This is a synthetic program intended to produce a clear profile with cProfile/snakeviz
- Method names, constructed from a hex digit and a number clearly denote their position in the hierarchy.
+ Function names, constructed from a hex digit and a number clearly denote their position in the hierarchy.
"""
def a_1():
    for i in range(3):
2 changes: 1 addition & 1 deletion episodes/optimisation-introduction.md
@@ -174,7 +174,7 @@ You may already have a different testing workflow in-place for validating the co

::: instructor

- * Fixtures: A test fixture is a common class which multiple tests can inherit from. This class will typically include methods that perform common initialisation and teardown actions around the behaviour to be tested. This reduces repeated code.
+ * Fixtures: A test fixture is a common class which multiple tests can inherit from. This class will typically include functions that perform common initialisation and teardown actions around the behaviour to be tested. This reduces repeated code.
* Mocking: If you wish to test a feature which relies on a live or temperamental service, such as making API calls to a website, you can mock that API, so that when the test runs, synthetic responses are produced rather than the real API being used.
* Test skipping: You may have configurations of your software that cause certain tests to be unsupported. Skipping allows conditions to be added to tests, to decide whether they should be executed or skipped.
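
The three ideas above can be sketched with the standard library's `unittest` module. This is only an illustration under assumptions: the lesson may use a different framework, and `WeatherService`, `describe_weather`, the test classes, and the skip condition are hypothetical names invented for this example.

```python
import sys
import unittest
from unittest import mock


class WeatherService:
    """Hypothetical client for a live (and temperamental) web API."""

    def fetch_temperature(self, city):
        raise RuntimeError("live service is not available during tests")


def describe_weather(service, city):
    """Code under test: depends on the service's response."""
    return "warm" if service.fetch_temperature(city) >= 20 else "cold"


class WeatherFixture(unittest.TestCase):
    """Fixture: common initialisation and teardown inherited by test classes."""

    def setUp(self):
        self.city = "Sheffield"
        # Mocking: a stand-in object with the same interface as the real service.
        self.service = mock.Mock(spec=WeatherService)

    def tearDown(self):
        self.service = None


class TestWeather(WeatherFixture):
    def test_warm(self):
        # The mock returns a synthetic response instead of calling the API.
        self.service.fetch_temperature.return_value = 25
        self.assertEqual(describe_weather(self.service, self.city), "warm")
        self.service.fetch_temperature.assert_called_once_with(self.city)

    # Test skipping: the condition here is purely illustrative.
    @unittest.skipIf(sys.platform == "win32", "hypothetical: unsupported on Windows")
    def test_cold(self):
        self.service.fetch_temperature.return_value = 5
        self.assertEqual(describe_weather(self.service, self.city), "cold")


def run_weather_tests():
    """Run the example tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestWeather)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_weather_tests()` executes both tests without ever contacting a real service; on a platform matching the skip condition, `test_cold` is reported as skipped rather than failed.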

17 changes: 8 additions & 9 deletions episodes/optimisation-numpy.md
@@ -26,9 +26,8 @@ Packages like NumPy and Pandas work similarly: They have been written in compile

## Using NumPy (Effectively)

- [NumPy](https://numpy.org/) is a commonly used package for scientific computing, which provides a wide variety of methods.
-
- It adds restriction via its own [basic numeric types](https://numpy.org/doc/stable/user/basics.types.html) and static arrays to enable even greater performance than that of core Python. However if these restrictions are ignored, the performance can become significantly worse.
+ [NumPy](https://numpy.org/) is a commonly used package for scientific computing. It provides a wide variety of functions, as well as its own [basic numeric types](https://numpy.org/doc/stable/user/basics.types.html) and static arrays to enable even greater performance than that of core Python.
+ However, these more specialised types come with restrictions, and if these restrictions are ignored, the performance can become significantly worse.
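
As a small illustration of what these restrictions look like in practice, the sketch below shows two of them: an array has a single fixed element type, and it has a fixed size. The variable names and values are assumptions chosen for this example, not taken from the lesson.

```python
import numpy as np

# The element type is fixed when the array is created (here: an integer type).
values = np.array([1, 2, 3])

# Assigning a float does not change the array's type; the value is
# converted to an integer and the fractional part is discarded.
values[0] = 7.9

# Arrays are static: "appending" builds a whole new array and copies the
# existing data, leaving the original array unchanged.
grown = np.append(values, 4)

print(values, values.dtype)
print(grown, grown is values)
```

Because every append copies the existing data, growing an array one element at a time inside a loop is one way to ignore the restrictions and lose the performance benefit.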

![Illustration of a NumPy array and a Python list.](episodes/fig/numpy-array-vs-list.png){alt="A diagram illustrating the difference between a NumPy array and a Python list. The NumPy array is a raw block of memory containing numerical values. A Python list contains a header with metadata and multiple items, each of which is a reference to another Python object with its own header and value."}
<!-- Figure inspired by https://jakevdp.github.io/blog/2014/05/09/why-python-is-slow/#3.-Python's-object-model-can-lead-to-inefficient-memory-access -->
@@ -55,7 +54,7 @@ def array_resize():
    for i in range(1, N):
        ar.resize(i+1)
        ar[i] = i

repeats = 1000
print(f"list_append: {timeit(list_append, number=repeats):.2f}ms")
print(f"array_resize: {timeit(array_resize, number=repeats):.2f}ms")
@@ -201,7 +200,7 @@ In contrast, numbers in a Python list [are spread across memory in a fairly comp

::::::::::::::::::::::::::::::::::::::::::::::::

- Earlier it was demonstrated that using core Python methods over a list will outperform a loop, performing the same calculation faster. The below example takes this a step further by demonstrating the calculation of a dot product.
+ Earlier it was demonstrated that using core Python functions on a list will outperform a loop, performing the same calculation faster. The below example takes this a step further by demonstrating the calculation of a dot product.
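
For orientation, a pure-Python sketch of the loop-versus-built-in comparison for a dot product might look like the following. The function names, input size, and repeat count are assumptions made for this illustration and are not the lesson's benchmark; timings will vary between machines.

```python
from operator import mul
from timeit import timeit

N = 10_000
a = list(range(N))
b = list(range(N))


def dot_loop():
    # Explicit loop: every index, multiply, and add is interpreted bytecode.
    total = 0
    for i in range(N):
        total += a[i] * b[i]
    return total


def dot_builtin():
    # Core Python functions: the iteration is driven by sum() and map().
    return sum(map(mul, a, b))


# Both approaches must agree before their speed is compared.
assert dot_loop() == dot_builtin()

repeats = 100
print(f"dot_loop:    {timeit(dot_loop, number=repeats):.3f}s")
print(f"dot_builtin: {timeit(dot_builtin, number=repeats):.3f}s")
```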

<!-- Inspired by High Performance Python Chapter 6 example
Added Python sum array, skipped a couple of others-->
@@ -376,11 +375,11 @@ Similar to NumPy, Pandas enables greater performance than pure Python implementa

### Operating on Rows

- Pandas' methods by default operate on columns. Each column or series can be thought of as a NumPy array, highly suitable for vectorisation.
+ Pandas' functions by default operate on columns. Each column or series can be thought of as a NumPy array, highly suitable for vectorisation.

Following the theme of this episode, iterating over the rows of a data frame using a `for` loop is not advised. The pythonic iteration will be slower than other approaches.

- Pandas allows its own methods to be applied to rows in many cases by passing `axis=1`, where available these functions should be preferred over manual loops. Where you can't find a suitable method, `apply()` can be used, which is similar to `map()`, to apply your own function to rows.
+ Pandas allows its own functions to be applied to rows in many cases by passing `axis=1`, where available these functions should be preferred over manual loops. Where you can't find a suitable function, `apply()` can be used, which is similar to `map()`, to apply your own function to rows.
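
A minimal sketch of the two row-wise options described above, assuming a small data frame with the `f_vertical` and `f_horizontal` columns used later in this episode; the sample values and the `pythagoras` helper are made up for illustration.

```python
import math

import pandas as pd

df = pd.DataFrame({
    "f_vertical": [3.0, 6.0, 5.0],
    "f_horizontal": [4.0, 8.0, 12.0],
})

# A built-in Pandas function applied across each row by passing axis=1.
row_sum = df.sum(axis=1)


def pythagoras(row):
    # 'row' is a Series holding one row, indexed by column name.
    return math.sqrt(row["f_vertical"] ** 2 + row["f_horizontal"] ** 2)


# apply() with axis=1 calls your own function once per row.
magnitude = df.apply(pythagoras, axis=1)

print(row_sum.tolist())    # [7.0, 14.0, 17.0]
print(magnitude.tolist())  # [5.0, 10.0, 13.0]
```

Note that `apply()` still calls a Python function for every row, so it mainly saves you from writing the loop yourself.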

```python
from timeit import timeit
@@ -449,9 +448,9 @@ def vectorize():
    vertical = df["f_vertical"]
    horizontal = df["f_horizontal"]

-     # Your code goes here
+     result = ... # Your code goes here

-     return pandas.Series(results)
+     return pandas.Series(result)
```

Once you’ve done that, measure your performance by running
6 changes: 3 additions & 3 deletions episodes/profiling-conclusion.md
@@ -33,11 +33,11 @@ Why programmers can benefit from profiling:

- Narrows down the costly areas of code, allowing optimisation to be prioritised or decided to be unnecessary.

- When to Profile:
+ When to profile:

- Profiling should be performed on functional code, either when concerned about performance or prior to release/deployment.

- What to Profile:
+ What to profile:

- The collection of profiling metrics will often slow the execution of code, therefore the test-case should be narrow whilst remaining representative of a realistic run.

@@ -49,7 +49,7 @@ How to function-level profile:
How to line-level profile:

- Import `profile` from `line_profiler`
- - Decorate targeted methods with `@profile`
+ - Decorate targeted functions with `@profile`
- Execute `line_profiler` via `python -m kernprof -lvr <script name> <arguments>`
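
The line-level steps above might look like this in a script. It is a sketch under assumptions: `sum_of_squares` is a hypothetical function to profile, and the `try`/`except` fallback is an addition for this example so the script still runs where `line_profiler` is not installed.

```python
try:
    # Provided by recent releases of the line_profiler package.
    from line_profiler import profile
except ImportError:
    def profile(func):
        """Fallback no-op decorator (an assumption for this sketch)."""
        return func


@profile
def sum_of_squares(n):
    # Each line in this function is timed when run under kernprof.
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    print(sum_of_squares(1000))
```

Saved as a script, this would be run with the `kernprof` command listed above, passing the script's filename in place of `<script name>`.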

::::::::::::::::::::::::::::::::::::::::::::::::