Skip to content

Reduce dp-frontend-router's HOT-PATH log.Event() memory allocations by upto 31% (and all the way to 100%)#6

Open
redhug1 wants to merge 36 commits into
ONSdigital:masterfrom
redhug1:test/optimize-allocations
Open

Reduce dp-frontend-router's HOT-PATH log.Event() memory allocations by upto 31% (and all the way to 100%)#6
redhug1 wants to merge 36 commits into
ONSdigital:masterfrom
redhug1:test/optimize-allocations

Conversation

@redhug1
Copy link
Copy Markdown
Contributor

@redhug1 redhug1 commented Jul 13, 2020

Reduce log.Event() memory total allocations and total memory allocated to reduce pressure on garbage collector.

dp-frontend-router calls THREE log.Event() on its HAPPY HOT-PATH and when it is under heavy load can allocate
a LOT of memory.

These changes, when simulated and measured before and after the changes with BenchmarkLog4() for
a HAPPY PATH, HOT-PATH sequence of three log.Event()'s for the longest URL path that gets reversed proxied
shows a saving of up to 31% for the log.Event()'s impact on HEAP allocations per request.

See the file memory-allocations.txt for further notes on running the Benchmark.

'SpanID' has been removed, saving 100 bytes within the total bytes saved by these changes. It does not appear to be used in any of the ONS repositories.

redhug1 added 30 commits May 3, 2020 10:37
Try alternative struct{} code to see if any faster than o.attach(&e)
Add Benchmarking code into log_test.go
See comments above BenchmarkLog() for how to run this function when only this benchmark is uncommented.

See comments above BenchmarkLog2() for how to run this function and BenchmarkLog3() when only these two benchmarks are uncommented.

Results are commented in code, almost identical execution speed for "switch v := o.(type) {...}" compared to "o.attach(&e)" ... so leave code as original.

-=-=-
Add folder "main_assembler" to produce assembler code to compare what is produced for "switch v := o.(type) {...}" compared to "o.attach(&e)"
See comments in "main.go" (in folder main_assembler) for how to build and added comments on observations of assembler code.

-=-=-
Added main.go at top level of folders to have minimal amount of code to exercise log.Event() and aid in creating benchmark code.
See file "runtime-stack.txt" in folder 'main_assembler' for an updated analysis and explanation of why in this code using the switch{} code as opposed to the o.attach(&e) is best for high performance server code.
Investigate memory allocations with a view to reducing them.
Fix the setting of 'ctx' to get the trace_id in the output event.
Use a function to 'cast' a byte array into a string pointer which saves an allocation.
This reduces allocations from 5772 to 4073 bytes - a saving of ~29.5%
Add extra decodes within createEvent() and test code.
Use my createEvent() code in log.go
Remove 'SpanID' saved about 100 bytes (ass its not used in whole of ONS repo's).
Update notes in memory-allocations.txt, see info starting 13th July in file.
Do direct io.Write() instead of fmt.Fprintln() to save on an expensive allocation that the original code did with the piece of code: string(b)
Update Test and Benchmark code.
Remove 'SpanID' saved about 100 bytes (as its not used in whole of ONS repo's).
Update notes in memory-allocations.txt, see info starting 13th July in file.
Do direct io.Write() instead of fmt.Fprintln() to save on an expensive allocation that the original code did with the piece of code: string(b)
Update Test and Benchmark code.
See file memory-allocations.txt, log entry for 14th July (Stage 1) for full details.
New code inlines and unrolls some of the existing code.
New code puts back previous changes to test code and changes in log.go, such that the changes are more confined.
New code uses a smaller EventData structure to save every byte.
Make sure you read the log entry.
Utilising a sync.Pool to reduce memory allocations by replacing json.NewEncoder().encode() with inlined and unrolled code.
See new notes for 15th July in file: memory-allocations.txt for log of actions and results.
This code checked in has more todo, as flaged by '!!!' marker comments.
Pick out the elements of 'Data' structure for direct writing into buffer.
Remove line feed that gets added when json.NewEncoder(buf).Encode() is called.
See memory-allocations.txt for more details.
Pick out the elements of 'EventHTTP' structure for direct writing into buffer.
See memory-allocations.txt for more details.
Enhance Benchmark 4 and 5 to add "URL.Query()"  in a similar manner to the gorilla library to add in the path that is reverse proxied to (that is, as best as can be for test purposes).
Rebasing the original bytes to save from after adding the "query" field now increases the amount saved, together with replacing a few strconv functions with optimized ones that write directly to the sync.Pool buffer.
See memory-allocations.txt for more details.
Rebasing the original bytes to save from after adding the "query" field now increases the amount saved, together with replacing a few strconv functions with optimized ones that write directly to the sync.Pool buffer.
See memory-allocations.txt for more details.
Added BenchmarkLog7() that demonstrates changes needed to get to 91.6% savings of HEAP allocations.
For a discussion on how to get then get from 91.6% to 100% savings, see memory-allocations.txt   ...  Stage 6 section.
See memory-allocations.txt, log entry at 18th July 2020 for full write up of tests and results - in particular the sections: Observations, Action.
Added environment variable 'MINIMUM_ALLOC' to use new log.go code via command line.
Fix small bug in new code implementation.
Added test result spreadsheet and plot PDF's.
Add some commented out code for possibly more work.
There is more work to be done to complete this new function.
For details see memory-allocations.txt for log entry for 19th July 2020
See memory-allocations.txt for details of old and new events created and captured.
Code cleaned up and improved.
The original TestLog() was not restoring a few things it mocked, which then broke new test function TestLogNew1() when ALL tests run.
+ increase initial size of buffer's to be large enough to handle most requests before doing more alloc's in test code.
Change function names from expand... to unroll... to better reflect what they do.
Move new code to separate implementation, test and benchmarking files.
Also rename the sync.Pool used to : eventBufPool
Added saving and restoring of 'Namespace'.
Added extra parameter to event setup in TestLogNew1eventAll().
redhug1 added 6 commits July 22, 2020 14:50
Code testing on local computer works, but last check in failed github/CI test, so added cast.
The original test code is complaining when receiving an Event() with all fields populated ... so don't do that particular test
Travis uses go1.12 and my testing is using go1.14.6 and the compilers do not initialise the "test.v" flag in the same order which stopped my new test code that more fully tests Event() did not end up calling the eventWithOptionsCheck().
When adjustments made to fix this, another problem came to light with the existing code which called eventWithOptionsCheck() directly and that works, BUT calling Event() caused a panic.
The fix is to add a check to see if what is passed to eventWithOptionsCheck() is a pointer.
dp-frontend-router's HAPPY HOT-PATH has 3 log Events that contain replicated data in the 2nd and 3rd Events.
This replication has been removed, and in the test code, this saves 690 bytes out of 1814 - a reduction of 38%
Possibly more could be saved - thats a ONS decision.
See document log_line_length_Optimization-2.odt for ideas.
Add CustomSaveMoney functions to eliminate ALL HEAP Allocations from the 3 log functions that are on the HAPPY HOT-PATH of dp-frontend-router.
Add Benchmarking and Test code for new functions.
The FINAL code with logging ~38% less runs over twice as fast.
See notes for 4th Aug 2020 in memory-allocations.txt for details.
@redhug1 redhug1 changed the title Reduce dp-frontend-router's HOT-PATH log.Event() memory allocations by upto 31% Reduce dp-frontend-router's HOT-PATH log.Event() memory allocations by upto 31% (and all the way to 100%) Sep 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant