Skip to content

Updated logs to match Apache HTTP Server Common Log Format style#22

Merged
douxxtech merged 9 commits intomainfrom
common-log-format
Mar 24, 2026
Merged

Updated logs to match Apache HTTP Server Common Log Format style#22
douxxtech merged 9 commits intomainfrom
common-log-format

Conversation

@douxxtech
Copy link
Owner

@douxxtech douxxtech commented Mar 24, 2026

Summary by CodeRabbit

  • New Features

    • Added Apache-style Combined Log Format (extended) logging including User-Agent and Referer fields.
    • Server now captures User-Agent and Referer from requests for inclusion in logs.
  • Bug Fixes

    • Prevented leaking previous username in 401 Unauthorized responses.
  • Documentation

    • README updated to list Combined Log Format support and note that logging to file is planned.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 24, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

Adds two header-parsing macros, relocates timestamp buffers, introduces an Apache Combined Log Format Extended logger, parses User-Agent/Referer at shutdown and emits CLFE logs; README updated to advertise CLF support and that file logging is not yet supported.

Changes

Cohort / File(s) Summary
Logging Macros & Constants
macros/logutils.asm
Moved timestamp buffers from .data to .bss; added CLFE formatting constants and placeholders; replaced old LOG_REQUEST with new LOG_REQUEST_CLFE that builds an Apache-style combined log line, using clock_gettimelocaltime_rstrftime, status and size formatting, and quoted referer/user-agent fields.
Header Parsing Utilities
macros/httputils.asm
Added PARSE_UA_HEADER and PARSE_REFERER_HEADER NASM macros to scan an input buffer for the respective header names and copy their values into provided output buffers with bounds and CR/LF termination handling.
Runtime Integration
program.asm
Allocated .bss buffers user_agent and referer (1024+1); on connection shutdown calls the new parsing macros and emits logs via LOG_REQUEST_CLFE; removed prior LOG_REQUEST invocation and clears username before sending 401 responses.
Documentation
README.md
Updated "What it supports" to list Apache-HTTP-Server-like combined-format (Common Log Format) per-request logging; added a "What it does NOT support" note stating "Logging to file" is planned but not supported.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Server
    participant Parser as "PARSE_UA/REFERER"
    participant Time as "clock_gettime/localtime_r/strftime"
    participant Logger as "LOG_REQUEST_CLFE"
    Client->>Server: send HTTP request (headers + request line)
    Server->>Parser: scan request buffer for Referer/User-Agent
    Parser-->>Server: populate `referer`, `user_agent` buffers
    Server->>Time: request localized timestamp
    Time-->>Server: return formatted timestamp
    Server->>Logger: supply IP, user, time, request, status, size, referer, user-agent
    Logger->>Server: emit single CLFE log line to stdout
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰
I nibbled headers in the night,
Found Referer, User-Agent—just right.
I stitched a CLF line, tidy and neat,
A hop, a log, a crunchy beat. 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and accurately summarizes the main change: converting the logging format to Apache HTTP Server Common Log Format style, which is the primary focus across all modified files.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link

START + GET /: 148 syscalls
Fastest: 0.000017s (arch_prctl)
Slowest: 0.849558s (accept)
threads: 4 (auto), conns: 50, duration: 20s

[wrk]
Running 20s test @ http://127.0.0.1:8080/
  4 threads and 50 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.40ms    9.09ms 215.96ms   99.47%
    Req/Sec     1.53k   335.61     2.54k    71.88%
  122113 requests in 20.04s, 186.91MB read
  Socket errors: connect 0, read 128, write 0, timeout 0
Requests/sec:   6094.34
Transfer/sec:      9.33MB

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@macros/httputils.asm`:
- Around line 357-437: PARSE_UA_HEADER scans the entire buffer and only matches
exact "User-Agent" casing, so fix it to stop scanning at the end-of-headers
marker and perform a case-insensitive field-name compare: in the PARSE_UA_HEADER
macro (and the analogous Referer macro), change the scan loop to break when you
detect the header terminator sequence CR LF CR LF (0x0d0a0d0a) so you don't
march into the body, and replace the exact dword/byte comparisons for
"User-Agent: " with a case-folded comparison (e.g., normalize bytes to lowercase
by masking ASCII alphabetic characters or compare each letter allowing both
cases) when matching the field name and the following ": " delimiter; ensure the
copy logic and bounds checks (r8/r9/%2/%4) remain intact and set out_buf[0]=0
when not found.

In `@macros/logutils.asm`:
- Around line 370-392: The request-printing logic (labels %%pt5, %%req_scan,
%%req_print) currently emits the request buffer and quoted markers
(clfe_qm/clfe_qm_len and PRINT r10,r9) verbatim, allowing quotes and control
bytes (and bare LF 0x0A) to break logs; modify the scan and print path to (1)
treat 0x0A as a terminator in %%req_scan, (2) on print escape any double-quote
characters and non-printable/control bytes (e.g. emit \" for " and \xHH for
control bytes) instead of raw bytes, and (3) reuse the same sanitization for
other quoted fields (Referer and User-Agent) referenced in the later block
(lines 423-462) so all PRINT usages output safe, escaped strings rather than raw
input.
- Around line 407-422: The size column currently prints the prepared header
value content_length_b at label %%pt7/%%no_len, which reports the Content-Length
header rather than actual bytes sent; change both branches to compute and PRINT
the runtime body-bytes counter (the variable that tracks bytes actually written,
e.g., bytes_sent or response_body_bytes) instead of STRLEN
content_length_b/PRINT content_length_b. Locate the output code at labels %%pt7
and %%no_len and replace the STRLEN/PRINT of content_length_b with logic that
converts the actual bytes-written counter into rcx (string length) and then
PRINT that buffer, ensuring both successful and no-length branches use the real
bytes-sent value.
- Around line 324-339: The code in label %%pt3 prints the decoded username
(username / PRINT r10, rcx) based solely on strlen check, which can leak
caller-supplied names for failed auth; instead, change the branch to check the
actual authentication result flag set by program.asm (e.g., auth_ok or similar)
and only PRINT r10, rcx when that auth flag indicates success; otherwise jump to
%%no_auth and PRINT clfe_missing (the "-" token); ensure log_space/
log_space_len printing remains unchanged and that the new branch replaces the
current cmp rcx,0 / je %%no_auth logic so username is emitted only on successful
auth.
- Around line 144-149: The reserved size for status_buf is too small for the
ITOA macro which uses buffer + 19 as scratch space and will write past the end;
increase the reservation for status_buf (the resb value) to at least 20 bytes
(or to the exact required length used by ITOA) so the ITOA macro’s temporary
writes and subsequent copy-back into status_buf cannot overflow it.

In `@README.md`:
- Line 25: The README currently labels the access log as "Common Log Format" but
the new log line includes Referer and User-Agent, so update the documentation
text to call it "combined-format logging" (or "Apache combined log format")
instead of "Common Log Format" and mention that the access log includes the
extra Referer and User-Agent fields so consumers should not expect the 7-field
CLF; update the single line describing the logging to reflect "combined" and
note the additional fields.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 47f29f84-bab7-4511-8cab-a129f4210166

📥 Commits

Reviewing files that changed from the base of the PR and between 9000b85 and aceb801.

📒 Files selected for processing (4)
  • README.md
  • macros/httputils.asm
  • macros/logutils.asm
  • program.asm

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@github-actions
Copy link

START + GET /: 148 syscalls
Fastest: 0.000017s (clock_gettime)
Slowest: 0.828395s (accept)
threads: 4 (auto), conns: 50, duration: 20s

[wrk]
Running 20s test @ http://127.0.0.1:8080/
  4 threads and 50 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.50ms   13.34ms 420.90ms   99.42%
    Req/Sec     1.56k   318.22     2.39k    68.00%
  124749 requests in 20.07s, 190.95MB read
  Socket errors: connect 0, read 3149, write 0, timeout 0
Requests/sec:   6216.49
Transfer/sec:      9.52MB

@github-actions
Copy link

START + GET /: 148 syscalls
Fastest: 0.000018s (arch_prctl)
Slowest: 0.940763s (accept)
threads: 4 (auto), conns: 50, duration: 20s

[wrk]
Running 20s test @ http://127.0.0.1:8080/
  4 threads and 50 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.26ms   11.48ms 420.04ms   99.39%
    Req/Sec     1.63k   338.82     2.45k    67.38%
  130113 requests in 20.05s, 199.16MB read
  Socket errors: connect 0, read 382, write 0, timeout 0
Requests/sec:   6489.30
Transfer/sec:      9.93MB

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
macros/logutils.asm (1)

222-223: Consider removing deprecated macro in a follow-up PR.

The "Not used anymore!!" comment indicates this macro is obsolete. If it's truly unused, removing dead code improves maintainability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/logutils.asm` around lines 222 - 223, Remove the deprecated macro
located next to the "Not used anymore!!" comment in macros/logutils.asm (the
macro definition immediately adjacent to that comment); delete its entire
definition and any internal references, search the repo for any remaining usages
and update or remove those call sites, then run the assembler/build and tests to
ensure nothing depends on it before committing as a follow-up PR.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@macros/logutils.asm`:
- Around line 452-464: The CLF output currently emits an extra trailing space by
printing log_space before the final LF in both the %%pt9 and %%no_ua branches;
update those branches (the PRINT calls for log_space, log_space_len) to stop
emitting the trailing separator—either remove the PRINT log_space call or reduce
its length so no space is output—and leave the final LF at label %%done
untouched; make this change for the PRINT invocations alongside PRINT
user_agent/PRINT clfe_qm so both branches produce no trailing space.

---

Nitpick comments:
In `@macros/logutils.asm`:
- Around line 222-223: Remove the deprecated macro located next to the "Not used
anymore!!" comment in macros/logutils.asm (the macro definition immediately
adjacent to that comment); delete its entire definition and any internal
references, search the repo for any remaining usages and update or remove those
call sites, then run the assembler/build and tests to ensure nothing depends on
it before committing as a follow-up PR.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e702e14c-d6e7-404b-8e9a-04024bb1cdfb

📥 Commits

Reviewing files that changed from the base of the PR and between aceb801 and d311c15.

📒 Files selected for processing (2)
  • README.md
  • macros/logutils.asm
✅ Files skipped from review due to trivial changes (1)
  • README.md

@github-actions
Copy link

START + GET /: 148 syscalls
Fastest: 0.000020s (brk)
Slowest: 0.801872s (accept)
threads: 4 (auto), conns: 50, duration: 20s

[wrk]
Running 20s test @ http://127.0.0.1:8080/
  4 threads and 50 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.95ms    6.62ms 215.64ms   99.71%
    Req/Sec     1.58k   365.00     2.59k    69.88%
  126373 requests in 20.05s, 193.43MB read
  Socket errors: connect 0, read 217, write 0, timeout 0
Requests/sec:   6302.62
Transfer/sec:      9.65MB

@github-actions
Copy link

START + GET /: 147 syscalls
Fastest: 0.000017s (clock_gettime)
Slowest: 0.742293s (accept)
threads: 4 (auto), conns: 50, duration: 20s

[wrk]
Running 20s test @ http://127.0.0.1:8080/
  4 threads and 50 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.11ms    7.05ms 213.42ms   99.66%
    Req/Sec     1.55k   380.15     2.52k    72.50%
  123426 requests in 20.04s, 188.92MB read
  Socket errors: connect 0, read 484, write 0, timeout 0
Requests/sec:   6158.02
Transfer/sec:      9.43MB

@douxxtech
Copy link
Owner Author

I should fix the workflow to run only once per pr ngl

@douxxtech douxxtech merged commit d0e6e7e into main Mar 24, 2026
2 of 3 checks passed
@douxxtech douxxtech deleted the common-log-format branch March 24, 2026 23:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant