docs: list prominent architecture decisions #882
florentc
left a comment
Thanks for the docs. You got a good grasp of the whole project pretty fast.
This should give an overview to new contributors on how certain aspects of the dataflow are implemented.
Co-authored-by: Valentin Gagarin <valentin.gagarin@tweag.io>
We split the several concerns this PR was trying to address into smaller separate PRs, which have already been merged. What's left are the architecture notes that aren't summarised anywhere else. There's still some cleanup to do, such as moving the different aspects of the contributing guide into their own files, but that can be done in another (series of) pull request(s).
I must admit I find the file structure of the various doc files confusing.
We have:
- The project README: this one is ok
- A README in docs: this one is about project architecture, which I find weird for the README in docs; I'd expect that information to be in its own design document
- The individual design documents: that's a good idea but they are not easily discoverable IMO
- The CONTRIBUTING file, which is where new contributors will go first by convention, and which has technical info too
I find this all confusing: I don't know why something belongs in one place and not another, nor how it's organized.
I think we should have the following to bring order and consistency:
- README stays as it is now
- CONTRIBUTING contains only the contribution culture and conventions
- Everything else from CONTRIBUTING moves to its own individual design document
- CONTRIBUTING has a single link to docs/README.md
- Everything from docs/README about architecture goes into individual design documents. For example, I don't see why the things about pgpubsub would be on that kind of front page rather than in their own design document.
- docs/README.md becomes only a table of contents for the individual design documents, nothing else
- Each entry in the table of contents has one sentence describing it. For now, the only table of contents we have is the directory structure itself, and the filenames are not enough for new people to understand what each document is about (e.g. "linkage")
- We drop the numbers in front of each design document as there is no real order in which we can read them and we can still add links from one design doc to another if needed
To me it'd be ok to do it in this PR. The documentation is not the target of a lot of concurrent modifications, and this would mostly be cutting and pasting the current content, not writing new content.
Besides:
- I think the typo fixes in code comments and imports should be dropped from this PR
My idea for what remained of this PR was to just put the additional content somewhere, because it was indeed missing and helpful, and then do what you described in a separate effort. The reshuffling of sections and files is surely good but not urgent now; I'd rather finish this and focus on current priorities, then come back to docs when we have a breather (or renewed pain).
Description
Adds docs/CODEBASE.md, a developer guide that gives new contributors a general understanding of how the system flows. It documents the two data pipelines (Nixpkgs package fetch and CVE ingestion), how they converge at the automatic matching step, the event-driven listener architecture, key database models, management commands, architectural patterns, and a glossary.
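
To make the description above more concrete for reviewers, here is a minimal, self-contained sketch of two ingestion pipelines converging at a matching step through an event listener. It is not taken from the codebase: every name in it (`Bus`, `Store`, `ingest_package`, `ingest_cve`, `match_cve`, the `cve_inserted` channel) is hypothetical, the in-process `Bus` is only a stand-in for whatever notification mechanism the project uses (the thread mentions pgpubsub), and matching on an exact product name is an assumption made purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Package:
    """Hypothetical record produced by the Nixpkgs package fetch pipeline."""
    name: str
    version: str


@dataclass(frozen=True)
class Cve:
    """Hypothetical record produced by the CVE ingestion pipeline."""
    cve_id: str
    affected_product: str


@dataclass
class Store:
    """In-memory stand-in for the database models."""
    packages: list[Package] = field(default_factory=list)
    cves: list[Cve] = field(default_factory=list)
    matches: list[tuple[str, str]] = field(default_factory=list)


class Bus:
    """In-process stand-in for a publish/subscribe channel."""

    def __init__(self) -> None:
        self._listeners: dict[str, list[Callable]] = defaultdict(list)

    def listen(self, channel: str, handler: Callable) -> None:
        self._listeners[channel].append(handler)

    def notify(self, channel: str, payload) -> None:
        for handler in self._listeners[channel]:
            handler(payload)


def ingest_package(store: Store, package: Package) -> None:
    # Pipeline 1: store fetched packages; no event is needed in this sketch.
    store.packages.append(package)


def ingest_cve(store: Store, bus: Bus, cve: Cve) -> None:
    # Pipeline 2: store the CVE, then announce it so listeners can react.
    store.cves.append(cve)
    bus.notify("cve_inserted", cve)


def match_cve(store: Store, cve: Cve) -> None:
    # Convergence point: compare the new CVE against known packages.
    # Exact name equality is an assumption, not the project's matching rule.
    for package in store.packages:
        if package.name == cve.affected_product:
            store.matches.append((cve.cve_id, package.name))


def demo() -> list[tuple[str, str]]:
    store, bus = Store(), Bus()
    bus.listen("cve_inserted", lambda cve: match_cve(store, cve))
    ingest_package(store, Package("openssl", "3.0.0"))
    ingest_package(store, Package("curl", "8.0.0"))
    # "CVE-0000-0001" is a made-up identifier used only for this example.
    ingest_cve(store, bus, Cve("CVE-0000-0001", "openssl"))
    return store.matches
```

Calling `demo()` registers the listener, runs both pipelines, and returns the list of recorded matches. The point of the design is that the ingestion code never calls the matcher directly; it only emits an event, so further listeners can be added without touching either pipeline.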