filter out invalid data #5

@VaticanEmbassy

The input files contain a lot of garbage.
We must at least clean up:

  • HTML files from missing pages
  • lines containing extra information (usually separated by " | ")
  • lines containing multiple ":" separators (not sure about this one: dropping them may lose some useful data)

Note that this cleanup could also be done in golastipass.
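
As a starting point for discussion, the three cleanup rules above could be sketched roughly as follows. This is only a sketch, not the project's implementation: the function names (`looks_like_html`, `filter_lines`, `clean_file_content`) are hypothetical, the HTML check is a simple heuristic on the start of the file, and lines containing " | " are assumed to be dropped whole rather than trimmed. Because the multiple-":" rule is uncertain, it is left off by default.

```python
from typing import Iterable, Iterator, List

# Heuristic markers (an assumption): an error page saved in place of real
# data is expected to start with one of these, ignoring case and whitespace.
HTML_MARKERS = ("<!doctype", "<html")


def looks_like_html(text: str) -> bool:
    """Return True if the content appears to be an HTML page."""
    head = text.lstrip()[:256].lower()
    return head.startswith(HTML_MARKERS)


def filter_lines(lines: Iterable[str], drop_multi_colon: bool = False) -> Iterator[str]:
    """Yield only the lines that pass the cleanup rules."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Rule 2: extra information is usually separated by " | ".
        # Assumption: the whole line is discarded, not just the extra part.
        if " | " in line:
            continue
        # Rule 3 (optional): more than one ":" separator.
        # Disabled by default, since it may discard useful data.
        if drop_multi_colon and line.count(":") > 1:
            continue
        yield line


def clean_file_content(text: str, drop_multi_colon: bool = False) -> List[str]:
    """Clean the content of one input file and return the surviving lines."""
    # Rule 1: skip HTML files saved from missing pages entirely.
    if looks_like_html(text):
        return []
    return list(filter_lines(text.splitlines(), drop_multi_colon))
```

Keeping the multiple-":" check behind a flag makes it easy to run both variants on a sample of the input and compare what would be lost before deciding.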

Labels: enhancement (New feature or request)