Skip to content

ES mapping that fits our use case#3

Open
Adiguptakn wants to merge 5 commits intomasterfrom
elasticseach
Open

ES mapping that fits our use case#3
Adiguptakn wants to merge 5 commits intomasterfrom
elasticseach

Conversation

@Adiguptakn
Copy link
Copy Markdown

I have created a mappings file which fits our use case after we push the data into ES.

I have created a mappings file which fits our use case after we push the data into ES.
Copy link
Copy Markdown
Collaborator

@MMAThijssen MMAThijssen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mapping is looking nice. Good use of the type keyword for country, city, email and link. The number of shards and replicas is to be determined later on in the project?

Comment thread forward43/Mappings.txt Outdated
Copy link
Copy Markdown
Contributor

@andrewsutjahjo andrewsutjahjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a couple of things that might interfere here:
json in and of itself doesn't allow for comments. I'm not too sure how it would interact with the es-python package when adding a body though.

Also this doesn't have a clean way to interface with the rest of the codebase as it's written (I believe) to interact with Kibana.

The cleanest way to get a mapping would be to create a function
def get_mapping() -> dict:
which reads the mappings.txt file and turns it into a dict. That dict can then be picked up by whatever function that ultimately creates the index.

Same goes for the settings file

Comment thread forward43/Mappings.txt Outdated
Comment on lines +29 to +30
"contactPerson.name": {"type": "text"},
"cotactPerson.email": {"type": "keyword"},
Copy link
Copy Markdown
Contributor

@andrewsutjahjo andrewsutjahjo Jan 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also go for camelcase: contact_person as the rest is in camel case as well.

I'm also not too sure how ES would resolve dots in the naming of keys, but the usual way is:

"contact_person" : {
    "properties" : {
        "name": {"type": "text"},
        "email": {"type": "keyword"},
        "number": {"type": "long"}
    }
}

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elasticsearch translates fields using a dot notation into using the “properties” parameter behind the scenes. So,

      "contact_person.name": {"type": "text"}, 
      "cotact_person.email": {"type": "keyword"},
      "contact_person.number": {"type": "long"} 

would translate to

"contact_person" : {
    "properties" : {
        "name": {"type": "text"},
        "email": {"type": "keyword"},
        "number": {"type": "long"}
    }
}

Comment thread forward43/Mappings.txt Outdated
Comment on lines +9 to +14
{
"settings": {
"number_of_shards" : x, # x is the number of shards
"number_of_replicas" : y # y is the number of replicas
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some optimizations ES has for keywords called normalizers, which you can add into your settings

https://www.elastic.co/guide/en/elasticsearch/reference/current/normalizer.html

@Adiguptakn
Copy link
Copy Markdown
Author

There's a couple of things that might interfere here:
json in and of itself doesn't allow for comments. I'm not too sure how it would interact with the es-python package when adding a body though.

Also this doesn't have a clean way to interface with the rest of the codebase as it's written (I believe) to interact with Kibana.

The cleanest way to get a mapping would be to create a function
def get_mapping() -> dict:
which reads the mappings.txt file and turns it into a dict. That dict can then be picked up by whatever function that ultimately creates the index.

Same goes for the settings file

Thanks for the review.
This mappings was specifically written to interact with Kibana. I will make the necessary changes and create a python function to read the mapping file and turns it into a dict.

Copy link
Copy Markdown
Collaborator

@MMAThijssen MMAThijssen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to add normalisers to the settings. The example Andrew mentioned shows a good example that makes sure that search queries are lowercase and without accent marks.

The format used in settings.txt and mappings.txt does not follow the format assumed by ES:

      "contact_person.name": {"type": "text"}, 
      "cotact_person.email": {"type": "keyword"},
      "contact_person.number": {"type": "long"} 

"type" is missing here.



def get_settings(settingsFile):
'''(file) -> dictionary
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear to me what file is. Is it settingsFile? The type of that isn't a dictionary but a string (of the path name / file name).

def get_settings(settingsFile):
'''(file) -> dictionary

Propertes for creating the index n ES
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two typos: Properties for creating the index in ES.

Return the contents of the text file with a dictionary.
'''

with open(''+settingsFile) as f:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not familiar with the method you open the file. What is the purpose of adding the '' in front of settingsFile? It seems redundant.



def get_mappings(mappingsFile):
'''(file) -> dictionary
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above. If mappingsFile is meant by file, the type is string. In the code the content of mappingsFile is transformed to a dict.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants