namebot.normalization module
Helpers to normalize inputs and text.
-
namebot.normalization.chop_duplicate_ends(word)
Remove duplicate letters on either end, if they are adjacent.
- Args:
- word (str): The word to trim.
- Returns:
- str: The word with duplicate end letters removed.
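A minimal sketch of the trimming described above, assuming the function takes a single string, as the `(word)` signature suggests. The name `chop_duplicate_ends_sketch` is hypothetical, and the library's own implementation may differ.

```python
def chop_duplicate_ends_sketch(word):
    """Drop one letter from an end of the word when that end is a doubled letter."""
    # Leading pair, e.g. "aa..." becomes "a..."
    if len(word) > 1 and word[0] == word[1]:
        word = word[1:]
    # Trailing pair, e.g. "...kk" becomes "...k"
    if len(word) > 1 and word[-1] == word[-2]:
        word = word[:-1]
    return word


print(chop_duplicate_ends_sketch("aardvarkk"))  # ardvark
```

Only one letter is removed per end, so longer runs such as "aaa" would still leave a doubled letter behind in this sketch.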
-
namebot.normalization.clean_sort(words)
A function for cleaning and prepping words for techniques.
- Args:
- words (list): The list of words
- Returns:
- list: An updated word list with words cleaned and sorted.
-
namebot.normalization.filter_words(words)
Filter words by default min/max settings in the settings module.
- Args:
- words (list): The list of words
- Returns:
- list: The filtered words
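As a rough illustration, the sketch below assumes the min/max settings refer to word length. The constants `MIN_LENGTH` and `MAX_LENGTH` and the function name `filter_words_sketch` are hypothetical stand-ins, not names taken from namebot's settings module.

```python
# Hypothetical bounds; the real defaults live in namebot's settings module.
MIN_LENGTH = 4
MAX_LENGTH = 10


def filter_words_sketch(words, min_length=MIN_LENGTH, max_length=MAX_LENGTH):
    """Keep only words whose length falls within [min_length, max_length]."""
    return [w for w in words if min_length <= len(w) <= max_length]


print(filter_words_sketch(["io", "cloud", "supercalifragilistic"]))  # ['cloud']
```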
-
namebot.normalization.flatten(lst)
Flatten a list with arbitrary levels of nesting.
- CREDIT: http://stackoverflow.com/questions/10823877/what-is-the-fastest-way-to-flatten-arbitrarily-nested-lists-in-python
- Changes made include:
- Adding error handling,
- Renaming variables,
- Using any instead of or.
See http://creativecommons.org/licenses/by-sa/3.0/ for specific details.
- Args:
- lst (list): The nested list.
- Returns:
- generator: A generator that yields the flattened words.
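The idea can be sketched with a short recursive generator. This is a simplified variant written for illustration, not the credited Stack Overflow approach or the library's code, and `flatten_sketch` is a hypothetical name.

```python
def flatten_sketch(lst):
    """Yield the leaf items of an arbitrarily nested list, in order."""
    for item in lst:
        if isinstance(item, (list, tuple)):
            # Recurse into nested containers and re-yield their items.
            yield from flatten_sketch(item)
        else:
            yield item


nested = ["a", ["b", ["c", "d"]], [[["e"]]]]
print(list(flatten_sketch(nested)))  # ['a', 'b', 'c', 'd', 'e']
```

Because the result is a generator, wrap the call in `list(...)` when an actual list is needed.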
-
namebot.normalization.key_words_by_pos_tag(words)
Key words by their part-of-speech (POS) tag name, as produced by running pos_tag on a list of words.
- Args:
- words (list): The list of words, where each item is a 2-tuple.
- Returns:
- dict: A dictionary keyed by POS tag,
- with each value a list of the words matching that tag.
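The grouping can be sketched as follows, assuming each 2-tuple is ordered `(word, tag)`, which is the shape NLTK's `pos_tag` returns. The name `key_words_by_pos_tag_sketch` and the sample tags are illustrative only.

```python
from collections import defaultdict


def key_words_by_pos_tag_sketch(tagged_words):
    """Group (word, tag) pairs into a dict mapping each tag to its words."""
    grouped = defaultdict(list)
    for word, tag in tagged_words:
        grouped[tag].append(word)
    # Return a plain dict so missing keys raise KeyError as usual.
    return dict(grouped)


tagged = [("quick", "JJ"), ("fox", "NN"), ("lazy", "JJ"), ("dog", "NN")]
print(key_words_by_pos_tag_sketch(tagged))
# {'JJ': ['quick', 'lazy'], 'NN': ['fox', 'dog']}
```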
-
namebot.normalization.remove_bad_words(words)
Remove naughty words that might come from wordnet synsets and lemmata.
- Args:
- words (list): The list of words
- Returns:
- list: An updated word list with bad words removed.
-
namebot.normalization.remove_odd_sounding_words(words)
Remove random odd sounding word combinations via regular expressions.
- Args:
- words (list): The list of words
- Returns:
- list: An updated word list with words cleaned.
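The patterns namebot actually uses are not listed here, so the sketch below uses two made-up rules purely to show the regex-filtering idea: a character repeated three times in a row, and a run of five or more consonants. `ODD_PATTERNS` and `remove_odd_sounding_words_sketch` are hypothetical names.

```python
import re

# Hypothetical patterns for illustration; not the library's actual rules.
ODD_PATTERNS = [
    re.compile(r"(.)\1\1"),                       # same character three times in a row
    re.compile(r"[bcdfghjklmnpqrstvwxz]{5,}", re.IGNORECASE),  # five or more consonants in a row
]


def remove_odd_sounding_words_sketch(words):
    """Drop any word that matches at least one of the odd-sounding patterns."""
    return [w for w in words if not any(p.search(w) for p in ODD_PATTERNS)]


print(remove_odd_sounding_words_sketch(["zzzap", "strngths", "nimbus", "cloud"]))
# ['nimbus', 'cloud']
```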
-
namebot.normalization.remove_stop_words(words)
Remove all stop words.
- Args:
- words (list): The list of words
- Returns:
- list: An updated word list with stopwords removed.
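A minimal sketch of stop-word removal, assuming a case-insensitive membership test. The tiny `STOP_WORDS` set and the name `remove_stop_words_sketch` are hypothetical; the library presumably draws on a fuller stop-word list.

```python
# Hypothetical, deliberately small stop-word set for illustration only.
STOP_WORDS = {"a", "an", "the", "and", "of", "to"}


def remove_stop_words_sketch(words, stop_words=STOP_WORDS):
    """Return the words that are not in the stop-word set, preserving order."""
    return [w for w in words if w.lower() not in stop_words]


print(remove_stop_words_sketch(["The", "cloud", "of", "names"]))  # ['cloud', 'names']
```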