namebot.normalization module

Helpers to normalize inputs and text.

namebot.normalization.chop_duplicate_ends(word)

Remove duplicate letters on either end, if they are adjacent.

Args:
word (str): The word to trim.
Returns:
str: The word with a duplicated letter removed from either end.
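A minimal sketch of the behavior described above, assuming (per the signature) that the function takes a single word string and drops one letter from an end when that letter repeats its neighbor. The name chop_duplicate_ends_sketch is hypothetical and this is not namebot's own implementation.

```python
def chop_duplicate_ends_sketch(word):
    """Drop one letter from either end when it repeats its neighbor."""
    # Words shorter than two letters have no adjacent pair to compare.
    if len(word) < 2:
        return word
    # Leading pair: "aardvark" becomes "ardvark".
    if word[0] == word[1]:
        word = word[1:]
    # Trailing pair: "buzz" becomes "buz". The length is re-checked
    # because the first chop may have shortened the word.
    if len(word) >= 2 and word[-1] == word[-2]:
        word = word[:-1]
    return word


print(chop_duplicate_ends_sketch("aardvark"))  # ardvark
print(chop_duplicate_ends_sketch("buzz"))      # buz
```

Words without a repeated letter at either end pass through unchanged.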
namebot.normalization.clean_sort(words)

A function for cleaning and prepping words for techniques.

Args:
words (list): The list of words
Returns:
list: An updated word list with words cleaned and sorted.
namebot.normalization.filter_words(words)

Filter words by default min/max settings in the settings module.

Args:
words (list): The list of words
Returns:
list: The filtered words
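As a rough illustration, the sketch below assumes the min/max settings refer to word length. The constants MIN_LENGTH and MAX_LENGTH and the function name filter_words_sketch are hypothetical; the real defaults live in namebot's settings module and are not shown in this page.

```python
# Hypothetical defaults, chosen only for this example.
MIN_LENGTH = 3
MAX_LENGTH = 10


def filter_words_sketch(words, min_length=MIN_LENGTH, max_length=MAX_LENGTH):
    """Keep only words whose length falls within the inclusive bounds."""
    return [word for word in words if min_length <= len(word) <= max_length]


print(filter_words_sketch(["ab", "name", "extraordinarily"]))  # ['name']
```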
namebot.normalization.flatten(lst)

Flatten a list with arbitrary levels of nesting.

CREDIT: http://stackoverflow.com/questions/10823877/what-is-the-fastest-way-to-flatten-arbitrarily-nested-lists-in-python
Changes made include:
  1. Adding error handling,
  2. Renaming variables,
  3. Using any instead of or.

See http://creativecommons.org/licenses/by-sa/3.0/ for specific details.

Args:
lst (list): The nested list.
Returns:
generator: A generator that yields the flattened items.
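The idea can be sketched with a short recursive generator. This is an illustrative alternative, not the credited Stack Overflow code or namebot's implementation; the name flatten_sketch is hypothetical, and treating tuples as nested containers is an assumption of this example.

```python
def flatten_sketch(lst):
    """Yield the leaf items of an arbitrarily nested list, left to right."""
    for item in lst:
        if isinstance(item, (list, tuple)):
            # Recurse into nested sequences. Strings are not matched by
            # this check, so each word is yielded whole.
            yield from flatten_sketch(item)
        else:
            yield item


nested = ["a", ["b", ["c", []], "d"]]
print(list(flatten_sketch(nested)))  # ['a', 'b', 'c', 'd']
```

Because the result is a generator, wrap it in list() when a concrete list is needed. Very deeply nested input could hit Python's recursion limit with this recursive form.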
namebot.normalization.key_words_by_pos_tag(words)

Group words by their part-of-speech (POS) tag, as returned when running pos_tag on a list of words.

Args:
words (list): The list of words, where each item is a 2-tuple.
Returns:
dict: A dictionary keyed by POS tag,
where each value is a list of the words matching that tag.
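A minimal sketch of this grouping, assuming each input item is a (word, tag) 2-tuple such as those returned by NLTK's pos_tag. The name key_words_by_pos_tag_sketch is hypothetical, and whether namebot deduplicates or reorders the grouped words is not stated here.

```python
from collections import defaultdict


def key_words_by_pos_tag_sketch(tagged_words):
    """Group (word, tag) pairs into a dict of tag -> list of words."""
    keyed = defaultdict(list)
    for word, tag in tagged_words:
        # Words sharing a tag accumulate in input order.
        keyed[tag].append(word)
    # Return a plain dict so missing tags raise KeyError as usual.
    return dict(keyed)


tagged = [("quick", "JJ"), ("fox", "NN"), ("dog", "NN")]
print(key_words_by_pos_tag_sketch(tagged))
# {'JJ': ['quick'], 'NN': ['fox', 'dog']}
```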
namebot.normalization.remove_bad_words(words)

Remove naughty words that might come from wordnet synsets and lemmata.

Args:
words (list): The list of words
Returns:
list: An updated word list with bad words removed.
namebot.normalization.remove_odd_sounding_words(words)

Remove random odd sounding word combinations via regular expressions.

Args:
words (list): The list of words
Returns:
list: An updated word list with words cleaned.
namebot.normalization.remove_stop_words(words)

Remove all stop words.

Args:
words (list): The list of words
Returns:
list: An updated word list with stopwords removed.
namebot.normalization.stem_words(words)

Stem words to their base linguistic stem to remove redundancy.

Args:
words (list): The list of words
Returns:
list: An updated word list with words stemmed.
namebot.normalization.uniquify(words)

Remove duplicates from a list.

Args:
words (list): The list of words
Returns:
list: An updated word list with duplicates removed.
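One common way to remove duplicates is sketched below. It keeps the first occurrence of each word and preserves input order, which is an assumption of this example; the documentation above does not say whether namebot preserves order. The name uniquify_sketch is hypothetical.

```python
def uniquify_sketch(words):
    """Return the words with duplicates removed, keeping first occurrences."""
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(words))


print(uniquify_sketch(["a", "b", "a", "c", "b"]))  # ['a', 'b', 'c']
```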