- Spelling correction with Enchant
- How to do it...
- We will create a new class called SpellingReplacer in replacers.py. This time,
- the replace() method checks Enchant to see whether the word is valid. If it is
- not, we look up the suggested alternatives and return the first suggestion,
- provided its nltk.metrics.edit_distance() from the original word is at most max_dist:
- import enchant
- from nltk.metrics import edit_distance
- class SpellingReplacer(object):
-     def __init__(self, dict_name='en', max_dist=2):
-         self.spell_dict = enchant.Dict(dict_name)
-         self.max_dist = max_dist
-     def replace(self, word):
-         # A word the dictionary already accepts is returned unchanged.
-         if self.spell_dict.check(word):
-             return word
-         suggestions = self.spell_dict.suggest(word)
-         # Accept the first suggestion only if it is within max_dist edits.
-         if suggestions and edit_distance(word, suggestions[0]) <= self.max_dist:
-             return suggestions[0]
-         else:
-             return word
- The preceding class can be used to correct English spellings, as follows:
- >>> from replacers import SpellingReplacer
- >>> replacer = SpellingReplacer()
- >>> replacer.replace('cookbok')
- 'cookbook'
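The class above trusts Enchant's ordering and only ever looks at suggestions[0]. As a rough sketch of a variant that instead picks the suggestion closest to the input, the following uses a small pure-Python Levenshtein function as a stand-in for nltk.metrics.edit_distance, and a hard-coded candidate list as a stand-in for spell_dict.suggest(). The names levenshtein, closest_suggestion, and candidates are hypothetical and are not part of the recipe or of either library.

```python
def levenshtein(a, b):
    """Edit distance where insert, delete, and substitute each cost 1."""
    prev = list(range(len(b) + 1))  # distances from '' to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ''
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def closest_suggestion(word, suggestions, max_dist=2):
    """Return the nearest suggestion, or word itself if none is within max_dist."""
    if not suggestions:
        return word
    best = min(suggestions, key=lambda s: levenshtein(word, s))
    return best if levenshtein(word, best) <= max_dist else word


# Hypothetical candidates; a real list would come from spell_dict.suggest(word).
candidates = ['cook book', 'cookbook', 'cooked']
print(closest_suggestion('cookbok', candidates))  # prints: cookbook
```

Whether this beats taking the first suggestion depends on the dictionary backend: Enchant's ordering may already reflect more than raw edit distance, so this is one possible design choice rather than a strict improvement.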