Original poster: ReneeBK

Machine Learning Projects for .NET Developers Apress 2015

11
Lisrelchen posted on 2015-8-2 23:44:18

Listing 2-9. Using regular expressions to tokenize a line of text

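    open System.Text.RegularExpressions

    // matchWords (a Regex), training, vocabulary and evaluate are not defined
    // in this post; they are assumed to come from earlier listings in the chapter.

    // Tokenize a line into its set of distinct words, preserving case.
    let casedTokenizer (text:string) =
        text
        |> matchWords.Matches
        |> Seq.cast<Match>
        |> Seq.map (fun m -> m.Value)
        |> Set.ofSeq

    // Build the vocabulary from the training texts (second element of each pair).
    let casedTokens =
        training
        |> Seq.map snd
        |> vocabulary casedTokenizer

    evaluate casedTokenizer casedTokens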

12
Lisrelchen posted on 2015-8-2 23:45:42

Listing 2-10. Extracting least frequently used tokens

    // Tokenizer, countIn, ham and spam are not defined in this post; they are
    // assumed to come from earlier listings in the chapter.

    // Return the n tokens that appear least often across the documents.
    let rareTokens n (tokenizer:Tokenizer) (docs:string []) =
        let tokenized = docs |> Array.map tokenizer
        let tokens = tokenized |> Set.unionMany
        tokens
        |> Seq.sortBy (fun t -> countIn tokenized t)
        |> Seq.take n   // throws if there are fewer than n distinct tokens
        |> Set.ofSeq

    // Keep the resulting sets, then print each token.
    let rareHam = ham |> rareTokens 50 casedTokenizer
    rareHam |> Seq.iter (printfn "%s")

    let rareSpam = spam |> rareTokens 50 casedTokenizer
    rareSpam |> Seq.iter (printfn "%s")

13
tbs20 posted on 2015-8-3 06:14:28
look and look

14
qingxunz posted on 2015-8-3 06:46:45
thanks

15
ydb8848 posted on 2015-8-3 08:18:30

16
lhf8059 posted on 2015-8-3 08:32:23
Taking a look!

17
sqy posted on 2015-8-3 08:47:13
ding!!!!!!!!

18
东风夏日 posted on 2015-8-3 10:02:55
Thanks to the original poster for sharing!

19
hyq2003 posted on 2015-8-3 10:21:00

20
nabula_456 posted on 2015-8-3 11:02:47
Thanks for sharing, much appreciated!
