Original poster: ReneeBK

Scala/Java Machine Learning Library: Nak

Original post
ReneeBK posted on 2016-4-21 09:52:34

Nak

Nak is a Scala/Java library for machine learning and related tasks, with a focus on providing an easy-to-use API for some standard algorithms. It is formed from Breeze, Liblinear Java, and Scalabha. It is currently undergoing a major evolution, so be prepared for substantial changes to the API in this and probably several future versions.

We'd love to have some more contributors: if you are interested in helping out, please see the #helpwanted issues or suggest your own ideas.


https://github.com/scalanlp/nak/tree/master/src/main


2
ReneeBK posted on 2016-4-21 09:53:14
Example

Here's an example of how easy it is to train and evaluate a text classifier using Nak. See TwentyNewsGroups.scala for more details.

import java.io.File
// The Nak helpers used below (fromLabeledDirs, trainClassifier, maxLabel,
// LiblinearConfig, BowFeaturizer, ConfusionMatrix) need their own imports;
// see TwentyNewsGroups.scala for the full list.

def main(args: Array[String]): Unit = {
  // Root directory of the 20 Newsgroups data, passed as the first argument.
  val newsgroupsDir = new File(args(0))
  implicit val isoCodec = scala.io.Codec("ISO-8859-1")
  val stopwords = Set("the","a","an","of","in","for","by","on")

  // Train a Liblinear classifier on bag-of-words features.
  val trainDir = new File(newsgroupsDir, "20news-bydate-train")
  val trainingExamples = fromLabeledDirs(trainDir).toList
  val config = LiblinearConfig(cost=5.0)
  val featurizer = new BowFeaturizer(stopwords)
  val classifier = trainClassifier(config, featurizer, trainingExamples)

  // Score the held-out documents and compare predictions with gold labels.
  val evalDir = new File(newsgroupsDir, "20news-bydate-test")
  val maxLabelNews = maxLabel(classifier.labels) _
  val comparisons = for (ex <- fromLabeledDirs(evalDir).toList) yield
    (ex.label, maxLabelNews(classifier.evalRaw(ex.features)), ex.features)
  val (goldLabels, predictions, inputs) = comparisons.unzip3
  println(ConfusionMatrix(goldLabels, predictions, inputs))
}
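The example above ends by comparing gold labels with predictions. As a rough illustration of what that comparison involves, here is a minimal Java sketch that counts (gold, predicted) pairs and computes accuracy. The class `ConfusionSketch` and its methods are hypothetical names made up for this illustration; this is not Nak's `ConfusionMatrix` and makes no claim about how that class is implemented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration only; not part of Nak. */
final class ConfusionSketch {

    /** Returns counts such that counts.get(gold).get(predicted) is the number of examples with that pair. */
    static Map<String, Map<String, Integer>> confusion(List<String> gold, List<String> predicted) {
        if (gold.size() != predicted.size()) {
            throw new IllegalArgumentException("gold and predicted must have the same length");
        }
        Map<String, Map<String, Integer>> counts = new TreeMap<>();
        for (int i = 0; i < gold.size(); i++) {
            counts.computeIfAbsent(gold.get(i), k -> new TreeMap<>())
                  .merge(predicted.get(i), 1, Integer::sum);
        }
        return counts;
    }

    /** Fraction of examples whose predicted label equals the gold label. */
    static double accuracy(List<String> gold, List<String> predicted) {
        if (gold.size() != predicted.size() || gold.isEmpty()) {
            throw new IllegalArgumentException("need two non-empty lists of equal length");
        }
        int correct = 0;
        for (int i = 0; i < gold.size(); i++) {
            if (gold.get(i).equals(predicted.get(i))) {
                correct++;
            }
        }
        return (double) correct / gold.size();
    }

    public static void main(String[] args) {
        List<String> gold = Arrays.asList("sci.space", "sci.space", "rec.autos", "rec.autos");
        List<String> predicted = Arrays.asList("sci.space", "rec.autos", "rec.autos", "rec.autos");
        System.out.println(confusion(gold, predicted));
        System.out.println(accuracy(gold, predicted)); // 3 of 4 correct
    }
}
```

Rows are keyed by gold label and columns by predicted label, so off-diagonal entries show which newsgroups get confused with each other.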

3
ReneeBK posted on 2016-4-21 09:54:33
package nak.liblinear;


final class ArraySorter {

    /**
     * <p>Sorts the specified array of doubles into <b>descending</b> order.</p>
     *
     * <p>Despite its name, this is a quicksort with an insertion-sort base case,
     * not a merge sort.</p>
     *
     * <em>This code is borrowed from Sun's JDK 1.6.0.07</em>
     */
    public static void reversedMergesort(double[] a) {
        reversedMergesort(a, 0, a.length);
    }

    private static void reversedMergesort(double[] x, int off, int len) {
        // Insertion sort on smallest arrays
        if (len < 7) {
            for (int i = off; i < len + off; i++)
                for (int j = i; j > off && x[j - 1] < x[j]; j--)
                    swap(x, j, j - 1);
            return;
        }

        // Choose a partition element, v
        int m = off + (len >> 1); // Small arrays, middle element
        if (len > 7) {
            int l = off;
            int n = off + len - 1;
            if (len > 40) { // Big arrays, pseudomedian of 9
                int s = len / 8;
                l = med3(x, l, l + s, l + 2 * s);
                m = med3(x, m - s, m, m + s);
                n = med3(x, n - 2 * s, n - s, n);
            }
            m = med3(x, l, m, n); // Mid-size, med of 3
        }
        double v = x[m];

        // Establish Invariant: v* (>v)* (<v)* v*
        // (larger elements go first because the order is descending)
        int a = off, b = a, c = off + len - 1, d = c;
        while (true) {
            while (b <= c && x[b] >= v) {
                if (x[b] == v) swap(x, a++, b);
                b++;
            }
            while (c >= b && x[c] <= v) {
                if (x[c] == v) swap(x, c, d--);
                c--;
            }
            if (b > c) break;
            swap(x, b++, c--);
        }

        // Swap partition elements back to middle
        int s, n = off + len;
        s = Math.min(a - off, b - a);
        vecswap(x, off, b - s, s);
        s = Math.min(d - c, n - d - 1);
        vecswap(x, b, n - s, s);

        // Recursively sort non-partition-elements
        if ((s = b - a) > 1) reversedMergesort(x, off, s);
        if ((s = d - c) > 1) reversedMergesort(x, n - s, s);
    }

    /**
     * Swaps x[a] with x[b].
     */
    private static void swap(double[] x, int a, int b) {
        double t = x[a];
        x[a] = x[b];
        x[b] = t;
    }

    /**
     * Swaps x[a .. (a+n-1)] with x[b .. (b+n-1)].
     */
    private static void vecswap(double[] x, int a, int b, int n) {
        for (int i = 0; i < n; i++, a++, b++)
            swap(x, a, b);
    }

    /**
     * Returns the index of the median of the three indexed doubles.
     */
    private static int med3(double[] x, int a, int b, int c) {
        return (x[a] < x[b] ? (x[b] < x[c] ? b : x[a] < x[c] ? c : a) : (x[b] > x[c] ? b : x[a] > x[c] ? c : a));
    }


}
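Since `ArraySorter` is package-private, it cannot be called from outside `nak.liblinear`. To see what result the routine is meant to produce, a simple reference can be sketched with the standard library: sort ascending, then reverse in place. The class `DescendingSortSketch` and its method names are hypothetical, and the sketch assumes the input contains no NaN values, since NaN ordering would differ between the two approaches.

```java
import java.util.Arrays;

/** Hypothetical reference implementation for illustration; not part of Nak. */
final class DescendingSortSketch {

    /** Returns a copy of the input sorted from largest to smallest. */
    static double[] sortedDescending(double[] values) {
        double[] copy = values.clone();
        Arrays.sort(copy); // ascending order
        // Reverse in place to obtain descending order.
        for (int i = 0, j = copy.length - 1; i < j; i++, j--) {
            double t = copy[i];
            copy[i] = copy[j];
            copy[j] = t;
        }
        return copy;
    }

    /** Checks that no element is smaller than the element that follows it. */
    static boolean isDescending(double[] values) {
        for (int i = 1; i < values.length; i++) {
            if (values[i - 1] < values[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[] sorted = sortedDescending(new double[] {0.3, 2.5, -1.0, 2.5});
        System.out.println(Arrays.toString(sorted));
        System.out.println(isDescending(sorted));
    }
}
```

A reference like this is slower than a dedicated in-place descending sort, but it is convenient for checking the output of one in a unit test.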

4
ReneeBK posted on 2016-4-21 09:55:09
package nak.liblinear;


final class DoubleArrayPointer {

    private final double[] _array;
    private int            _offset;


    public void setOffset(int offset) {
        if (offset < 0 || offset >= _array.length) throw new IllegalArgumentException("offset must be between 0 and the length of the array");
        _offset = offset;
    }

    public DoubleArrayPointer( final double[] array, final int offset ) {
        _array = array;
        setOffset(offset);
    }

    public double get(final int index) {
        return _array[_offset + index];
    }

    public void set(final int index, final double value) {
        _array[_offset + index] = value;
    }
}
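The class above gives a movable window into a shared array: index 0 of the pointer maps to `_offset` in the backing array, and writes go straight through. Because `DoubleArrayPointer` is package-private, the sketch below re-creates the same idea under a hypothetical name, `OffsetViewSketch`, purely to show the behaviour; it is a minimal illustration rather than the library class.

```java
/** Hypothetical stand-in that mirrors the offset-view idea; not part of Nak. */
final class OffsetViewSketch {

    private final double[] array;
    private int offset;

    OffsetViewSketch(double[] array, int offset) {
        this.array = array;
        setOffset(offset);
    }

    /** Moves the start of the view; the offset must be a valid index into the array. */
    void setOffset(int offset) {
        if (offset < 0 || offset >= array.length) {
            throw new IllegalArgumentException("offset must be in [0, array.length)");
        }
        this.offset = offset;
    }

    /** Reads the element at position index relative to the current offset. */
    double get(int index) {
        return array[offset + index];
    }

    /** Writes through to the shared backing array. */
    void set(int index, double value) {
        array[offset + index] = value;
    }

    public static void main(String[] args) {
        double[] weights = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
        OffsetViewSketch view = new OffsetViewSketch(weights, 3);
        System.out.println(view.get(0)); // element at backing index 3
        view.set(1, 50.0);               // writes backing index 4
        System.out.println(weights[4]);
    }
}
```

Note that only the offset is validated; `get` and `set` rely on the usual array bounds check, so an index that runs past the end of the backing array still throws `ArrayIndexOutOfBoundsException`.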

5
ReneeBK posted on 2016-4-21 09:55:44
package nak.liblinear;

/**
 * @since 1.9
 */
public interface Feature {

    int getIndex();

    double getValue();

    void setValue(double value);
}
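An interface like this represents one entry of a sparse feature vector: an index plus a value. As a rough sketch of how such entries might be used, the code below redeclares the interface locally so it compiles on its own, adds a hypothetical implementation `SimpleFeature`, and computes a sparse dot product against a dense weight array. The 1-based index convention is an assumption borrowed from LIBLINEAR's sparse input format; none of these names are taken from Nak.

```java
/** Local copy of the interface shown above, so that this sketch is self-contained. */
interface Feature {
    int getIndex();
    double getValue();
    void setValue(double value);
}

/** Hypothetical minimal implementation for illustration. */
final class SimpleFeature implements Feature {
    private final int index;
    private double value;

    SimpleFeature(int index, double value) {
        if (index < 1) {
            throw new IllegalArgumentException("index must be >= 1 (1-based, by assumption)");
        }
        this.index = index;
        this.value = value;
    }

    @Override public int getIndex() { return index; }
    @Override public double getValue() { return value; }
    @Override public void setValue(double value) { this.value = value; }
}

final class SparseDotSketch {

    /** Dot product of a sparse vector (1-based indices) with dense weights (0-based array). */
    static double dot(Feature[] x, double[] weights) {
        double sum = 0.0;
        for (Feature f : x) {
            int j = f.getIndex() - 1; // convert the 1-based feature index to an array position
            if (j < weights.length) { // features beyond the weight vector contribute nothing
                sum += weights[j] * f.getValue();
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Feature[] x = { new SimpleFeature(1, 2.0), new SimpleFeature(3, 0.5) };
        double[] weights = {10.0, 20.0, 30.0};
        System.out.println(dot(x, weights)); // 2.0 * 10.0 + 0.5 * 30.0
    }
}
```

Storing only the non-zero entries keeps the dot product proportional to the number of active features, which matters for bag-of-words data where most features are zero.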

6
ReneeBK posted on 2016-4-21 09:57:00
package nak.classify

/*
Copyright 2009 David Hall, Daniel Ramage

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/


import nak.data._
import breeze.linalg._

/**
 * Represents a classifier from observations of type T to labels of type L.
 * Implementers should only need to implement scores.
 *
 * @author dlwh
 */
trait Classifier[L, -T] extends (T => L) {  outer =>
  /** Return the most likely label (same as classify) */
  def apply(o: T) = classify(o)

  /** Return the most likely label */
  def classify(o: T) = scores(o).argmax

  /** For the observation, return the score for each label that has a nonzero
    * score.
    */
  def scores(o: T): Counter[L, Double]

  /**
   * Transforms output labels L=>M. If f is not one-to-one, the maximum of the
   * scores of the L's that map to the same M is used.
   */
  def map[M](f: L => M): Classifier[M, T] = new Classifier[M, T] {
    def scores(o: T): Counter[M, Double] = {
      val ctr = Counter[M, Double]()
      val otherCtr = outer.scores(o)
      for (x <- otherCtr.keysIterator) {
        val y = f(x)
        // Note: a label not yet in ctr reads back as the Counter's default
        // value (0.0 here), so all-negative scores would come out as 0.0.
        ctr(y) = ctr(y) max otherCtr(x)
      }
      ctr
    }
  }
}



object Classifier {

  trait Trainer[L, T] {
    type MyClassifier <: Classifier[L, T]

    def train(data: Iterable[Example[L, T]]): MyClassifier
  }

}
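The trait above boils down to two operations on a table of per-label scores: pick the label with the highest score, and optionally merge labels through a function, keeping the maximum score of each merged group. The Java sketch below illustrates both with a plain `Map<L, Double>` standing in for Breeze's `Counter`; `LabelMapSketch`, `argmax`, and `mapLabels` are hypothetical names, not part of Nak. One deliberate difference: `Map.merge` starts from the first score seen for each merged label rather than from a default of 0, so groups whose scores are all negative keep their true maximum.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration of argmax classification and label merging; not part of Nak. */
final class LabelMapSketch {

    /** Returns the label with the highest score. */
    static <L> L argmax(Map<L, Double> scores) {
        if (scores.isEmpty()) {
            throw new IllegalArgumentException("scores must not be empty");
        }
        L best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<L, Double> e : scores.entrySet()) {
            if (best == null || e.getValue() > bestScore) {
                best = e.getKey();
                bestScore = e.getValue();
            }
        }
        return best;
    }

    /** Re-keys scores through f; labels that collide keep the larger score. */
    static <L, M> Map<M, Double> mapLabels(Map<L, Double> scores, Function<L, M> f) {
        Map<M, Double> merged = new HashMap<>();
        for (Map.Entry<L, Double> e : scores.entrySet()) {
            merged.merge(f.apply(e.getKey()), e.getValue(), Double::max);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Double> scores = new HashMap<>();
        scores.put("sci.space", -1.0);
        scores.put("sci.med", -3.0);
        scores.put("rec.autos", -2.0);

        // Collapse fine-grained newsgroups to their top-level category.
        Map<String, Double> coarse =
            mapLabels(scores, label -> label.substring(0, label.indexOf('.')));
        System.out.println(coarse);
        System.out.println(argmax(coarse));
    }
}
```

Collapsing labels this way is useful when a classifier was trained on fine-grained classes but only the coarse category is needed at prediction time.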

7
ReneeBK posted on 2016-4-21 10:24:51
package nak.classify
/*
Copyright 2010 David Hall, Daniel Ramage
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/



import nak.serialization.DataSerialization.ReadWritable
import nak.serialization.DataSerialization
import breeze.linalg._
import breeze.linalg.operators._
import breeze.math.{MutableTensorField, VectorField}

/**
 * A LinearClassifier is a multi-class classifier with decision
 * function:
 * <code>
 * \hat y_i = \arg\max_y w_y^T x_i + b_y
 * </code>
 *
 * @author dlwh
 *
 */
@SerialVersionUID(1L)
class LinearClassifier[L,TW, TL, TF]
    (val featureWeights: TW, val intercepts: TL)
    (implicit viewT2 : TW<:<NumericOps[TW], vspace: MutableTensorField[TL, L, Double],
     mulTensors : OpMulMatrix.Impl2[TW, TF, TL]) extends Classifier[L,TF] with Serializable {
  import vspace._
  def scores(o: TF) = {
    // One score per label: weights times features, plus the per-label intercept.
    val r = (featureWeights * o) + intercepts
    val ctr = Counter[L, Double]()
    for((l, v) <- r.iterator) {
      ctr(l) = v
    }
    ctr
  }
}

object LinearClassifier {
  implicit def linearClassifierReadWritable[L, T2, TL, TF](implicit viewT2 : T2<:<NumericOps[T2], vspace: MutableTensorField[TL, L, Double],
                                                           mulTensors : OpMulMatrix.Impl2[T2, TF, TL],
                                                        view: TL <:< Tensor[L, Double],
                                                        tfW: DataSerialization.ReadWritable[T2],
                                                        tlW: DataSerialization.ReadWritable[TL]) = {
    new ReadWritable[LinearClassifier[L,T2,TL,TF]] {
      // Weights are written first and intercepts second; read uses the same order.
      def write(sink: DataSerialization.Output, what: LinearClassifier[L,T2,TL,TF]) = {
        tfW.write(sink,what.featureWeights)
        tlW.write(sink,what.intercepts)
      }

      def read(source: DataSerialization.Input) = {
        val t2 = tfW.read(source)
        val tl = tlW.read(source)
        new LinearClassifier(t2,tl)
      }
    }
  }
}
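The decision function in the doc comment above, the argmax over y of w_y^T x + b_y, can be illustrated with plain arrays. The following is a minimal Java sketch under simplifying assumptions: dense weights stored as one row per label, labels identified by their row index, and the hypothetical names `LinearDecisionSketch`, `scores`, and `predict`. It does not use Breeze and is not how the Scala class is implemented.

```java
/** Hypothetical dense-array illustration of a multi-class linear decision function. */
final class LinearDecisionSketch {

    /** Computes scores[y] = weights[y] . x + intercepts[y] for every label y. */
    static double[] scores(double[][] weights, double[] intercepts, double[] x) {
        if (weights.length == 0 || weights.length != intercepts.length) {
            throw new IllegalArgumentException("need one weight row and one intercept per label");
        }
        double[] out = new double[weights.length];
        for (int y = 0; y < weights.length; y++) {
            if (weights[y].length != x.length) {
                throw new IllegalArgumentException("weight row " + y + " does not match feature length");
            }
            double sum = intercepts[y];
            for (int j = 0; j < x.length; j++) {
                sum += weights[y][j] * x[j];
            }
            out[y] = sum;
        }
        return out;
    }

    /** Returns the index of the highest-scoring label; ties go to the lowest index. */
    static int predict(double[][] weights, double[] intercepts, double[] x) {
        double[] s = scores(weights, intercepts, x);
        int best = 0;
        for (int y = 1; y < s.length; y++) {
            if (s[y] > s[best]) {
                best = y;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] weights = { {1.0, 0.0}, {0.0, 1.0}, {-1.0, -1.0} };
        double[] intercepts = {0.0, 0.5, 0.0};
        System.out.println(predict(weights, intercepts, new double[] {2.0, 1.0}));
        System.out.println(predict(weights, intercepts, new double[] {1.0, 1.0}));
    }
}
```

In the second call the intercept decides the outcome: labels 0 and 1 have the same dot product with the input, and the 0.5 intercept tips the result to label 1.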

8
redlamp posted on 2016-4-22 17:04:46
great! too good.
