- Deep Learning NLP Pipeline implemented on Tensorflow. Following the 'simplicity' rule, this project aims to use the Tensorflow deep learning library to implement a new NLP pipeline. You can extend the project to train models on your own corpus or language. Pretrained models for a Chinese corpus are distributed. A free RESTful NLP API is also provided. Visit http://www.deepnlp.org/api/v1.0/pipeline for details.
- Brief Introduction
- Modules
- Installation
- Tutorial
- Segmentation
- POS
- NER
- Pipeline
- Textsum
- Textrank
- Textcnn
- Train your model
- Web API Service
- Introduction (Chinese)
- Installation Instructions (Chinese)
- Reference
- Modules
- NLP Pipeline Modules:
- Word Segmentation/Tokenization
- Part-of-speech (POS)
- Named-entity recognition (NER)
- textsum: automatic summarization with Seq2Seq attention models
- textrank: extract the most important sentences
- textcnn: document classification
- Web API: free Tensorflow-powered web API
- Planned: Parsing, Automatic Summarization
- Algorithms (closely following the state of the art)
- Word Segmentation: linear-chain CRF (conditional random field), based on the Python CRF++ module
- POS: LSTM/BI-LSTM network, based on Tensorflow
- NER: LSTM/BI-LSTM/LSTM-CRF network, based on Tensorflow
- Textsum: Seq2Seq with attention mechanism
- Textcnn: CNN
- Pre-trained Model
- Chinese: Segmentation, POS, NER (1998 China Daily corpus)
- English: POS (Brown corpus)
- For other languages, you can use the provided training scripts to train a model on a corpus of your choice.
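The textrank module listed above is described as extracting the most important sentences. To make that idea concrete, here is a minimal, self-contained sketch of generic TextRank-style sentence ranking. It is not deepnlp's implementation: the function names (`sentence_similarity`, `textrank`, `top_sentences`), the regex tokenizer, and the default damping factor are all assumptions chosen for illustration. The regex tokenizer only suits space-delimited languages; Chinese text would need word segmentation first.

```python
import math
import re


def sentence_similarity(words_a, words_b):
    """Word-overlap similarity, normalized by the log of each sentence's vocabulary size."""
    set_a, set_b = set(words_a), set(words_b)
    if len(set_a) < 2 or len(set_b) < 2:
        # log(1) == 0 would make the denominator zero, so treat tiny sentences as unrelated.
        return 0.0
    return len(set_a & set_b) / (math.log(len(set_a)) + math.log(len(set_b)))


def textrank(sentences, damping=0.85, iterations=50):
    """Return one score per sentence using PageRank-style iteration on a similarity graph."""
    tokens = [re.findall(r"\w+", s.lower()) for s in sentences]
    n = len(sentences)

    # Symmetric weight matrix: weights[i][j] is the similarity of sentences i and j.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = sentence_similarity(tokens[i], tokens[j])
            weights[i][j] = weights[j][i] = w

    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbor j passes on a share of its score proportional to the edge weight.
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) + damping * rank)
        scores = new_scores
    return scores


def top_sentences(sentences, k=1):
    """Return the k highest-scoring sentences, kept in their original order."""
    scores = textrank(sentences)
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(best)]
```

A sentence that shares no words with any other receives only the baseline score of `1 - damping`, so it ranks last; sentences that overlap with many others accumulate score from their neighbors.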
- Installation
- Requirements
- CRF++ (>=0.54)
- Tensorflow (1.0): this project is up to date with the latest Tensorflow release. For Tensorflow <=0.12.0, use deepnlp <=0.1.5. See RELEASE.md for more details.
- Pip
- # Linux: run the following command:
- pip install deepnlp
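The version rule in the requirements (deepnlp <=0.1.5 for Tensorflow <=0.12.0, the latest release otherwise) can be written as a small helper that picks the pip requirement string. This is only an illustrative sketch: `parse_version` and `deepnlp_requirement` are hypothetical names, not part of deepnlp, and versions between 0.12.0 and 1.0, which the README does not cover, are assumed here to take the latest release.

```python
def parse_version(text):
    """Turn '0.12.0' into (0, 12, 0); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def deepnlp_requirement(tf_version):
    """Return the pip requirement string matching the installed Tensorflow version."""
    if parse_version(tf_version) <= (0, 12, 0):
        # Older Tensorflow releases need the older deepnlp line.
        return "deepnlp<=0.1.5"
    # Assumption: anything newer than 0.12.0 uses the latest deepnlp release.
    return "deepnlp"
```

For example, with Tensorflow 0.12.0 installed, the resulting command would be `pip install "deepnlp<=0.1.5"`; the quotes keep the shell from interpreting `<` as a redirect.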
https://github.com/rockingdingo/deepnlp#segmentation