Wednesday, December 4, 2013

Notes

Multilingual Issues Part 1: Word Segmentation


Indexing Chinese in Solr http://java.dzone.com/articles/indexing-chinese-solr

A Lucene-based enterprise search engine http://www.open-open.com/doc/view/19cde45ffaa04ac687cb739d19fa59c4

Solr/Lucene development experience http://www.open-open.com/doc/view/e83ae23d5347455c97c6f22701cdfffa

A brief analysis of Baidu's Chinese word segmentation technology http://www.open-open.com/doc/view/5cbb96e301a7407a9ff4fe6b670b1a3d

Applications of the Lucene full-text search engine http://www.open-open.com/doc/view/9d653f49c95149c49961c201436359e4


The Pitfalls and Complexities of Chinese to Chinese Conversion (on converting between Simplified and Traditional Chinese characters) http://www.kanji.org/cjk/c2c/c2cbasis.htm

CC-CEDICT http://cc-cedict.org/wiki/start

CharFilter - normalize characters before tokenizer https://issues.apache.org/jira/browse/SOLR-822

CKIP Traditional Chinese word segmentation http://ckipsvr.iis.sinica.edu.tw/

An introduction to MMSEG and some thoughts on classification-based Chinese word segmentation algorithms

A Finite-State Automata Based IME Model Design for Intelligent Mobile http://www.cipsc.org.cn/jsip/detail_content.php?&xuhao=2379

Chinese Information Processing Society of China http://www.cipsc.org.cn/

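吳毓傑 (first place, NTCIR-2011 RITE, the 2011 international Traditional Chinese textual entailment competition; first place, SIGHAN-2010 Chinese word segmentation bake-off, the 2010 international Traditional Chinese word segmentation competition) http://140.115.112.118/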

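Ansj Chinese word segmentation https://github.com/ansjsun/ansj_seg
A Java implementation of ictclas. Nearly all of the data structures and algorithms were rewritten. The dictionary is the one provided by the open-source version of ictclas, with some manual optimization. In-memory segmentation runs at about 1 million characters per second (already faster than ictclas); segmentation while reading from files runs at about 300,000 characters per second. Accuracy reaches over 96%. Currently implemented: Chinese word segmentation, Chinese personal name recognition, and user-defined dictionaries. It can be applied to natural language processing and similar areas, and suits projects that demand high segmentation quality.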

http://stackoverflow.com/questions/4431135/searching-for-numbers-product-codes-in-solr

http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/solr/core/src/test-files/solr/conf/solrconfig-phrasesuggest.xml

How to execute an SSIS package from .NET?
http://stackoverflow.com/questions/273751/how-to-execute-an-ssis-package-from-net

Running SSIS package programmatically
http://blogs.msdn.com/b/michen/archive/2007/03/22/running-ssis-package-programmatically.aspx

How to write Visual C# code to create and automate a Microsoft Excel instance
http://code.msdn.microsoft.com/office/CSAutomateExcel-7f89a439
http://blogs.msdn.com/b/eric_carter/archive/2009/03/12/attaching-to-an-already-running-office-application-from-your-application-using-getactiveobject-or-bindtomoniker.aspx

How to call an Excel macro programmatically in C#
http://social.msdn.microsoft.com/Forums/en-US/exceldev/thread/2e33b8e5-c9fd-42a1-8d67-3d61d2cedc1c

Import Data From Excel using SSIS - Part 1
http://www.mssqltips.com/sqlservertip/2770/importing-data-from-excel-using-ssis--part-1/


Pavlova: a meringue dessert with whipped cream

http://blog.roodo.com/spoon/archives/998316.html

A brief look at chocolate ingredients and tempering
http://damontainan.pixnet.net/blog/post/26050103-%E6%B7%BA%E8%AB%87-%22-%E5%B7%A7%E5%85%8B%E5%8A%9B%E5%8E%9F%E6%96%99%E8%88%87%E8%AA%BF%E6%BA%AB-%22

Play music in Air
4 AirPlay Receivers That Are Cheaper Than Apple TV
http://www.makeuseof.com/tag/4-airplay-receivers-that-are-cheaper-than-apple-tv/

How to build your own AirPlay audio system
http://www.macworld.com/article/1160417/how_to_build_your_own_airplay_audio_system.html