<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-5174464120141530461</id><updated>2012-02-16T01:17:12.124-08:00</updated><category term='sentimental analysis'/><category term='Percentile'/><category term='boosting'/><category term='machine learning'/><category term='information retrieval'/><category term='Quantile'/><category term='perceptron'/><category term='natural language processing'/><title type='text'>Text Intelligence and Knowledge Discovery</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://nlpdigger.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5174464120141530461/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://nlpdigger.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>nlp-digger</name><uri>http://www.blogger.com/profile/14270103207768750759</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://2.bp.blogspot.com/_MSowlbxkoHM/Scu8QUX7A4I/AAAAAAAAA0Y/8I5qya5OFL0/S220/ffliu.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>5</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-5174464120141530461.post-7295107600637396719</id><published>2009-12-31T11:08:00.000-08:00</published><updated>2009-12-31T11:22:19.521-08:00</updated><title type='text'>Solution for 
using UTF8 format bibtex from Zotero with Latex</title><content type='html'>Zotero exports BibTeX files in UTF-8 encoding, which causes problems when the file is used directly with LaTeX. I searched online for quite a while without finding a satisfactory solution. Converting the exported file from UTF-8 to ISO-8859-1 leaves some unrecognized characters, which display incorrectly in the generated PDF file. &lt;br /&gt;Finally, I arrived at this solution.&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(51, 102, 255); font-weight: bold;"&gt;1. Locate the Zotero data directory &lt;/span&gt;&lt;p&gt; By default, Zotero data is stored within your Firefox profile in these &lt;acronym title="Operating System"&gt;OS&lt;/acronym&gt;-dependent directories.  &lt;/p&gt;  &lt;p&gt; On a Mac:&lt;br /&gt; /Users/&amp;lt;username&amp;gt;/Library/Application Support/Firefox/Profiles/&amp;lt;randomstring&amp;gt;/zotero &lt;/p&gt;  &lt;p&gt; On Windows 2000/XP:&lt;br /&gt; C:\Documents and Settings\&amp;lt;username&amp;gt;\Application Data\Mozilla\Firefox\Profiles\&amp;lt;randomstring&amp;gt;\zotero &lt;/p&gt;  &lt;p&gt; On Windows Vista:&lt;br /&gt; C:\Users\&amp;lt;user&amp;gt;\AppData\Roaming\Mozilla\Firefox\Profiles\&amp;lt;randomstring&amp;gt;\zotero &lt;/p&gt;  &lt;p&gt; On most Linux distributions:&lt;br /&gt;  ~/.mozilla/firefox/&amp;lt;randomstring&amp;gt;/zotero&lt;br /&gt;&lt;/p&gt;&lt;p style="color: rgb(51, 102, 255); font-weight: bold;"&gt;2. Locate the translator file BibTex.js in the $zoterodir/translators directory&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="color: rgb(51, 102, 255); font-weight: bold;"&gt;3. 
Open that file and change the following line &lt;/span&gt;&lt;br /&gt;&lt;/p&gt;Zotero.addOption("exportCharset", "UTF-8");&lt;br /&gt;&lt;br /&gt;to&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;Zotero.addOption("exportCharset", "ISO-8859-1");&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Now export the bib file again from Zotero; the file is written in ISO-8859-1 encoding and is ready for use with LaTeX.</content><link rel='replies' type='application/atom+xml' href='http://nlpdigger.blogspot.com/feeds/7295107600637396719/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://nlpdigger.blogspot.com/2009/12/solution-for-using-utf8-format-bibtex.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5174464120141530461/posts/default/7295107600637396719'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5174464120141530461/posts/default/7295107600637396719'/><link rel='alternate' type='text/html' href='http://nlpdigger.blogspot.com/2009/12/solution-for-using-utf8-format-bibtex.html' title='Solution for using UTF8 format bibtex from Zotero with Latex'/><author><name>nlp-digger</name><uri>http://www.blogger.com/profile/14270103207768750759</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://2.bp.blogspot.com/_MSowlbxkoHM/Scu8QUX7A4I/AAAAAAAAA0Y/8I5qya5OFL0/S220/ffliu.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5174464120141530461.post-5341640071426186928</id><published>2009-06-15T10:50:00.000-07:00</published><updated>2009-06-15T10:58:08.106-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' 
term='Percentile'/><category scheme='http://www.blogger.com/atom/ns#' term='Quantile'/><title type='text'>Calculating Percentile</title><content type='html'>Procedure for calculating the k% percentile:&lt;br /&gt;&lt;br /&gt;Assume that we have an array M of n numbers.&lt;br /&gt;(1) Sort M in increasing order (so that M_1 is the smallest value) and calculate (n-1)*k%; its integer part is i and its fractional part is j.&lt;br /&gt;(2) Result=(1-j)*M_(i+1) + j*M_(i+2).&lt;br /&gt;Special cases:&lt;br /&gt;if j=0, the result is M_(i+1); if M_(i+1)=M_(i+2), either of them is the result.&lt;br /&gt;&lt;br /&gt;Quartiles can be calculated this way:&lt;br /&gt;1st Quartile k%=25%&lt;br /&gt;2nd Quartile k%=50%&lt;br /&gt;3rd Quartile k%=75%</content><link rel='replies' type='application/atom+xml' href='http://nlpdigger.blogspot.com/feeds/5341640071426186928/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://nlpdigger.blogspot.com/2009/06/calculating-percentile.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5174464120141530461/posts/default/5341640071426186928'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5174464120141530461/posts/default/5341640071426186928'/><link rel='alternate' type='text/html' href='http://nlpdigger.blogspot.com/2009/06/calculating-percentile.html' title='Calculating Percentile'/><author><name>nlp-digger</name><uri>http://www.blogger.com/profile/14270103207768750759</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' 
src='http://2.bp.blogspot.com/_MSowlbxkoHM/Scu8QUX7A4I/AAAAAAAAA0Y/8I5qya5OFL0/S220/ffliu.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5174464120141530461.post-83215923490505445</id><published>2009-04-02T15:21:00.000-07:00</published><updated>2009-04-02T16:06:05.652-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='perceptron'/><category scheme='http://www.blogger.com/atom/ns#' term='boosting'/><category scheme='http://www.blogger.com/atom/ns#' term='machine learning'/><title type='text'>Boosting and Perceptron Learning for Ranking</title><content type='html'>&lt;span style="font-weight: bold;"&gt;Boosting&lt;/span&gt; can be considered a greedy algorithm for finding the parameter vector &lt;span style="font-weight: bold;"&gt;w&lt;/span&gt; that minimizes the loss function. &lt;span style="font-weight: bold;"&gt;w&lt;/span&gt; is initially set to (w&lt;span style="font-size:85%;"&gt;0&lt;/span&gt;,0,...,0), where w&lt;span style="font-size:85%;"&gt;0&lt;/span&gt; is set based on the base model. When applied to ranking problems, the loss function is an upper bound on the number of "ranking errors", a ranking error being a case where an incorrect candidate gets a higher value than a correct candidate. Each iteration goes through all the possible features, greedily chooses a single feature, and updates its weight. The number of rounds is decided by cross-validation. The final output is the final parameter setting &lt;span style="font-weight: bold;"&gt;w&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;The &lt;span style="font-weight: bold;"&gt;perceptron algorithm&lt;/span&gt; for ranking makes a pass over the training set instead of over the features, storing a parameter vector w(i), i=1,2,...,n, after each training example. The parameter vector is initially set to all zeros and is modified only when a mistake is made on an example. 
The update is the difference between the feature representations of the two offending candidates: the first-ranked (correct) candidate for the example and the candidate for the same example that scores highest under the current w. Generally, w(n) is taken as the final parameter vector for ranking a new test example. However, training constructs n parameter settings, each of which has its own highest-ranking candidate. Letting each of these settings "vote" for a candidate is called the voted perceptron.&lt;br /&gt;&lt;br /&gt;The ranking function in both cases is F(x,&lt;span style="font-weight: bold;"&gt;w&lt;/span&gt;)=&lt;span style="font-weight: bold;"&gt;w&lt;/span&gt;*&lt;span style="font-weight: bold;"&gt;h&lt;/span&gt;(x), where &lt;span style="font-weight: bold;"&gt;h&lt;/span&gt;(x) is the feature vector representing x.</content><link rel='replies' type='application/atom+xml' href='http://nlpdigger.blogspot.com/feeds/83215923490505445/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://nlpdigger.blogspot.com/2009/04/boosting-and-perceptron-learning-for.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5174464120141530461/posts/default/83215923490505445'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5174464120141530461/posts/default/83215923490505445'/><link rel='alternate' type='text/html' href='http://nlpdigger.blogspot.com/2009/04/boosting-and-perceptron-learning-for.html' title='Boosting and Perceptron Learning for Ranking'/><author><name>nlp-digger</name><uri>http://www.blogger.com/profile/14270103207768750759</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' 
width='28' height='32' src='http://2.bp.blogspot.com/_MSowlbxkoHM/Scu8QUX7A4I/AAAAAAAAA0Y/8I5qya5OFL0/S220/ffliu.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5174464120141530461.post-6168080551825743914</id><published>2009-03-27T10:31:00.000-07:00</published><updated>2009-04-02T15:20:00.962-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='information retrieval'/><category scheme='http://www.blogger.com/atom/ns#' term='natural language processing'/><title type='text'>Search Engines Beyond Google</title><content type='html'>&lt;p&gt;Search engines based on natural language processing are increasingly in demand. They aim to meet different information needs more accurately and to support automatic understanding and digestion of content.&lt;/p&gt; &lt;p&gt;&lt;a href="http://www.powerset.com/"&gt;Powerset&lt;/a&gt; is first applying its natural language processing technology to search, aiming to improve the way we find information by unlocking the meaning encoded in ordinary human language. Powerset's first product is a search and discovery experience for Wikipedia, launched in May 2008. Powerset's technology improves the entire search process. In the search box, you can express yourself in keywords, phrases, or simple questions. On the search results page, Powerset gives more accurate results, often answering questions directly, and aggregates information from across multiple articles. 
Finally, Powerset's technology follows you into enhanced Wikipedia articles, giving you a better way to quickly digest and navigate content.&lt;/p&gt; &lt;p&gt;The following article discusses five search engines that may be good choices when we want specific answers.&lt;/p&gt;&lt;br /&gt;&lt;h3 class="post-title entry-title"&gt;&lt;a href="http://top10listblog.blogspot.com/2009/03/top-5-non-google-search-engines.html"&gt;Top 5 Non-Google Search Engines&lt;/a&gt;&lt;/h3&gt;   &lt;p class="MsoNormal"&gt;Many times you can't just live your life using only one search engine. There are a lot of good options out there that are just as good as Google, maybe even better. Google is great for general information, but when you want more concrete and reliable answers, it may be best to look elsewhere. Here are the top 5 non-Google search engines (sans Yahoo and MSN):&lt;br /&gt;&lt;br /&gt;1. &lt;a href="http://www.sweetsearch.com/"&gt;Sweet Search&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Sweet Search is a new engine provided by FindingDulcinea.com, an encyclopedic guide site. You'll find that it's a lot more selective than your average search engine: for a search that might return millions of results on Google, you'll get 500 from Sweet Search. This isn't necessarily a bad thing. All the results from Sweet Search are reliable: they are hand-selected by findingDulcinea's staff to ensure that they are all high-quality websites. You won't be finding any R. Kelly fansites from Kim in Wisconsin. It also doesn't hurt that Sweet Search prefaces its search results with a handful of relevant guide selections from FindD.&lt;br /&gt;&lt;br /&gt;2. &lt;a href="http://www.kosmix.com/"&gt;Kosmix&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The great thing about this page is that it covers all the bases: for anything you search, you get web results, audio, video, tweets, shopping, images, conversations taking place on sites like Yahoo! 
Answers and Answerbag, and related searches, all on the same page. Needless to say, it's extremely comprehensive, and if you're searching on a more general level you'll get more than enough information.&lt;br /&gt;&lt;br /&gt;3. &lt;a href="http://www.ask.com/?o=0&amp;amp;l=dir"&gt;Ask.com&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Ask.com's greatest feature isn't its web search function but its answer search. It has questions and answers catalogued from all kinds of sites like Yahoo! Answers, Ehow, Askville, Answerbag, and Wiki Answers. It is not limited to those sites, however: it includes any site where the relevant question, or a question close to it, has been asked or discussed.&lt;br /&gt;&lt;br /&gt;4. &lt;a href="http://silkwise.com/content/index"&gt;Silkwise&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Whereas Ask.com is an answer search, Silkwise is more of a comprehensive question database. You ask your question and in time it is answered by at least one expert. Every answer you get is nigh guaranteed to be comprehensive and highly detailed.&lt;br /&gt;&lt;br /&gt;5. &lt;a href="http://www.chacha.com/"&gt;ChaCha&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;ChaCha is a question and answer site with a twist: you text your questions to them and the answers, written by real people, are texted back to you on the fly. Unlike sites such as Ask.com or Silkwise, which have only specific questions answered, you can ask literally anything at ChaCha. Of course, you are not going to be asking deep philosophical questions or for a how-to on assembling a car, but for a quick fact check or just a short answer it's great. 
It also doesn't hurt that you can look in the online database for anything that might already have been asked.&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://nlpdigger.blogspot.com/feeds/6168080551825743914/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://nlpdigger.blogspot.com/2009/03/search-engines-beyond-google.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5174464120141530461/posts/default/6168080551825743914'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5174464120141530461/posts/default/6168080551825743914'/><link rel='alternate' type='text/html' href='http://nlpdigger.blogspot.com/2009/03/search-engines-beyond-google.html' title='Search Engines Beyond Google'/><author><name>nlp-digger</name><uri>http://www.blogger.com/profile/14270103207768750759</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' src='http://2.bp.blogspot.com/_MSowlbxkoHM/Scu8QUX7A4I/AAAAAAAAA0Y/8I5qya5OFL0/S220/ffliu.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5174464120141530461.post-7913606364713603618</id><published>2009-03-26T10:39:00.000-07:00</published><updated>2009-03-26T10:59:32.164-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='sentimental analysis'/><title type='text'>Different Levels on Sentimental Analysis</title><content type='html'>Sentiment analysis is an increasingly active topic in natural language processing. It deals not only with review data but also with social media data from the web, such as newsgroups and blogs. 
An effective analysis requires more than a simple bag-of-words approach. Many studies have examined different linguistic granularities using different levels of linguistic analysis.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; color: rgb(204, 102, 0);"&gt;Granularity layers&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Word&lt;/li&gt;&lt;li&gt;Phrase&lt;/li&gt;&lt;li&gt;Pattern&lt;/li&gt;&lt;li&gt;Contextual expressions (subsentential)&lt;/li&gt;&lt;li&gt;Sentence&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-weight: bold; color: rgb(204, 102, 0);"&gt;Linguistic Analysis Levels&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Lexical&lt;/li&gt;&lt;li&gt;Syntactic (parsing)&lt;/li&gt;&lt;li&gt;Semantic&lt;/li&gt;&lt;li&gt;Discourse&lt;/li&gt;&lt;/ul&gt;</content><link rel='replies' type='application/atom+xml' href='http://nlpdigger.blogspot.com/feeds/7913606364713603618/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://nlpdigger.blogspot.com/2009/03/different-levels-on-sentimental.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5174464120141530461/posts/default/7913606364713603618'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5174464120141530461/posts/default/7913606364713603618'/><link rel='alternate' type='text/html' href='http://nlpdigger.blogspot.com/2009/03/different-levels-on-sentimental.html' title='Different Levels on Sentimental Analysis'/><author><name>nlp-digger</name><uri>http://www.blogger.com/profile/14270103207768750759</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='28' height='32' 
src='http://2.bp.blogspot.com/_MSowlbxkoHM/Scu8QUX7A4I/AAAAAAAAA0Y/8I5qya5OFL0/S220/ffliu.jpg'/></author><thr:total>1</thr:total></entry></feed>
