Difference between revisions of "BA/app"

From Fiamma
Line 7: Line 7:
  
 
Linguistics:  
 
Linguistics:  
[https://is.muni.cz/th/180075/ff_b/Thesis_2nd_draft.txt | Dissertation partly about interferences]. Has a nice error classification, error taxonomy, borrowing, transfer etc etc. Seems like a nice intro to "What exists"
+
[https://is.muni.cz/th/180075/ff_b/Thesis_2nd_draft.txt Dissertation partly about interferences]. Has a nice error classification, error taxonomy, borrowing, transfer etc etc. Seems like a nice intro to "What exists"
  
 
== CL/ML resources ==
 
== CL/ML resources ==
Line 22: Line 22:
  
 
[http://www.aclweb.org/anthology/O13-1022 error detection using local word bigram and trigram] + some others
 
[http://www.aclweb.org/anthology/O13-1022 error detection using local word bigram and trigram] + some others
 +
[http://www.mitpressjournals.org/doi/pdf/10.1162/COLI_a_00072 Automatic error analysis of machine translation output] -- more about possible errors and ways to classify them
  
 
=== Somewhat similar problems being solved ===
 
=== Somewhat similar problems being solved ===
Line 29: Line 30:
 
* [http://web.eecs.umich.edu/~mihalcea/papers/mihalcea.acl09.pdf] -- lie detector
 
* [http://web.eecs.umich.edu/~mihalcea/papers/mihalcea.acl09.pdf] -- lie detector
 
* [http://delivery.acm.org/10.1145/2390000/2388617/p1-hauch.pdf?ip=149.205.109.95&id=2388617&acc=OPEN&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E6D218144511F3437&CFID=1008304166&CFTOKEN=69973089&__acm__=1511275273_f72fd72f6e2433e82566681fc1a564cb Linguistic Cues to Deception Assessed by Computer Programs: A Meta-Analysis] -- also ideas of possible features that might be interesting to look into.
 
* [http://delivery.acm.org/10.1145/2390000/2388617/p1-hauch.pdf?ip=149.205.109.95&id=2388617&acc=OPEN&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E6D218144511F3437&CFID=1008304166&CFTOKEN=69973089&__acm__=1511275273_f72fd72f6e2433e82566681fc1a564cb Linguistic Cues to Deception Assessed by Computer Programs: A Meta-Analysis] -- also ideas of possible features that might be interesting to look into.
 +
 +
== Linguistics  ==
 +
=== Typical errors ===
 +
==== Russian ====
 +
*[http://www.simonf.com/lang/mistakes_russian_win.html Similar-sounding and semantically non-identical words + idioms]
 +
* ''[http://www.study.ru/support/lib/note281.html Grammar]. Articles, connecting verbs, future tenses, negative sentences, commas etc etc -- really nice.''
 +
==== German ====
 +
* [https://www.englishwithnick.de/resources-for-germans/typical-grammar-mistakes-made-by-germans/ list of sentences]
 +
* [http://londonschool.de/top-english-mistakes-made-german-learners-volume-1/ also examples, hard to generalize]
 +
* [https://englishwithkirsty.com/2014/07/15/10-typical-mistakes-made-by-german-speakers-who-are-learning-english/ examples, a bit better ones?]
 +
* [http://www.jabbalab.com/blog/966/how-and-when-to-use-german-reflexive-verbs German reflexive verbs list] which could be used to see differences between English and German reflexive verbs.
 +
 +
==== Indian ====
 +
* [https://en.wikipedia.org/wiki/Indian_English#Morphology_and_syntax Wikipedia - Indian English] I think this could be done just statistically?
 +
 +
==== Italian ====
 +
???
  
 
== Random ==
 
== Random ==
 
[https://www.safaribooksonline.com/library/view/natural-language-annotation/9781449332693/ Natural Language Annotation for Machine Learning] ebook, seems to cover quite a lot
 
[https://www.safaribooksonline.com/library/view/natural-language-annotation/9781449332693/ Natural Language Annotation for Machine Learning] ebook, seems to cover quite a lot
  
[http://lit.eecs.umich.edu/downloads.html#Cross-Cultural%20Deception downloads and demos -- incl datasets for CL lying detection -- generally interesting
+
[http://lit.eecs.umich.edu/downloads.html#Cross-Cultural%20Deception downloads and demos -- datasets for CL lying detection] -- generally interesting
 +
 +
[https://www.uclassify.com/browse/uclassify/ Classification-as-a-service with free examples]. Gender, MBTI, etc etc etc, pretty nice

Revision as of 14:16, 26 November 2017

Primary sources

Computational linguistics: CL intro

Genetic Algorithms: An introduction to genetic algorithms

Linguistics: Dissertation partly about interferences. Has a nice error classification, error taxonomy, borrowing, transfer etc etc. Seems like a nice intro to "What exists"

CL/ML resources

Text classification

Natural language classification with Python: Book, especially learning to classify text

With machine learning:

Error Detection

error detection using local word bigram and trigram + some others

Automatic error analysis of machine translation output -- more about possible errors and ways to classify them
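To make the bigram/trigram idea concrete, here is a minimal sketch: count n-grams in a reference corpus of correct sentences, then flag any n-gram in a learner sentence that never occurs there. This is only an illustration, not necessarily the method of the linked paper; the function names (`train_counts`, `unseen_ngrams`) and the three-sentence toy corpus are made up for this note, and a real system would need a large corpus plus smoothing or a frequency threshold.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def pad(tokens):
    """Add sentence-boundary markers so edge words also form n-grams."""
    return ["<s>"] + list(tokens) + ["</s>"]


def train_counts(sentences, n_values=(2, 3)):
    """Count bigrams and trigrams over tokenised reference sentences."""
    counts = Counter()
    for tokens in sentences:
        padded = pad(tokens)
        for n in n_values:
            counts.update(ngrams(padded, n))
    return counts


def unseen_ngrams(tokens, counts, n):
    """Return the n-grams of a sentence that never occur in the reference counts."""
    return [gram for gram in ngrams(pad(tokens), n) if counts[gram] == 0]


# Toy reference corpus (hypothetical, far too small for real use).
reference = [
    "i have lived here for two years".split(),
    "she has lived here since may".split(),
    "i live here".split(),
]
counts = train_counts(reference)

# A typical German-speaker error: "since" instead of "for".
learner_sentence = "i live here since two years".split()
suspicious_bigrams = unseen_ngrams(learner_sentence, counts, 2)
suspicious_trigrams = unseen_ngrams(learner_sentence, counts, 3)
print(suspicious_bigrams)
print(suspicious_trigrams)
```

With this toy corpus the only unseen bigram is the one spanning "since two", which is where the error sits; the unseen trigrams cluster around the same spot. On real data most unseen n-grams are just rare but correct, so the counts would be a feature for a classifier rather than a verdict.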

Somewhat similar problems being solved

Cross-cultural Deception Detection. It uses unigrams + LIWC (which is more psychological and less relevant)
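As a rough idea of what "unigram features" means in practice, below is a minimal multinomial Naive Bayes classifier over word counts with add-one smoothing. It is a sketch under assumptions: the linked paper's actual classifier and features may differ, LIWC is not modelled at all, and the four training sentences and both labels are invented placeholders, not real data.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """docs: list of (tokens, label). Returns (per-label word counts, label counts, vocabulary)."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict_nb(tokens, model):
    """Return the label with the highest log-probability under add-one smoothing."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior for the class.
        score = math.log(label_counts[label] / total_docs)
        # Smoothed denominator: class token count plus vocabulary size.
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy examples, only to show the data shape.
train = [
    ("i never ever did that honestly".split(), "deceptive"),
    ("honestly i would never lie".split(), "deceptive"),
    ("we went to the park on sunday".split(), "truthful"),
    ("the park was closed on sunday".split(), "truthful"),
]
model = train_nb(train)
print(predict_nb("the park on sunday".split(), model))
print(predict_nb("honestly i never lie".split(), model))
```

The same skeleton would work for the native-language question here: swap the labels for L1 backgrounds and the unigrams for whatever error features turn out to be informative.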

Linguistics

Typical errors

Russian

German

Indian

Italian

???

Random

Natural Language Annotation for Machine Learning ebook, seems to cover quite a lot

downloads and demos -- datasets for CL lying detection -- generally interesting

Classification-as-a-service with free examples. Gender, MBTI, etc etc etc, pretty nice