Created: Fri 2017-12-01
sat, dec 30
- still reading that paper, which I started yesterday. It's packed with dry statements and technical terms, few of which are explained. But the authors did illustrate how some of the methods work with examples and tables. It's taking longer than I expected.
- The “reward cycle” of studying is actually short: if you invest time and energy, you get the reward next month, maybe even next week. [credit to Humar] It keeps you going.
- I'm thinking about this because my last assignment, which I handed in on Dec 18, got a partial “very good”. (Overall it's a fail, though, because I wasn't able to finish it.) The good news is I won't need to redo all four assignments now, only three and a half.
fri, dec 29
- was reading a research paper when a line struck me: “the ability to recognize previously unknown entities is an essential part of ... systems.”
- this is the whole point of letting computers do the things that would overwhelm us humans. I need to correct my “labor-intensive” mindset in handling information
- reading the survey paper is goddamn difficult. Many people have proposed rather smart ideas to let computers better understand certain word patterns.
- then my thoughts drifted: the concept “PageRank” that the Google founders invented, and that Google has used to sort search results, is going to become obsolete. Because when AI becomes sufficiently smart, the search engine may well just return the most relevant and correct results to you. The weight of each webpage, calculated from inbound hyperlinks, is essentially an indicator of authority as voted on by humans, i.e. website creators. (I'm lost mid-sentence. Forgot what I was going to say.)
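- To make the PageRank idea concrete, a toy power-iteration sketch (my own, written from memory; the link graph and damping factor are made up, and this is not Google's actual implementation):

```python
# Toy PageRank: a page's weight is the chance a random surfer lands on it,
# so inbound links from heavy pages count for more.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}  # page -> pages it links to
pages = list(links)
damping, n = 0.85, len(pages)
rank = {p: 1.0 / n for p in pages}

for _ in range(50):  # iterate until the ranks stabilize
    new_rank = {}
    for p in pages:
        # sum the contributions of every page q that links to p
        inbound = sum(rank[q] / len(links[q]) for q in pages if p in links[q])
        new_rank[p] = (1 - damping) / n + damping * inbound
    rank = new_rank

print(rank)  # "c" ends up heaviest: it collects the most inbound weight
```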
thu, dec 28
- Read section 22.1 of the textbook, which is about NER.
- (re-)learned the definition of limit.
- the key idea: the values of a function get arbitrarily close to some value L as x approaches a point.
- what's counterintuitive, given this “simple english” definition: when a function always equals a constant, its limit at any point is also that constant, even though the values never get “closer” to anything. (the formal definition handles this fine; sketch after this entry.)
- One more thought:
- You don’t know what exactly the AI is doing with your data. All “classifying” tasks are done correctly only to a certain degree: you get a statistically acceptable number, but you don’t know which individual facts were understood correctly. And for the data to be accepted by computers at all, you need human intervention to normalize it. All that dirty work may well be what so many so-called data scientists are doing.
- failed to configure zim-wiki keyboard shortcuts.
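- The formal (epsilon-delta) definition behind the limits note above, with the constant-function case worked out; standard textbook material, quoted from memory rather than from the course:

```latex
\[
\lim_{x \to a} f(x) = L
\iff
\forall \varepsilon > 0 \;\exists \delta > 0 :\;
0 < |x - a| < \delta \implies |f(x) - L| < \varepsilon .
\]
% For a constant function f(x) = c, take L = c: then |f(x) - c| = 0 < epsilon
% for every x, so any delta works and the limit at every point a is c,
% even though the values never get "closer" to anything.
```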
wed, dec 27
- after a day of diddling around, finally resumed actually learning stuff. It's almost ridiculous that I need to re-learn calculus on a website.
- and interestingly, at the same time, I still need to learn how to talk about/read out math equations in English.
tue, dec 26
- Didn't learn much today.
mon, dec 25
- Started taking Coursera: Learning How to Learn. Most of the course's content is also found in Barbara Oakley's book, A Mind for Numbers. How she began studying math and eventually became a professor of engineering is an inspiring story, too.
- Made a 2018 resolution.
- Began reading papers on NER. (The other task, the one working on words, will take forever; NER can't be put off any longer.)
fri, dec 22
- Will have to start working like crazy today. (but didn't)
thu, dec 21
- Finished a fiction book. Watched OJ Simpson on Netflix.
- Did nothing.
mon, dec 18
- passed math retake. it's not ideal, but still passed.
- read the code of wsd_classifier() and was happy to find that I actually understood some of it.
- What are precision, recall and F-measure? I'm getting more confused by the day. It's like I know every word but don't know what's really being said.
- Weird, some blog posts explain stuff more clearly than the textbook.
- For now I understand precision as tp / (tp + fp) and recall as tp / (tp + fn). (small sketch after this entry.)
- delivered assignment, sadly with the last part missing. (7:45pm)
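- A minimal sketch to pin down the three measures from the notes above (the F1 line is the standard harmonic-mean formula, added from memory):

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 from the raw counts for one class."""
    precision = tp / (tp + fp)  # of everything I labelled positive, how much was right
    recall = tp / (tp + fn)     # of everything truly positive, how much did I find
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return precision, recall, f1

# e.g. 8 true positives, 2 false positives, 4 false negatives:
print(prf(8, 2, 4))  # (0.8, 0.666..., 0.727...)
```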
sun, dec 17
- Assignment deadline is tomorrow.
- The more I read the textbook, the more I feel I know nothing at all.
- (What is a feature vector?)
- Working on the assignment for much of the day. How do I improve the performance of a WSD system?
- I just need to present an idea; I don't need to (and won't be able to) deliver anything feasible. One thing that came to mind is to use hypernyms from WordNet. For example, if an adjacent noun is a hyponym of “physical object”, then “hard” probably means “solid, firm, rigid” rather than “difficult”. Not sure if it's a good idea. (rough sketch below.)
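- A rough sketch of the hypernym idea, assuming NLTK with the WordNet data downloaded (nltk.download('wordnet')); the helper names and the crude two-sense rule are mine, not the assignment's code:

```python
from nltk.corpus import wordnet as wn

PHYSICAL = wn.synset('physical_entity.n.01')  # top-level WordNet noun synset

def looks_physical(noun):
    """True if any noun sense has 'physical entity' among its hypernyms."""
    for synset in wn.synsets(noun, pos=wn.NOUN):
        if PHYSICAL in synset.closure(lambda s: s.hypernyms()):
            return True
    return False

def guess_hard_sense(adjacent_noun):
    # crude heuristic: "hard rock" -> solid/firm/rigid, "hard question" -> difficult
    return 'solid, firm, rigid' if looks_physical(adjacent_noun) else 'difficult'

print(guess_hard_sense('rock'))      # solid, firm, rigid
print(guess_hard_sense('question'))  # difficult
```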
sat, dec 16
- there is ambiguity: one sentence can have different meanings. and there are many different sentences that share the same meaning. that's why the notion of a “canonical form” was proposed. (but this doesn't seem to be a very important concept.)
- all this leads to something known as “word sense disambiguation”. the approach I've learned about so far involves Bayes' theorem. (toy sketch at the end of this entry.)
- meaning representation schemes. (“What they all have in common is the ability to represent objects, properties of objects and relations among objects.” They represent both the meaning of language and the state of affairs in the world.)
- the “vocabulary” of meaning representation: non-logical vocabulary and logical vocabulary.
- How to read a confusion matrix. (after a while of reading, I'm confused again.)
- Looked up what argmax is.
- The “pseudoword” approach to evaluating how well a WSD system performs is, hahaha, so hilarious. (You glue two unrelated words, say “banana-door”, into one artificially ambiguous word; the correct sense labels then come for free.)
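- The toy Bayes sketch promised above, which also shows where argmax comes in (the training counts are invented; a real system would train on a sense-tagged corpus):

```python
# Naive-Bayes WSD: pick the sense s maximizing P(s) * prod_w P(w | s),
# computed in log space, with add-one smoothing.
import math
from collections import Counter

train = [  # (sense, context words) pairs, invented for illustration
    ('bank_river', ['water', 'shore', 'fish']),
    ('bank_river', ['shore', 'mud']),
    ('bank_money', ['loan', 'cash']),
    ('bank_money', ['cash', 'interest', 'loan']),
]

sense_counts = Counter(sense for sense, _ in train)
word_counts = {s: Counter() for s in sense_counts}
for sense, words in train:
    word_counts[sense].update(words)
vocab = {w for _, ws in train for w in ws}

def disambiguate(context):
    def log_prob(sense):
        total = sum(word_counts[sense].values())
        lp = math.log(sense_counts[sense] / len(train))  # log P(sense)
        for w in context:  # log P(word | sense), add-one smoothed
            lp += math.log((word_counts[sense][w] + 1) / (total + len(vocab)))
        return lp
    return max(sense_counts, key=log_prob)  # argmax over senses

print(disambiguate(['cash', 'loan']))  # -> bank_money
print(disambiguate(['water', 'mud']))  # -> bank_river
```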
fri, dec 15
- did not do anything today at all.
- still don't know the results of math exam. (I should be panicking, but I am not.)
thu, dec 14
- too bad, didn't update this log on tue and wed. (TOO BAD!!!)
- read a little bit in chapter 17: representing meaning.
- one thing that was interesting was the distinction between “ambiguity” and “vagueness”. (as I understand it: an ambiguous word like “bank” has multiple distinct senses, while a vague word like “child” has one underspecified sense that leaves, say, gender open.)
mon, dec 11
- watched all the lecture videos left in the course.
- including word sense disambiguation and conditional semantics (what is that?), the most interesting part being semantic role labelling, e.g. in “She gave him a book”, labelling who gave (agent), what was given (theme), and to whom (recipient).
- The Penn Treebank for English is entirely “PropBank-ed”, i.e. the semantic roles of the words in it have all been tagged by human beings. I almost feel sorry for the people who did this job.
- Will continue reading the textbook on these topics.
sun, dec 10
- watched a video about WordNet.
fri, dec 8
- worked on assignment 3. (this feels bad. I'm low on spirit and energy.)
- redid lab 8 using the online ud annotator.
- the obl relation in UD is rather useful when a noun doesn't function as obj but sits in an odd position, e.g. “Rome” in “she lives in Rome” attaches to the verb as obl.
- won't be able to hand in assignment 3 in time. too bad. Let me try to get a VG for assignment 4 instead, starting by learning early.
- latex is a piece of work. why the hell is it still used?
thu, dec 7
- watched video: how “left-arc, right-arc, reduce, shift” work. this falls under “dependency parsing”. (mechanics sketched after this entry.)
- learned “context free grammar” at school. the basic concept seemed easy to grasp.
- should've spent more time on this so I could meet the assignment deadline on friday. dreadful.
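- The sketch promised above: the bare mechanics of a transition-based parser, in the simpler arc-standard flavour (the lecture's set with “reduce” sounds like arc-eager; same shift/arc idea). Sentence and transition sequence are hand-picked toys; a real parser predicts each transition with a classifier:

```python
def parse(words, transitions):
    """Apply a given transition sequence; return the (head, dependent) arcs."""
    stack, buffer, arcs = [], list(range(len(words))), []
    for t in transitions:
        if t == 'shift':        # move the next buffer word onto the stack
            stack.append(buffer.pop(0))
        elif t == 'left-arc':   # second-from-top depends on the top
            head, dep = stack[-1], stack.pop(-2)
            arcs.append((words[head], words[dep]))
        elif t == 'right-arc':  # top depends on the second-from-top
            dep, head = stack.pop(), stack[-1]
            arcs.append((words[head], words[dep]))
    return arcs

words = ['she', 'saw', 'stars']  # she <- saw -> stars
print(parse(words, ['shift', 'shift', 'left-arc', 'shift', 'right-arc']))
# [('saw', 'she'), ('saw', 'stars')]
```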
wed, dec 6
- watched lecture video “constituency parsing” (continued from yesterday). Generally underdelivered on today's goals.
- pcfg parsing is hard to grasp. (toy example at the end of this entry.)
- explained to a classmate why a piece of Python code didn't work.
- practiced Swedish listening for a while, as a break after all the abstract formulas and diagrams.
- found an online dictionary between Swedish and Turkish/Persian, and another between Norwegian and Chinese. Could be useful for future reference.
- also found simple Turkish texts to read and learn from (only when you have time, dude!)
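- The toy PCFG example promised above, using NLTK (the grammar and its probabilities are invented). Each rule carries a probability, the alternatives for one left-hand side sum to 1, and a tree's probability is the product of the rules it uses; ViterbiParser returns the most probable tree:

```python
from nltk import PCFG
from nltk.parse import ViterbiParser

grammar = PCFG.fromstring("""
    S -> NP VP [1.0]
    NP -> Det N [0.6] | 'she' [0.4]
    VP -> V NP [1.0]
    Det -> 'the' [1.0]
    N -> 'telescope' [0.5] | 'man' [0.5]
    V -> 'saw' [1.0]
""")

parser = ViterbiParser(grammar)
for tree in parser.parse(['she', 'saw', 'the', 'man']):
    print(tree)         # the most probable parse
    print(tree.prob())  # 0.4 * 0.6 * 0.5 = 0.12, product of the rules used
```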
tue, dec 5
- watched lecture video: constituency parsing
- learned what a “context-free grammar” is. Using “grammar” as if it's countable is weird.
- top-down parsing, bottom-up parsing. (contrast sketched after this entry.)
- It was only when I began learning about parsing and trees that I realized those abstract discrete math concepts are useful.
- Understanding a 15-minute lecture takes a lot longer than 15 minutes.
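- The contrast promised above, with NLTK's two demo parsers on an invented mini-grammar: RecursiveDescentParser works top-down (expands from S toward the words), ShiftReduceParser works bottom-up (shifts words onto a stack and reduces them toward S):

```python
from nltk import CFG
from nltk.parse import RecursiveDescentParser, ShiftReduceParser

grammar = CFG.fromstring("""
    S -> NP VP
    NP -> 'she' | Det N
    VP -> V NP
    Det -> 'the'
    N -> 'man'
    V -> 'saw'
""")
sentence = ['she', 'saw', 'the', 'man']

for tree in RecursiveDescentParser(grammar).parse(sentence):
    print(tree)  # found by predicting downward from S

for tree in ShiftReduceParser(grammar).parse(sentence):
    print(tree)  # found by reducing upward from the words
```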
mon, dec 4
- Took exam for Basic Swedish. It was dead easy.
- Watched dependency parsing. Totally lost. (Shit!)
- Did nothing special on Saturday. Went to a coffee event on Sunday. Basically, didn't really learn much.
fri, dec 1
- took the math exam in the morning. it was fun. solved (almost) every problem. didn't review the central limit theorem, so skipped that one.
- searched NLP research papers for today's deadline (8 pm, again). first thought about detecting gender traits in language, then about detecting fake news. the first is too unmanageable and the second too inconclusive. then somehow came across a term called “named entity recognition”; at this stage, I find it manageable. found a survey that reviews the work on this topic from the 1990s to the 2000s, then several other papers that expand on it. so that's settled. I'll try to read them in the following weeks.