Since taking the Rest of You class with Dan O’Sullivan last semester, I have been really interested in and excited about everything to do with language and text: text and sentiment analysis, and text generation.
For the final project of the Learning Machines class, we have to select an applied learning problem and outline what sort of learning algorithm we would use to address it and how we would implement it all together.
No surprise, I want to do something with text analysis and generation. Here is my first attempt to define it:
Applied learning problem: how can a machine represent the mood of words? How can that mood be represented in color or expressed in newly generated text?
For example, is there any way I can detect the mood of American media or Lithuanian media? What kind of algorithm would I have to build, or which libraries would I use, to implement it?
Looking at it from a machine learning standpoint, the process for a machine could look something like this:
Task: Classify a text as positive, negative, or something in between (one of a set of different moods).
Experience: A corpus of text where some articles are positive, negative, funny, neutral, etc.
Performance: Classification accuracy, that is, the number of articles whose mood is predicted correctly out of all articles considered, expressed as a percentage.
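To make the performance measure concrete for myself, here is a minimal sketch of how classification accuracy could be computed once I have some predictions and the true moods side by side. The function name `accuracy` and the five labels are made up for illustration; they do not come from any real data.

```python
def accuracy(predicted, actual):
    """Return the percentage of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one labeled example")
    # Count the positions where the predicted mood equals the true mood.
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Hypothetical moods for five held-out articles (not real data).
actual = ["positive", "negative", "funny", "neutral", "positive"]
predicted = ["positive", "negative", "neutral", "neutral", "negative"]

print(f"{accuracy(predicted, actual):.1f}%")  # prints 60.0%
```

So the number is not guessed in advance: it is measured afterwards, on articles whose mood is already known and that the algorithm did not see while learning.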
My questions about “performance”:
- How do we get this one? Do we have to set / guess it in advance?
- Would I have to outline specific words that might represent a certain mood?
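One possible answer to the second question is a hand-made keyword list per mood. The sketch below counts how many keywords of each mood appear in a text and picks the mood with the most hits. The `MOOD_KEYWORDS` lists and the function names are my own assumptions for illustration, and a real lexicon would need to be far larger.

```python
import re
from collections import Counter

# Hypothetical, hand-picked keyword lists (assumption, not a real lexicon).
MOOD_KEYWORDS = {
    "positive": {"happy", "great", "win", "hope"},
    "negative": {"sad", "crisis", "fear", "loss"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_mood(text):
    """Return the mood whose keywords occur most often, or 'neutral' if none do."""
    counts = Counter()
    for word in tokenize(text):
        for mood, keywords in MOOD_KEYWORDS.items():
            if word in keywords:
                counts[mood] += 1
    if not counts:
        return "neutral"
    # On a tie, the mood that was matched first in the text wins.
    return counts.most_common(1)[0][0]


print(keyword_mood("Great win gives fans hope"))
print(keyword_mood("Not happy at all"))
```

The second call comes out as "positive" because the word "happy" is on the positive list, even though the sentence says the opposite. Pure keyword matching cannot see that.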
Challenges and limitations: computers look for the keywords we give them without really understanding what they mean, so the success of the algorithm would depend on how much detailed information I give them (?). So my main challenge and question is:
How can I solve this problem?
- What data should I collect?
- How do I prepare it?
- How do I design a program to solve this problem?
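As a first rough answer to these three questions, here is a sketch of one common approach, a Naive Bayes text classifier, written with only the Python standard library. This is just one candidate algorithm I am assuming for the sake of the example, not a decision. The four training sentences are invented; real data would be many labeled articles. Data collection is the `examples` list, preparation is `tokenize`, and the program itself is `train` plus `predict`.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Prepare text: lowercase it and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count articles and words per mood from (text, mood) pairs."""
    doc_counts = Counter()              # number of articles per mood
    word_counts = defaultdict(Counter)  # word frequencies per mood
    vocab = set()                       # every word seen in training
    for text, mood in examples:
        doc_counts[mood] += 1
        for word in tokenize(text):
            word_counts[mood][word] += 1
            vocab.add(word)
    return doc_counts, word_counts, vocab


def predict(model, text):
    """Return the mood with the highest Naive Bayes log-probability."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_mood, best_score = None, -math.inf
    for mood in doc_counts:
        # Start from the prior: how common this mood is in the training data.
        score = math.log(doc_counts[mood] / total_docs)
        total_words = sum(word_counts[mood].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # skip words never seen during training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log(
                (word_counts[mood][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_mood, best_score = mood, score
    return best_mood


# Invented training sentences (assumption, not real collected data).
examples = [
    ("The team celebrated a great victory", "positive"),
    ("Families enjoy a sunny festival", "positive"),
    ("The crisis caused fear and loss", "negative"),
    ("Storm damage leaves town in crisis", "negative"),
]
model = train(examples)
print(predict(model, "A great festival"))
print(predict(model, "Fear of the storm"))
```

Unlike a fixed keyword list, this version learns which words go with which mood from the labeled examples, so the quality of the collected data matters more than the words I think of myself.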
References for related projects: one comes from the references list of Patrick Hebron (my Learning Machines teacher): http://www.cortical.io/, a tool that converts words into semantic fingerprints. Another is Ross Goodwin’s ITP Thesis Talk, https://vimeo.com/166979130, where he presents his Narrated Reality project.
More to be added.
- Do more research on how others have tried to solve similar problems
- Outline the process of problem solving (both conceptual and technical)
- List (or at least try to list) the tools that would be needed to address the problem, etc.