Ever since taking the Rest of You class with Dan O’Sullivan last semester, I have been really interested in and excited about everything that has to do with language and text: text and sentiment analysis, text generation.

For the final project of the Learning Machines class, I have to select an applied learning problem and outline what sort of learning algorithm I would use to address it and how I would implement it all together.
No surprise, I want to do something with text analysis and generation. Here is my first attempt to define it:

Applied learning problem: how can a machine represent the mood of words? How can that mood be represented in color or used to generate new text?

For example, is there any way I can detect the mood of American media or Lithuanian media? What kind of algorithm would I have to build, or which libraries would I have to use, to implement it?

From a machine learning standpoint, the process for a machine could look something like this:
Task: Classify text as positive, negative, or something in between (a set of different moods).
Experience: A corpus of text where some articles are positive, negative, funny, neutral, etc.
Performance: Classification accuracy, that is, the number of articles whose mood is predicted correctly out of all articles considered, as a percentage.

My questions about “performance”:

  • How do we get this number? Do we have to set or guess it in advance?
  • Would I have to outline specific words that might represent a certain mood?
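A minimal sketch of how the accuracy number could be obtained, assuming a classifier has already predicted a mood for a handful of labeled articles. The function name `classification_accuracy` and the five labels below are made up for illustration; the point is that accuracy is typically measured after the fact by comparing predictions with known labels, rather than set in advance.

```python
def classification_accuracy(true_moods, predicted_moods):
    """Return the percentage of articles whose predicted mood matches its label."""
    if len(true_moods) != len(predicted_moods):
        raise ValueError("Both lists must have the same length")
    if not true_moods:
        raise ValueError("Need at least one labeled article")
    # Count the positions where the prediction agrees with the label
    correct = sum(t == p for t, p in zip(true_moods, predicted_moods))
    return 100.0 * correct / len(true_moods)


# Hypothetical labels for five articles (invented for this example)
true_moods = ["positive", "negative", "funny", "neutral", "positive"]
predicted_moods = ["positive", "negative", "neutral", "neutral", "negative"]

accuracy = classification_accuracy(true_moods, predicted_moods)
print(f"Accuracy: {accuracy:.1f}%")
```

In practice the labeled articles used for this measurement would be held out from whatever data the algorithm learns from, so the number reflects how it does on text it has not seen.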

Challenges and limitations: computers look for the keywords we give them without really understanding what they mean, so the success of the algorithm would depend on how much detailed information I give it (?). So my main challenge and question is:

How can I solve this problem?

  1. What data should I collect?
  2. How do I prepare it?
  3. How do I design a program to solve this problem?
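To make the keyword limitation mentioned above concrete, here is a rough first-pass sketch of a keyword-based mood detector. The tiny word lists in `MOOD_KEYWORDS` and the function name `keyword_mood` are hypothetical, chosen only for illustration; a real version would need a much larger lexicon or a model trained on labeled articles.

```python
import re

# Hypothetical, hand-picked keyword lists (assumption, not a real lexicon)
MOOD_KEYWORDS = {
    "positive": {"happy", "great", "love", "win"},
    "negative": {"sad", "terrible", "hate", "crisis"},
}


def keyword_mood(text):
    """Guess a mood by counting how many words fall in each keyword list."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = {
        mood: sum(word in keywords for word in words)
        for mood, keywords in MOOD_KEYWORDS.items()
    }
    top = max(counts.values())
    winners = [mood for mood, count in counts.items() if count == top]
    # No keyword found, or a tie between moods: fall back to "neutral"
    if top == 0 or len(winners) > 1:
        return "neutral"
    return winners[0]


print(keyword_mood("I love this great city"))
print(keyword_mood("I do not love this"))
```

The second call is still labeled "positive" because the program only sees the keyword "love" and has no notion of negation, which is exactly the kind of limitation described above.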

References for related projects: the first comes from the reference list of Patrick Hebron (my Learning Machines teacher): http://www.cortical.io/, a tool that converts words into semantic fingerprints. Also Ross Goodwin’s ITP Thesis Talk, https://vimeo.com/166979130, where he presents his Narrated Reality project.
More to be added.

Next steps:

  1. Do more research on how others have tried to solve similar problems
  2. Outline the process of problem solving (both conceptual and technical)
  3. List (or at least try to list) the tools that would be needed to address the problem, etc.
