Conversation
| @@ -0,0 +1,89 @@ | |||
| from pattern.en import sentiment | |||
| from pattern.en import modality | |||
There was a problem hiding this comment.
You can import multiple things simultaneously, e.g.
from pattern.en import sentiment, modality
| from pattern.en import sentiment | ||
| from pattern.en import modality | ||
|
|
||
| def sort_data(data): |
There was a problem hiding this comment.
I recommend commenting functions - something along the lines of, "put in _______ and get out _______".
| names = [] | ||
| speech = [] | ||
| for line in fin: | ||
| line1 = line.replace(",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,","",1) |
There was a problem hiding this comment.
Usually when you hard-code in something like this -- something where you remove particular characters -- it's good to specify why. I imagine that for some reason, there was a line in your data with ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,," in it.
| line3 = line2.replace("\x85", "") | ||
| text = line3.replace("\r\n", "") | ||
| key = text[0:text.index(',')] | ||
| if key.strip() in d: |
There was a problem hiding this comment.
You call key.strip() multiple times in the same loop; it's more efficient to, say, have key be the .strip()'d version, and then use it directly. Also, just a heads-up: entries will vary based on case and capitalization. If this isn't intended, I'd recommend converting keys all to the same case (e.g., lowercase)--python has a built-in method for that.
| if key.strip() in d: | ||
| d[key.strip()].append(text[text.index(',')+1:len(text)]) | ||
| else: | ||
| d[key.strip()] = [text[text.index(','):len(text)]] |
There was a problem hiding this comment.
You can just use [text.index(',')+1:]; if you don't put a value after the : then you grab everything to the end of the string.
| d = sort_data(data) | ||
| avg_mod = average_modality(data) | ||
| avg_pol = average_polarity(data) | ||
| avg_sub = average_subjectivity(data) |
There was a problem hiding this comment.
This is rather computationally inefficient. average_modality, average_polarity, and average_subjectivity all repeat calls: once you've sorted d once, it's best to structure your program to not call sort_data again for average_modality, average_polarity, and the like. One way around this, for example, could be to pass d into average_polarity, then pass it into get_sentiment, and then skip the additional sortings.
| for x in value: | ||
| summation += x[0] | ||
| total += 1 | ||
| avg[key].append(summation/total) |
There was a problem hiding this comment.
Instead of creating an empty list, and then appending to it, we could just skip the avg[key] = [] and directly go to avg[key] = [summation/total]. Just food for thought!
Added comments, edited code for efficiency, and revised write up.
I submitted this to the wrong folder on Monday (on time) because I didn't understand Git. Can verify original files were uploaded on correct date from mini project 1 repository on my profile.