Holmes Mini Project 1 by AndrewHolmes · Pull Request #14 · sd16fall/TextMining

AndrewHolmes · 2016-10-06T19:25:59Z

I submitted this to the wrong folder on Monday (on time) because I didn't understand Git. The Mini Project 1 repository on my profile shows that the original files were uploaded on the correct date.

matthewruehle · 2016-10-06T19:35:13Z

MiniProject1-master/Holmes_Mini_Project_1.py

@@ -0,0 +1,89 @@
+from pattern.en import sentiment
+from pattern.en import modality


You can import multiple names in a single statement, e.g.
from pattern.en import sentiment, modality

matthewruehle · 2016-10-06T19:35:55Z

MiniProject1-master/Holmes_Mini_Project_1.py

+from pattern.en import sentiment
+from pattern.en import modality
+
+def sort_data(data):


I recommend commenting functions - something along the lines of, "put in _______ and get out _______".
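As a rough illustration of the "put in / get out" style, here is a minimal sketch. The function name `group_by_speaker` and its behaviour are hypothetical, inferred only from the diff excerpts in this review (lines that look like `name,text`); it is not the author's actual `sort_data`.

```python
def group_by_speaker(lines):
    """Group text by speaker name.

    Put in:  an iterable of strings of the form "name,text".
    Get out: a dict mapping each name to a list of that speaker's text.

    Hypothetical stand-in written for this review, not the original code.
    """
    d = {}
    for line in lines:
        # Split on the first comma only; later commas stay in the text.
        name, _, text = line.partition(",")
        d.setdefault(name.strip(), []).append(text)
    return d
```

With a docstring like this, `help(group_by_speaker)` tells a reader what goes in and what comes out without reading the body.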

matthewruehle · 2016-10-06T19:39:37Z

MiniProject1-master/Holmes_Mini_Project_1.py

+    names = []
+    speech = []
+    for line in fin:
+        line1 = line.replace(",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,","",1)


Usually when you hard-code in something like this -- something where you remove particular characters -- it's good to specify why. I imagine that for some reason, there was a line in your data with ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,," in it.
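One possible way to record the reason and avoid the long literal is sketched below. It assumes the commas are empty trailing cells left over from a spreadsheet export, which is only a guess; the helper name `strip_trailing_commas` is hypothetical. Unlike the original `replace` call, it only touches commas at the end of the line.

```python
def strip_trailing_commas(line):
    # Assumption: the data came from a spreadsheet export that padded each
    # row with empty cells, leaving a long run of commas at the end.
    # Remove the line ending first, then any commas left at the end.
    return line.rstrip("\r\n").rstrip(",")
```

The comment states why the characters are removed, so a later reader does not have to guess what the hard-coded string was for.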

matthewruehle · 2016-10-06T19:44:16Z

MiniProject1-master/Holmes_Mini_Project_1.py

+        line3 = line2.replace("\x85", "")
+        text = line3.replace("\r\n", "")
+        key = text[0:text.index(',')]
+        if key.strip() in d:


You call key.strip() multiple times in the same loop; it's more efficient to make key the stripped version once and then use it directly. Also, just a heads-up: keys that differ only in capitalization will end up as separate entries. If this isn't intended, I'd recommend converting all keys to the same case (e.g., lowercase); Python has a built-in string method for that, str.lower().
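A small sketch of both suggestions together: strip and lowercase the key once, then reuse it. The helper name `normalized_key` and the sample lines are made up for illustration.

```python
def normalized_key(text):
    # Take everything before the first comma, strip whitespace once,
    # and lowercase it so "Alice" and "alice" share one entry.
    return text[:text.index(",")].strip().lower()

d = {}
for text in [" Alice ,hello", "alice,goodbye"]:
    key = normalized_key(text)          # computed once per line
    rest = text[text.index(",") + 1:]   # everything after the first comma
    if key in d:
        d[key].append(rest)
    else:
        d[key] = [rest]
```

Both sample lines land under the same key here, which would not happen if the keys kept their original case and whitespace.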

matthewruehle · 2016-10-06T19:46:41Z

MiniProject1-master/Holmes_Mini_Project_1.py

+        if key.strip() in d:
+            d[key.strip()].append(text[text.index(',')+1:len(text)])
+        else:
+            d[key.strip()] = [text[text.index(','):len(text)]]


You can just use [text.index(',')+1:]; if you don't put a value after the : then you grab everything to the end of the string.
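A quick check of the open-ended slice, using a made-up sample string. Note also that in the quoted diff the else branch slices from text.index(',') without the +1, so the first entry stored for a key would keep its leading comma; that may not be what was intended.

```python
text = "alice,hello, world"
comma = text.index(",")

rest_explicit = text[comma + 1:len(text)]  # end index spelled out
rest_open = text[comma + 1:]               # same result, end index omitted
with_comma = text[comma:]                  # no +1: the comma is included
```

Here `rest_explicit` and `rest_open` are identical, while `with_comma` starts with the comma itself.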

matthewruehle · 2016-10-06T19:52:47Z

MiniProject1-master/Holmes_Mini_Project_1.py

+    d = sort_data(data)
+    avg_mod = average_modality(data)
+    avg_pol = average_polarity(data)
+    avg_sub = average_subjectivity(data)


This is rather computationally inefficient. average_modality, average_polarity, and average_subjectivity each repeat the sort_data call. Once you've built d, it's best to structure your program so that these functions don't sort the data again. One way around this, for example, could be to pass d into average_polarity, have it pass d on to get_sentiment, and skip the additional sortings.
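The "sort once, pass d along" idea could look roughly like this. Everything here is a stand-in: `sort_data` is a simplified guess at the author's function, `average_by_key` is a hypothetical helper, and `word_count` replaces the pattern.en scoring calls so the sketch runs without that library.

```python
def sort_data(lines):
    """Stand-in: group "name,text" lines into a dict of name -> texts."""
    d = {}
    for line in lines:
        name, _, text = line.partition(",")
        d.setdefault(name.strip(), []).append(text)
    return d

def average_by_key(d, score):
    """Average score(text) per key, given an already-sorted dict d."""
    return {key: sum(score(t) for t in texts) / len(texts)
            for key, texts in d.items()}

def word_count(text):
    # Placeholder metric standing in for sentiment or modality.
    return len(text.split())

lines = ["a,one two", "a,three", "b,four five six"]
d = sort_data(lines)                        # sort exactly once
avg_words = average_by_key(d, word_count)   # reuses d
avg_chars = average_by_key(d, len)          # reuses d again
```

Because each averaging call receives d directly, the sorting cost is paid once regardless of how many metrics are computed.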

matthewruehle · 2016-10-06T19:55:08Z

MiniProject1-master/Holmes_Mini_Project_1.py

+        for x in value:
+            summation += x[0]
+            total += 1
+        avg[key].append(summation/total)


Instead of creating an empty list, and then appending to it, we could just skip the avg[key] = [] and directly go to avg[key] = [summation/total]. Just food for thought!
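A sketch of the direct assignment, wrapped in a hypothetical function `averages` so it can run on its own. The input shape (a dict of lists of tuples, averaged on the first element) is assumed from the `x[0]` in the diff.

```python
def averages(scores_by_key):
    avg = {}
    for key, value in scores_by_key.items():
        summation = 0.0   # float start avoids integer division on Python 2
        total = 0
        for x in value:
            summation += x[0]
            total += 1
        # Assign the one-element list directly; no empty list, no append.
        avg[key] = [summation / total]
    return avg
```

For example, `averages({"a": [(1.0, 0.5), (3.0, 0.1)]})` gives `{"a": [2.0]}`. An empty list for some key would still raise ZeroDivisionError, as in the original loop.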

Added comments, edited code for efficiency, and revised write up.

Add files via upload

7fc0b7d

Adding Revised files for Mini Project 1

8373519

AndrewHolmes wants to merge 2 commits into sd16fall:master from AndrewHolmes:master
