A Python script that tries to convey the meaning of a piece of text by content subtraction: it drops every word except a chosen set of keywords and prints what is left.
The script takes two parameters: -f for the filename and -k for the keywords.
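As a minimal sketch of the idea, assuming "content subtraction" means keeping only the keywords on each line and discarding everything else. The function name `subtract_content` and the sample text are made up for illustration and are not part of the script.

```python
def subtract_content(text, keywords):
    """Keep only the words found in `keywords`, line by line, in original order."""
    kept_lines = []
    for line in text.splitlines():
        # Exact, case-sensitive match on whitespace-separated words.
        kept = [word for word in line.split() if word in keywords]
        if kept:  # lines with no keyword are dropped entirely
            kept_lines.append(" ".join(kept))
    return kept_lines


sample = "we seek peace but prepare for war\nnothing to see here\nwar and more war"
print("\n".join(subtract_content(sample, {"war", "peace"})))
# peace war
# war war
```

Because the match is exact, a word with punctuation attached (such as "war,") would not be kept in this sketch.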
Here are some results:
- Run on Obama’s Nobel Peace Prize speech to compare the use of the words “war” and “peace”. The words appear in the same order as in the original text.
- war
war
war
war
war
peace
war war war
war war
war
war peace
peace
peace
peace peace peace
peace
peace war
peace peace
war peace
- Run on the New York Times article “U.S. Military Faults Leaders in Deadly Attack on Base” (http://www.nytimes.com/2010/02/06/world/asia/06afghan.html?scp=1&sq=U.S.%20Military%20Faults%20Leaders%20in%20Deadly%20Attack%20on%20Base&st=cse)
- Take 1
- results werehigh to In wasassist protection thea Friday,
complex
Keating mission than decided to combat
Combat
during
punishments.
- Take 2
- 2010AttackNORDLANDAfghanistan
the
base, District the attack. smaller
insurgents
had
a an five
force
the
hours
officer
Privacy
- Source code:
import getopt
import sys
import random  # only needed for the commented-out random selection below


def main():
    key = []
    replace = ""
    filename = "../texts/stein.txt"
    try:
        opts, args = getopt.getopt(
            sys.argv[1:], "hf:r:k:",
            ["help", "filename=", "replace=", "keywords="])
    except getopt.GetoptError:
        sys.exit(2)
    for opt, arg in opts:
        if opt in ("-h", "--help"):
            print("hello")
            sys.exit()
        if opt in ("-f", "--filename"):
            filename = arg
            print(filename)
        if opt in ("-r", "--replace"):
            replace = arg
            print(replace)
        # The long option is registered as "keywords=", so match "--keywords".
        if opt in ("-k", "--keywords"):
            key = arg.split(" ")
    with open(filename, "r") as f:
        for line in f:
            result = ""
            # split() with no argument also drops the trailing newline,
            # so the last word on a line can match a keyword.
            for word in line.split():
                #selection = random.randint(0, 30);
                #if(selection == 2):
                for c in key:
                    if word == c:
                        result = result + " " + word
            #line = line.replace(key, replace);
            if result != "":
                print(result)


if __name__ == "__main__":
    main()
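
The commented-out lines in the script (`random.randint(0, 30)` and the check `selection == 2`) hint at a variant that keeps words at random instead of by keyword. Whether the "Take 1" and "Take 2" samples were produced this way is not stated, so the following is only a sketch under that assumption. The function name `random_subtract` and the parameters `upper` and `seed` are hypothetical.

```python
import random


def random_subtract(text, upper=30, seed=None):
    """Keep each word with probability 1 / (upper + 1), line by line."""
    rng = random.Random(seed)  # a seed makes a run repeatable
    kept_lines = []
    for line in text.splitlines():
        # randint is inclusive on both ends, so comparing against 0 has the
        # same odds as the script's `selection == 2` test when upper is 30.
        kept = [word for word in line.split() if rng.randint(0, upper) == 0]
        if kept:
            kept_lines.append(" ".join(kept))
    return kept_lines


text = "the base was smaller than the force\ninsurgents had five hours"
print(random_subtract(text, upper=3, seed=1))
```

Each run without a seed gives a different selection, which would fit the idea of numbered "takes" on the same article.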