Strip Down

February 6, 2010

A Python script that tries to convey the meaning of a piece of text by content subtraction.

The script takes two parameters: -f for the filename and -k for the keywords, given as a single space-separated list.
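As a minimal sketch of how those two parameters reach the script, the snippet below feeds a hypothetical command line to getopt with the same option string the source code uses. The file name speech.txt is an assumption made up for illustration; the point is that the keywords must be quoted in the shell so they arrive as one argument, which is then split on whitespace.

```python
import getopt

# Hypothetical command line: -f speech.txt -k "war peace"
# (the shell hands the quoted keywords over as a single argument).
argv = ["-f", "speech.txt", "-k", "war peace"]

# Same short and long option definitions as in the script's source.
opts, args = getopt.getopt(argv, "hf:r:k:",
                           ["help", "filename=", "replace=", "keywords="])
options = dict(opts)

filename = options["-f"]
keywords = options["-k"].split()  # one string becomes a list of keywords

print(filename)
print(keywords)
```

Without the quotes, only the first keyword would be bound to -k and the rest would end up in the leftover positional arguments, which the script ignores.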

Here are some results:

  • Run on Obama’s Nobel Peace Prize speech to compare the uses of the words “war” and “peace”. The words appear in the same order as in the original text, and each output line corresponds to one line of the speech that contains at least one keyword.
          • war
            war
            war
            war
            war
            peace
            war war war
            war war
            war
            war peace
            peace
            peace
            peace peace peace
            peace
            peace war
            peace peace
            war peace
  • Run on the New York Times article “U.S. Military Faults Leaders in Deadly Attack on Base” (http://www.nytimes.com/2010/02/06/world/asia/06afghan.html?scp=1&sq=U.S.%20Military%20Faults%20Leaders%20in%20Deadly%20Attack%20on%20Base&st=cse)
        • Take 1
          • results werehigh to In wasassist protection thea Friday,

            complex

            Keating mission than decided to combat

            Combat

            during

            punishments.

        • Take 2
          • 2010AttackNORDLANDAfghanistan

            the

            base, District the attack. smaller

            insurgents

            had

            a an five

            force

            the

            hours

            officer

            Privacy

  • Source code:

import getopt, sys, random


def main():
    key = ""
    replace = ""
    filename = "../texts/stein.txt"

    try:
        opts, args = getopt.getopt(sys.argv[1:], "hf:r:k:",
                                   ["help", "filename=", "replace=", "keywords="])
    except getopt.GetoptError:
        sys.exit(2)

    for opt, arg in opts:
        if opt in ("-h", "--help"):
            print("hello")
            sys.exit()
        if opt in ("-f", "--filename"):
            filename = arg
            print(filename)
        if opt in ("-r", "--replace"):
            replace = arg
            print(replace)
        if opt in ("-k", "--keywords"):
            key = arg

    # The keywords arrive as one space-separated argument, e.g. -k "war peace".
    key = key.split()

    with open(filename, "r") as f:
        for line in f:
            result = ""
            # split() with no argument also drops the trailing newline,
            # so the last word of a line can still match a keyword.
            for word in line.split():
                #selection = random.randint(0, 30);
                #if(selection == 2):
                for c in key:
                    if word == c:
                        result = result + " " + word
            #line = line.replace(key, replace);
            if result != "":
                print(result)


if __name__ == "__main__":
    main()
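The commented-out lines in the source hint at a second mode in which words survive by chance rather than by keyword. The sketch below is one possible reading of that idea, not the author's confirmed method: the function name random_strip and the sample text are hypothetical, and whether "Take 1" and "Take 2" were produced this way is an assumption. It keeps each word when random.randint(0, 30) returns 2, which is a 1 in 31 chance, the same test that appears in the comments.

```python
import random


def random_strip(lines, rng=None):
    """Keep each word with a 1-in-31 chance and drop the rest.

    Hypothetical helper based on the commented-out randint(0, 30) == 2
    test in the script; lines that lose every word are skipped.
    """
    if rng is None:
        rng = random.Random()
    results = []
    for line in lines:
        kept = [word for word in line.split() if rng.randint(0, 30) == 2]
        if kept:
            results.append(" ".join(kept))
    return results


# Made-up sample text; a seeded generator makes a given "take" repeatable.
sample = ["the quick brown fox jumps over the lazy dog"] * 20
for line in random_strip(sample, random.Random(2010)):
    print(line)
```

Passing a seeded random.Random makes a run reproducible, while leaving rng unset gives a different take each time, which would fit two runs on the same article producing different results.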
