Posts Tagged ‘python’

Pbox2avi for windows

December 17th, 2008 No comments

It appears that Windows is incompatible with some of the shell-related parts of my Python script, so I have uploaded a Windows-specific version to take care of these issues.

Issues are:

  • piping
  • commands.getstatusoutput (doesn’t work on Windows)
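Both issues can be avoided with the subprocess module, which runs commands without going through a Unix shell. The following is a minimal sketch, not the code in pbox2avi_win.py: the helper names get_status_output and pipe are made up for illustration, and it is written for Python 3.5 or later rather than the Python 2 of the original script.

```python
import subprocess
import sys


def get_status_output(args):
    """Run a command given as an argument list; return (exit status, text output)."""
    proc = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout, as commands.getstatusoutput did
        universal_newlines=True,   # decode the output to text
    )
    return proc.returncode, proc.stdout.rstrip("\n")


def pipe(first_args, second_args):
    """Feed the stdout of one command into the stdin of another, without a shell."""
    first = subprocess.Popen(first_args, stdout=subprocess.PIPE)
    second = subprocess.Popen(second_args, stdin=first.stdout, stdout=subprocess.PIPE)
    first.stdout.close()  # only `second` should hold the read end of the pipe
    output, _ = second.communicate()
    first.wait()
    return second.returncode, output  # output is bytes here


if __name__ == "__main__":
    # Use the Python interpreter itself as a stand-in for the real tools.
    print(get_status_output([sys.executable, "-c", "print('hello')"]))
```

With the real tools, the two argument lists passed to pipe would be the jpeg2yuv and yuv2lav commands that the original script joins with a shell pipe.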

Download: pbox2avi_win.py

Pbox-to-avi conversion

December 15th, 2008 3 comments

So I just finished another round of grad finals yesterday. I’ve been meaning to update the blog with some other useful or interesting projects that I’ve been playing around with in my spare time.

Last week, a friend of mine from my wonderful Cal days came down from UCSF (he’s a med student) for a concert at Stanford and dropped by afterwards to catch up. Lucky for him (and future UCSF students), he and his friends mentioned, to my amusement, that poor UCSF med students have these silly lecture “videos” in swf format instead of the standard web-streamed video that Stanford and Cal both have in abundance. The great thing about video is that you can fast-forward people. 2-hour lectures become 1!

Of course, having just finished the SCPD video downloading scripts, I figured that it was probably video embedded in an swf video player. Turns out, it’s actually a bunch of pictures and audio. There’s no actual video! Apparently they take a picture every so often and record the sound that goes with it, then take another picture and record the sound that goes with that one. So, being the nerd I am, I had to see if I could convert it to video. My friend sent me a sample link later that night. Thus began my Pbox (that’s what they call it) to avi endeavor.

Before I briefly describe the process, here are the goals I set out to meet.

  1. Cross-platform, so that all UCSF students can take advantage of it.
  2. Automatic. Med students are busy animals that don’t sleep. The simpler and more stable it is, the happier they will be, and the more likely they are to use it.
  3. The resulting video quality and size should be similar to that of the swf. After all, it’s just pictures and audio. It’s not REALLY video.
  4. The lowest priority is that it should be decently fast. After all, people need to use their computers at some point. One can’t spend the whole day converting a lecture they could be watching.

To cut a long story short, I researched different ways of extracting data from swf flash files. It turns out that swftools is a cross-platform toolkit (with precompiled binaries) for Linux, OS X, and Windows. After figuring out how swfextract works, I realized that I would have to make a video clip of each presentation slide and pair it with the corresponding audio clip. To make the video clips, I found mjpegtools, a cross-platform set of picture-to-video tools. To combine the audio and video, I resorted to my familiar tool mencoder. However, I had to make sure that each picture-video was exactly as long as its audio, so I used exiftool, a program that reads metadata tags, to get each audio clip’s duration. Last but not least, to automate all these tools so that a med student wouldn’t have to know how any of them work, I used Python to glue them all together.
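The length-matching step can be sketched as follows. This assumes exiftool reports the duration either as seconds (“12.5 s”) or in colon-separated form (h:mm:ss or m:ss), and it assumes the 2 frames per second that the script passes to jpeg2yuv with -f 2. The names parse_duration and frame_count are made up for illustration, the code is written for Python 3, and it rounds to the nearest frame, which is slightly different from the arithmetic in the script itself.

```python
import re


def parse_duration(text):
    """Convert '12.5 s', 'm:ss' or 'h:mm:ss' (optionally followed by '(approx)') to seconds."""
    text = text.replace("(approx)", "").strip()
    match = re.fullmatch(r"([\d.]+)\s*s", text)
    if match:
        # Plain seconds, e.g. "12.5 s"
        return float(match.group(1))
    # Colon-separated fields: each field to the left is worth 60 times more.
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def frame_count(seconds, fps=2):
    """Number of still frames needed so the picture-video lasts as long as the audio."""
    return int(round(seconds * fps))


if __name__ == "__main__":
    for sample in ("12.5 s (approx)", "1:05", "0:01:23 (approx)"):
        secs = parse_duration(sample)
        print(sample, "->", secs, "seconds,", frame_count(secs), "frames")
```

A low frame rate keeps the intermediate clips small, since every frame within a clip is the same picture anyway.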

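Requirements:

  • Python 2 (the script uses the commands module)
  • swftools (for swfextract)
  • mjpegtools (for jpeg2yuv and yuv2lav)
  • mencoder
  • exiftool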

Usage:

python pbox2avi.py lecture.swf [output.avi]

Speed:

  • For an HSWF301 lecture, the conversion took 8 minutes on my 2.4 GHz computer. Encoding the video is a relatively quick process, since it’s mainly smashing pictures and audio into clips and then smashing the clips together.

If you’re a UCSF med student who benefits from this conversion utility and you would like to repay me somehow, feel free to send me an email. Since I’m working on biocomputational cardiac modeling, I would especially like to talk to any future cardiologists!

If you have questions, as usual feel free to leave a comment. Or if you know my friend at UCSF, you can ask him.

Code:

# python script to extract lecture slides and mp3's from UCSF lecture files for conversion
 
#requires swfextract (swftools), jpeg2yuv and yuv2lav (mjpegtools), mencoder and exiftool
 
import commands,sys,re
 
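def parse(filename):
    #check filename for .swf extension
    # (str.find returns -1, which is truthy, on a miss, so test the suffix instead)
    if not filename.lower().endswith('.swf'):
        sys.exit("Expected a .swf file, got: %s" % (filename,))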
 
    #open file
    status, output = commands.getstatusoutput("swfextract %s" % (filename,))
 
    print output
 
    #find "JPEGs: ID(s)"
    slide_identifier = "JPEGs: ID(s)"
    start=output.find(slide_identifier)
    slide_extract=""
    if start != -1: # find() returns -1 when the marker is missing
        start += len(slide_identifier)+1
        slide_end = output.find("[-s]",start)
        print start,slide_end
        slide_extract=output[start:slide_end-2].replace(" ","")
 
    #find "Sounds: ID(s)"
    sound_identifier = "Sounds: ID(s)"
    start=output.find(sound_identifier)
    sound_extract=""
    if start != -1: # find() returns -1 when the marker is missing
        start += len(sound_identifier)+1
        sound_end = output.find("[-f]",start)
        sound_extract=output[start:sound_end-2].replace(" ","")
 
    # now extract all the data
    print "swfextract %s -P -j %s -s %s" % (filename,slide_extract,sound_extract)
 
    status, output = commands.getstatusoutput("swfextract %s -P -j %s -s %s" % (filename,slide_extract,sound_extract))
 
    return slide_extract,sound_extract
 
def create(slide_extract,sound_extract,outputfile):
    #first throw out every other picture because second picture is always a thumbnail
    slides=slide_extract.split(",")
    slides=slides[::2]
 
    sounds=sound_extract.split(",")
 
    print "Number of slides: %s, number of mp3's: %s" % (len(slides),len(sounds))
    for i in range(len(sounds)):
 
        vidcmd="jpeg2yuv -n %d -I p -f 2 -j %s | yuv2lav -o temp.avi" % (int(round(2*10*(duration("sound%s.mp3" % (sounds[i],))))/10),"pic%s.jpg" % (slides[i],))
        # print vidcmd
        status,output=commands.getstatusoutput(vidcmd)
 
        sndcmd="mencoder temp.avi -o slide%d.avi -ovc lavc -lavcopts vcodec=msmpeg4 -oac copy -audiofile sound%s.mp3" %(i,sounds[i])
 
        # print sndcmd
        commands.getstatusoutput(sndcmd)
        print "Finished processing slide %d" % (i,)
 
    print "Cleaning up..."
    # remove unnecessary data (note: this deletes every .jpg and .mp3 in the current directory)
    commands.getstatusoutput("rm *.jpg *.mp3 temp.avi")
 
    # one clip was written per sound in the loop above, so count sounds rather than slides
    slidevids=["slide%s.avi" % (i,) for i in range(len(sounds))]
 
    slidevids=" ".join(slidevids)
 
    print "Combining slides..."
    # combine all of them
    commands.getstatusoutput("mencoder -oac copy -ovc copy %s -o %s" % (slidevids,outputfile))
 
    print "Cleaning up..."
    commands.getstatusoutput("rm slide*.avi")
 
def duration(mp3):
    # find duration of mp3 using exiftool
    cmd="exiftool -Duration %s" % (mp3,)
    status,output = commands.getstatusoutput(cmd)
    time=output[output.find(":")+1:output.find("(approx)")-1]
    reg=re.compile(r"(\d+):(\d+)|([\d\.]+) s")
    timestr=reg.search(time)
    #print time,timestr
    Min,Sec,Secs=timestr.groups()
    #print Min,Sec,Secs
    if Secs is None:
        # in terms of minutes
        return int(Min)*60+int(Sec)
    else:
        return float(Secs)
 
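if __name__=="__main__":
    # check the arguments before doing any work
    if len(sys.argv) not in (2,3):
        sys.exit("Usage: python pbox2avi.py lecture.swf [output.avi]")
    slides,sounds=parse(sys.argv[1])
    if len(sys.argv)==2:
        create(slides,sounds,"output.avi")
    else:
        create(slides,sounds,sys.argv[2])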
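The string slicing in parse() depends on the exact order and spacing of the swfextract listing. A regular expression is one way to make that step less brittle. The sketch below is written for Python 3 and runs against a hypothetical sample listing, modeled only on the markers the script searches for (“JPEGs: ID(s)”, “Sounds: ID(s)”, “[-s]”, “[-f]”); real swfextract output may differ, and the name extract_ids is made up for illustration.

```python
import re

# Hypothetical listing, for illustration only; not captured from a real swfextract run.
SAMPLE_LISTING = """\
Objects in file lecture.swf:
 [-j] 4 JPEGs: ID(s) 1, 2, 4, 5
 [-s] 2 Sounds: ID(s) 3, 6
 [-f] 1 Frame: ID(s) 0
"""


def extract_ids(listing, label):
    """Return the IDs listed after '<label>: ID(s)' as strings, or [] if the label is absent."""
    match = re.search(r"%s: ID\(s\) (.*)" % re.escape(label), listing)
    if match is None:
        return []
    return [item.strip() for item in match.group(1).split(",") if item.strip()]


if __name__ == "__main__":
    jpegs = extract_ids(SAMPLE_LISTING, "JPEGs")
    sounds = extract_ids(SAMPLE_LISTING, "Sounds")
    slides = jpegs[::2]  # every second picture is a thumbnail, as in create()
    print("slides:", slides, "sounds:", sounds)
```

Returning lists instead of comma-joined strings also means the caller can compare the number of slides with the number of sounds before starting any encoding.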

Download Stanford SCPD streams (wmv streams)

November 7th, 2008 49 comments

I am currently attending Stanford, and when I once searched the intraweb for a way to download wmv streams I couldn’t find anything specific, so I’ll post my solution here.

Normally, SCPD requires you to log in and then watch the videos streamed through some wmv player. However, should you want to rewind or seek through a stream, this will prove to be impossible. Many wmv recorders are commercial rather than free, and they simply play the video in order to download it. My solution is similar, but it is FREE (as in beer).

What you need:

  1. Mencoder (an encoder that comes along with MPlayer). If you use Windows, you should get the command-line, non-GUI download from the website, since it comes with mencoder; the other downloads do not.
  2. Optional: Firefox + Greasemonkey for extracting the video URL if you’re lazy like me.
  3. Optional: Python for executing mencoder and for converting http:// links to mms:// links.

To download from SCPD (or other unprotected wmv streams), simply execute the following at a command prompt:

mencoder mms://stream.wmv -ovc copy -oac copy -o output.avi
  1. Replace mms://stream.wmv with the stream’s URL, with “http:” replaced by “mms:”.
  2. Replace output.avi with the output file name. The extension should govern the format that it spits out.

It will throw a bunch of text at the screen; simply leave the window open (or minimized) and wait for it to finish. Since it’s free, I commonly download several streams at the same time, so when they are done I have several lectures downloaded instead of just one.
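Downloading several streams at once can be sketched like this, assuming mencoder is on the PATH and the streams are unprotected. The function names build_mencoder_cmd and download_all are made up for illustration, and the code is written for Python 3.

```python
import posixpath
import subprocess
from urllib.parse import urlsplit


def build_mencoder_cmd(url):
    """Build the mencoder argument list for one stream URL."""
    # Swap whatever scheme the URL has (normally http) for mms.
    _scheme, _sep, rest = url.partition("://")
    mms_url = "mms://" + rest
    # Name the output after the file in the URL path, ignoring any query string.
    name = posixpath.basename(urlsplit(url).path)
    stem = posixpath.splitext(name)[0] or "output"
    return ["mencoder", mms_url, "-ovc", "copy", "-oac", "copy", "-o", stem + ".avi"]


def download_all(urls):
    """Start one mencoder process per URL, then wait for all of them to finish."""
    procs = [subprocess.Popen(build_mencoder_cmd(url)) for url in urls]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # example.com is a placeholder, not a real stream address.
    print(build_mencoder_cmd("http://example.com/courses/lecture1.wmv?session=abc"))
```

Each mencoder process writes to its own output file, so the downloads do not interfere with one another.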

Download: Greasemonkey script for SCPD (gets links automatically when you view an SCPD video) (Updated 9/29/09)

If you are lazy like me and don’t want to change the http:// URL to an mms:// URL by hand, I’ve provided a Python script that, given an http URL, will fill out the mencoder command line and execute it for you. (Note: many Mac and Linux users will already have Python installed on their machines.)

Usage:

./stripSCPD http://url

Download: Python SCPD script

#!/usr/bin/python
 
import subprocess,sys
 
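def strip(url):
    # swap the scheme (e.g. http) for mms
    newurl = "mms"+url[url.find(":"):]
    print newurl
    # name the output after the file in the url, ignoring any query string
    # (rfind("?") returns -1 when there is no query, which would chop off the last character)
    path = url.split("?")[0]
    vidname = path[path.rfind("/")+1:]
    print "mencoder %s -ovc copy -oac copy -o %s.avi" % (newurl,vidname)
    subprocess.Popen(["mencoder", newurl, "-ovc","copy","-oac","copy","-o", ("%s.avi" % (vidname,))])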
 
if __name__=="__main__":
    strip(sys.argv[1])

Update (11/09/08):

I’ve just tried everything again in Windows, and it seems to work. Note that, for some reason, the MPlayer in the downloadable zip file doesn’t have the necessary codecs by default. However, VLC, Media Player Classic, and Windows Media Player all seem to be able to seek. Only Media Player Classic can increase the audio speed, though (in case you want to watch a stream at 1.5 times speed and still comprehend what someone is saying, even if they seem to be on helium!).