About this user

Korakot Chaovavanich http://korakot.stumbleupon.com


Mass conversion from Word to HTML

I have 1000 Word files. Each contains one main image and some decorations.
(This is actually a big book, scanned into a one-file-per-page format.)
I need to extract all the images. What do I do?

Python can automate Word through COM (this needs the pywin32 package on Windows).
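    import win32com.client

    # Start Word through COM. EnsureDispatch also generates the wrapper
    # module that makes win32com.client.constants available below.
    app = win32com.client.gencache.EnsureDispatch("Word.Application")

    # Base path of one page, without the extension.
    doc = 'C:\\lang\\try\\bdham\\p1'
    app.Documents.Open(doc + '.doc')
    # Saving as HTML makes Word write the embedded images as separate files.
    app.ActiveDocument.SaveAs(doc + '.html', FileFormat=win32com.client.constants.wdFormatHTML)
    app.ActiveDocument.Close()
    # now repeat with p2, p3, etc.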
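
Once the pages are saved as HTML, the images still have to be gathered in one place. Here is a rough sketch of that step. It assumes Word put each page's images in a folder named after the page plus "_files" (what English-language Word usually does; other locales may use a different suffix). The function name collect_images, the extension list, and the renaming scheme are my own choices for illustration, not part of the original snippet.

```python
import shutil
from pathlib import Path

# Extensions treated as images (an assumption; adjust to what Word produced).
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif"}


def collect_images(src_dir, dest_dir, suffix="_files"):
    """Copy images from every '<page><suffix>' folder in src_dir into dest_dir.

    Each copy is renamed '<page>_<original name>' so that pages do not
    overwrite one another. Returns the sorted list of file names written.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    for folder in sorted(src.glob("*" + suffix)):
        if not folder.is_dir():
            continue
        page = folder.name[: -len(suffix)]  # 'p1_files' -> 'p1'
        for item in sorted(folder.iterdir()):
            # Skip Word's helper files such as filelist.xml.
            if item.is_file() and item.suffix.lower() in IMAGE_EXTS:
                target = dest / f"{page}_{item.name}"
                shutil.copy2(item, target)
                written.append(target.name)
    return sorted(written)
```

For the folder used above, something like collect_images('C:\\lang\\try\\bdham', 'C:\\lang\\try\\images') would do it; the second path is just a made-up destination.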

Actually, I should put it in a loop. But this non-loop version
is easier to read and remember.
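
For the record, a loop version could look roughly like this. It is only a sketch: the helper names page_bases, convert_all and main are made up for illustration, the files are assumed to be named p1.doc, p2.doc, and so on, and the number 8 is the value of Word's wdFormatHTML constant, written out so the conversion function does not depend on the generated constants.

```python
WD_FORMAT_HTML = 8  # numeric value of Word's wdFormatHTML constant


def page_bases(folder, first, last, prefix="p"):
    """Return base paths (no extension) for pages first..last, inclusive."""
    return [f"{folder}\\{prefix}{n}" for n in range(first, last + 1)]


def convert_all(app, bases):
    """Open each base + '.doc' in Word and save it as base + '.html'."""
    converted = []
    for base in bases:
        doc = app.Documents.Open(base + ".doc")
        try:
            doc.SaveAs(base + ".html", FileFormat=WD_FORMAT_HTML)
            converted.append(base + ".html")
        finally:
            # Close the document even if saving failed.
            doc.Close()
    return converted


def main():
    # Imported here because pywin32 only exists on Windows.
    import win32com.client

    app = win32com.client.gencache.EnsureDispatch("Word.Application")
    try:
        convert_all(app, page_bases("C:\\lang\\try\\bdham", 1, 1000))
    finally:
        app.Quit()
```

Calling main() on a Windows machine with Word and pywin32 installed runs the whole batch. Passing the Word application object into convert_all keeps the loop separate from the COM setup, so it can be tried on a handful of pages first.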