At my summer job I am converting a lot of large Word files to rather simple HTML documents, and then styling it all with CSS. This is easy, but tedious. So my question is this: Is there a way to convert a Word doc to a web page using only the basic structural HTML elements (paragraphs, headers, and lists, mainly)? I'll probably have to go into the .doc and use styles correctly, but I don't want to do that if there is no way to convert. I am well aware of the atrocities of Microsoft's own export, but is there a better way?
I think the quickest and cleanest way is just to copy/paste each paragraph individually into a sort of HTML template you made. It's just a combination of <ctrl> <shift> <down arrow>; <ctrl> <c>; <alt> <tab>; <ctrl> <v>; <alt> <tab>; <down arrow>; rinse and repeat... Afterwards, add the headings, lists, etc. where needed. Guess that is the quickest way of doing it. EDIT: Come to think of it, just copying the entire text to Notepad and adding the basic tags will be even faster, I think.
For <p></p> tags, search & replace on Word's paragraph markers may work (replacing each marker with </p><p> and then editing the first and last tags in the file). If headings have been done properly (as different styles) by the original typist, then 'Outline' view will isolate them from paragraph text for easier tagging. Lists and tables get harder; a hard copy of the original is advised to show you how it was intended to look.
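If you'd rather do the paragraph tagging outside Word, the same idea can be scripted. This is only a rough sketch: it assumes the document text has been pasted into a plain-text string with one paragraph per line, and the function name `paragraphs_to_html` is made up for illustration. Headings, lists, and tables would still need tagging by hand.

```python
import html


def paragraphs_to_html(text: str) -> str:
    """Wrap each non-empty line of plain text in <p>...</p> tags.

    Assumes one paragraph per line, which is what you typically get
    when pasting Word text into a plain-text editor.
    """
    lines = (line.strip() for line in text.splitlines())
    # html.escape turns &, <, > and quotes into entities so the text
    # cannot be mistaken for markup.
    return "\n".join(f"<p>{html.escape(line)}</p>" for line in lines if line)


if __name__ == "__main__":
    sample = "First paragraph.\n\nSecond paragraph with <angle> brackets & an ampersand."
    print(paragraphs_to_html(sample))
```

Blank lines are skipped, so you don't end up with empty <p></p> pairs between paragraphs.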
I'm not positive... but I think you can export from Word to HTML, then in Dreamweaver remove the extra Word formatting (it is a built-in command somewhere), then remove all of Word's style information and create your own. Dreamweaver's advanced find and replace makes it super easy to get rid of extra Word classes and styles. If you don't have Dreamweaver, you could probably find a text editor out there that allows regular expressions in its find-and-replace, which might help too (although I don't know regular expressions, so I'm not sure how well that would work).
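For what it's worth, here is roughly what the regular-expression route could look like. Treat it as a sketch under assumptions rather than a proper cleaner: it assumes the exported HTML marks its formatting with `class`, `style`, and `lang` attributes plus empty `<o:p>` tags, and the name `strip_word_attrs` is invented for this example. Regexes are fragile on HTML (an attribute value containing `>` would trip this up), so check the result by eye.

```python
import re

# Matches any single tag, e.g. <p class=MsoNormal>.
TAG_RE = re.compile(r"<[^<>]+>")
# Matches a class/style/lang attribute with a double-quoted,
# single-quoted, or unquoted value.
ATTR_RE = re.compile(
    r"""\s+(?:class|style|lang)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)
# Matches opening and closing <o:p> tags.
OFFICE_TAG_RE = re.compile(r"</?o:p[^>]*>", re.IGNORECASE)


def strip_word_attrs(markup: str) -> str:
    """Remove <o:p> tags and class/style/lang attributes from tags."""
    markup = OFFICE_TAG_RE.sub("", markup)
    # Only touch text inside tags, so body text that happens to
    # contain "class=..." is left alone.
    return TAG_RE.sub(lambda m: ATTR_RE.sub("", m.group(0)), markup)


if __name__ == "__main__":
    dirty = "<p class=MsoNormal style='margin:0'>Hi <o:p></o:p></p>"
    print(strip_word_attrs(dirty))
```

The attribute pattern is applied only inside tags on purpose; running it over the whole file would also eat any matching text in the document body.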
As OneSeventeen says, you could Save As... HTML, but the HTML is VERY messy (at least with Word 2000). How about creating a VBA macro to do it?
Firstly, thanks for all the suggestions. I have a lot of things to try now... @glider - What you first described is pretty much what I have been doing, and it certainly makes for very happy, clean code. I don't know why I don't copy it all and then add tags, though. *smacks forehead* @cpemma - How do I do a find on Word's paragraph markers? That seems like it could help a lot. Now, if I could get people to actually use styles... @OneSeventeen - I (unfortunately) have access to Dreamweaver at work. That sounds like it is exactly what I am looking for; Dreamweaver may have to become slightly less evil in my eyes.
It's a button which looks like a weird q with 2 legs (¶)... That shows the paragraph markers, and a lot more (like tabs). In Word's Find and Replace box, a paragraph marker is typed as ^p.
Aaaahhhh...that thing...I remember that now from the computer apps class... Anyways, Dreamweaver works fairly well at stripping the Word crap, and then a find-and-replace to get rid of the spans makes it almost pretty. Thanks all.
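In case anyone wants to do the span step without Dreamweaver, something like this should be close. It's a minimal sketch, not what Dreamweaver does internally: it drops every opening and closing span tag and keeps the text inside, and `unwrap_spans` is just a name I picked.

```python
import re

# Matches <span ...> and </span>, but not tags that merely start
# with "span" (the \b requires a word boundary after the name).
SPAN_RE = re.compile(r"</?span\b[^>]*>", re.IGNORECASE)


def unwrap_spans(markup: str) -> str:
    """Delete span tags while keeping whatever text they wrapped."""
    return SPAN_RE.sub("", markup)


if __name__ == "__main__":
    dirty = '<p><span lang=EN-US>Hello <span style="x">world</span></span></p>'
    print(unwrap_spans(dirty))
```

If any of the spans carry formatting you actually want (bold via a style, say), you would need to convert those by hand before running this.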
Using anything as a straight WYSIWYG editor is not professional, IMO, but Dreamweaver can be used as an awesome HTML editor with all kinds of cool tools built in. (I still haven't found a text editor with as simple an FTP client built in... meaning I can download a file, edit it, and then after saving it locally, hit ctrl+shift+u and have it upload it. Super quick and easy!) It is definitely worth it to play to the strengths of different apps. I use iTunes on my Windows partition at home to manage podcasts, since it is so easy (plus I get to check on reviews of my podcast there), but I still buy CDs from the store rather than use the iTunes music store. Of course, TBH, I don't use Dreamweaver anymore, but that's probably just because I switched to Linux at work.
I agree with you, 1:17. I had only ever used Dreamweaver as a WYSIWYG editor before and hated it. I use it at work now purely because of the integrated FTP client. Some bits are annoying, like when the popup menu comes up as you are closing a tag.
Yeah, I am starting to see that now. I guess I can join the "I hated it because of WYSIWYG, but now I like it more once I got past that" club. Granted, I still won't use it much because I mainly use my Mac at work, and I am too cheap to buy it for any of my computers. But for stuff like this, I will definitely get on a Windows machine to use Dreamweaver. I never knew about the FTP, that's cool. But we use SSH at work, and like I said, I am more cheap than lazy when it comes to getting it for my personal site.
I think Dreamweaver MX can connect over SSH (they call it SFTP) with a free add-on from Macromedia/Adobe, and MX 2004 and above should be able to use it out of the box. Of course, with SSH, you could also just mount a drive on your Mac and have at it... at least I'm assuming you can, since you can in most *nix'es.