Here are some tools I've written using AutoIt.
If you don't have AutoIt installed, I sometimes include an EXE file that can also be run. If you have AutoIt installed, you can also modify the scripts. Remember to run the AU3 file and not the EXE file when you modify the script.
Useful programs used in my scripts:
Index of scripts:
If you use AutoIt scripts on UTF8 text files without a BOM, AutoIt will read them as ANSI, so you often need a function to add a BOM to a file. See my Bommer and Bommer Reloaded scripts below, but if you want to do it in the script itself, see here for examples:
adding bom with autoit.zip (800 kb, contains EXE files)
adding bom with autoit (without EXEs).zip (5 kb, doesn't contain EXE files)
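For readers who only want to see the idea, here is a minimal sketch in Python (not the AutoIt code from the ZIP files above). It assumes the file really is UTF8; the function name `add_utf8_bom` is made up for this illustration.

```python
import codecs
from pathlib import Path


def add_utf8_bom(path):
    """Prepend a UTF-8 BOM to the file unless it already starts with one.

    Returns True if the file was changed, False if it already had a BOM.
    Assumption: the file content is UTF-8; no encoding detection is done.
    """
    p = Path(path)
    data = p.read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        return False
    p.write_bytes(codecs.BOM_UTF8 + data)
    return True
```

Calling the function twice on the same file is harmless, because the second call finds the BOM and leaves the file alone.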
This script automatically skips a segment if a certain set of characters occurs in it. The compiled script searches for "[[" or "]]", but you can customise it. Included is a script for autoskipping Trados uncleaned segments, to be used in OmegaT.
Usage: Run OmegaT, have "Tab to advance" enabled, run the script, and use Ctrl+ENTER to move to the next segment.
These are two scripts to extract all source text from the project files into a single, large text file. This is useful for sharing text with translators who may not be using OmegaT, or for doing all sorts of lexicographic stuff with the text (eg using word frequency lists to create glossaries). The two scripts are:
For creating a TMX file with all strings in it, with identical source and target.
For creating a plaintext file with all strings in it. Each string is separated from the next using two newlines (in other words, there is an open line between the segments).
Usage: Run OmegaT, run the script, and follow the prompts.
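The two output formats can be sketched in Python, assuming the source segments are already available as a list of strings (the AutoIt scripts get them from OmegaT itself, which is not shown here). The function names, the language codes and the header values are placeholders chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree writes this namespace with the built-in "xml" prefix.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_source_equals_target_tmx(segments, src_lang="en", tgt_lang="fr"):
    """Return a TMX document in which every target equals its source."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-sketch",   # placeholder value
        "creationtoolversion": "0.1",       # placeholder value
        "segtype": "sentence",
        "o-tmf": "plain text",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for text in segments:
        tu = ET.SubElement(body, "tu")
        for lang in (src_lang, tgt_lang):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


def build_plaintext(segments):
    """Join segments with an open line between them."""
    return "\n\n".join(segments) + "\n"
```

ElementTree escapes characters such as "&" in the segment text, so the result stays well-formed XML.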
This is essentially the same as ExtractTUs, but it creates a PO/POT file, which can then be processed using the Translate Toolkit.
This script adds numbers to all segments, so that they are all "unique".
This script removes all text except tags, or all tags except text.
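As a rough illustration in Python (not the AutoIt script), both directions come down to one regular expression. The pattern below assumes OmegaT-style tags such as `<f0>`, `</f0>` and `<s1/>`; that tag shape and the function names are assumptions made for this sketch.

```python
import re

# Assumed tag shape: letters followed by a number, optionally closing
# or self-closing, e.g. <f0>, </f0>, <s1/>.
TAG_RE = re.compile(r"</?[a-zA-Z]+\d+/?>")


def keep_only_tags(segment):
    """Drop all text and keep the tags in their original order."""
    return "".join(TAG_RE.findall(segment))


def keep_only_text(segment):
    """Drop all tags and keep the text."""
    return TAG_RE.sub("", segment)
```

For example, `keep_only_tags("<f0>Hello</f0> world<s1/>")` gives `<f0></f0><s1/>`, and `keep_only_text` on the same input gives `Hello world`.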
This script helps you insert glossary items using keyboard shortcuts. It only works with OmegaT 1.6x (not with 1.7x).
This is the simplest autotranslation script in my group. It basically presses "next segment" all the way to the bottom of your project. Useful if you want a project "pretranslated".
An AutoHotKey script that helps you type <s#/> and <f#> type tags in OmT when translating OOo files.
To insert the current segment's tags and/or words starting with capital letters.
With this script, you can easily add terms to a glossary while you translate in OmegaT. You can even add terms to a glossary currently used by OmegaT. Unfortunately, the new terms won't be recognised in OmegaT until you've reloaded the project. Incidentally, the script works in all programs (so you can add a source term from OmegaT but add its target term from eg a web site).
Usage: Run the script, mark the source term, press Alt+Ctrl+T, mark the target term, press Alt+Ctrl+Y, edit the term in the popup box, and continue.
See also pootle-addterm_1_2.zip further down
Want to do find/replace on a TMX file, but afraid that you might accidentally replace something in the source text too? This script interacts with jEdit to do find/replace in the target text of a TMX file. Included is also a description of how to do this without using the script.
Requires: jEdit. Usage: Run jEdit, open the TMX file in it, run the script, and follow the prompts.
If you have translated a file in OmegaT, and then the client suddenly asks for an uncleaned file, close OmegaT, run this script in the path of project_save.tmx, and reopen the project in OmegaT. Recreate the target files and convert them to DOC in OpenOffice.org. Further conversions may be required, depending on your client.
With this script, you can add terms to a glossary file, from any file. Highlight the source term, press Alt+Ctrl+T, highlight the target term, press Alt+Ctrl+T again, and there you go. The glossary file is a flat file (with columns, in other words). Choose your own delimiter. Two versions -- one version also attempts to keep the glossary open for you.
A method to translate Transit PE files if you prefer to use Wordfast's matching engine. Requires Wordfast and Transit, obviously.
A method to translate Idiom WorldServer Desktop files if you prefer to use Wordfast's matching engine. Requires Idiom and Wordfast, obviously. Very convoluted... use Idiot2desktop instead.
These two scripts paste a translation into Idiom Desktop and Idiom Worldserver. Useful if you do your translation in Wordfast or some other tool like OmegaT. I haven't tried it, but I wouldn't be surprised if you can use an altered version of this script on Transit too. Requires Idiom and a CAT tool, obviously. Well, you can use the web-based Idiom.
This script creates a bilingual untranslated TTX file from a monolingual untranslated one. Useful if you translate TTX files in Wordfast or something else. Requires a demo version of Trados TagEditor.
This little script converts a Trados TXT TM into PO format. You don't need Trados... all you need is the TXT TM.
Add terms to the Pootle glossary in real time, if you're running Pootle on Windows as root. Now with comment support. Well, you can use this with any program, but it writes a PO file, not a CSV file.
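To show what "writes a PO file" with "comment support" amounts to, here is a minimal Python sketch (not the AutoIt script) that formats one glossary term as a PO entry with an optional translator comment. The function names are made up for this example, and it only handles single-line terms.

```python
def po_escape(text):
    """Escape backslashes, double quotes and newlines for a PO string."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def format_po_entry(source, target, comment=""):
    """Return one PO entry, followed by the blank line that separates entries."""
    lines = ["# " + c for c in comment.splitlines()]  # translator comments
    lines.append('msgid "%s"' % po_escape(source))
    lines.append('msgstr "%s"' % po_escape(target))
    return "\n".join(lines) + "\n\n"
```

Appending the returned string to an existing PO file (opened in append mode, as UTF8) adds the term without touching the entries already there.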
For Swordfish CAT tool, helps create a source=target document.
This script reads one line at a time from a text file, and pastes that line into Transit PE. Your job is to go "Num+Plus" to move to the next segment. Rather use Tragic2PE2 lower down... it's got more options.
This script reads one line at a time from a text file, and pastes that line into Idiom. In Idiom, it basically goes "paste" and then "down" to the next line.
Press Ctrl+L or Ctrl+K to paste one line at a time from a file named text.txt. Dead simple. Assumes file is UTF8.
This little script converts a Wordfast TM into PO format (version 0.2 hopefully fixes a certain BOM bug).
This script reads one line at a time from a text file, and pastes that line into Transit PE. It is better than the first script because it can also repaste the previous line, and it is laptop friendly because you don't have to use the NumPad. Unfortunately I couldn't get the EXE to compile, so this one needs AutoIt installed on your computer.
This little script converts a Wordfast TM into a useable TMX-type file.
Keyboard shortcuts for the Russian text aligner TextAlign. Sorry, no skin support.
A script for taking snapshots of web pages online or offline. Useful for betatesting a localised web site -- just load a list of all possible URLs into it, and run it, and then review all the snapshots for visual errors.
Requires: * K-HTM2BMP (google for "khtm2bmp.exe")
A script that helps you proofread two documents on screen by pressing "down" in the second document for you, when you press "down" in the first one. Two versions, one more useful and more buggy than the other.
A free script for translators and users of the Venda and Sesotho languages, for typing those difficult diacritical marks more easily.
A free script for translators and users of the Twi language, for typing those difficult characters more easily.
A file enumerator that prepends numbers to your files... but you select which files, one at a time.
Create bilingual word lists using Wikipedia's interwiki links.
Bulk Scanner For ABBYY FineReader 8.0 Professional Edition. Requires ABBYY FineReader.
Do HTM2TXT en masse, without breaking the text at ends of lines. The script assumes you don't want to copy the entire file's content, but only the main content box (not the menus etc), but you can change it if that is in fact what you want. Perfect for large alignments. An older version is here. Requires a good browser, like FireFox, and Babelpad.
Warning: HTM2TXT doesn't really work, because those idiots who designed FireFox couldn't *think* straight enough to give context menu items the same keyboard shortcut regardless of context... this means that sometimes the script will press the wrong shortcut because the context is different. What would be really great is if the author of the Firefox extension "Table2clipboard" could add keyboard shortcuts (I mean, accelerators) to his menu items.
This script attempts to check whether the files in one list have the same number of lines, pair by pair, as the files in another list. Useful for checking if your text extraction has gone smoothly before doing an alignment.
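The check itself is simple enough to sketch in Python (this is not the AutoIt script). It assumes the two lists are in the same order, so that the first file of one list pairs with the first file of the other; the function names are invented for this example.

```python
def count_lines(path):
    """Count lines by reading the file in binary mode, so encoding doesn't matter."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def find_mismatched_pairs(list_a, list_b):
    """Return (file_a, lines_a, file_b, lines_b) for every pair that differs."""
    if len(list_a) != len(list_b):
        raise ValueError("the two lists must contain the same number of files")
    mismatches = []
    for a, b in zip(list_a, list_b):
        na, nb = count_lines(a), count_lines(b)
        if na != nb:
            mismatches.append((a, na, b, nb))
    return mismatches
```

An empty result means every pair has matching line counts, which is a necessary (but not sufficient) sign that the extraction went well.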
This script attempts to remove tabs, double spaces and double blank lines in multiple files. It makes backup copies of files that it edits. If you edit it, you can also use it to remove or find/replace other characters.
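A minimal Python sketch of the same kind of cleanup (not the AutoIt script). Two assumptions are made here: tabs are replaced by a single space rather than deleted outright, and the backup is written as a ".bak" copy next to the original. The function names are invented for this example.

```python
import re
import shutil


def clean_text(text):
    """Replace tabs, collapse runs of spaces, and collapse runs of blank lines."""
    text = text.replace("\t", " ")
    text = re.sub(r" {2,}", " ", text)
    # Three or more line breaks in a row mean two or more blank lines;
    # keep one blank line and preserve the file's own line-ending style.
    text = re.sub(r"(\r?\n)(?:\r?\n){2,}", r"\1\1", text)
    return text


def clean_file(path, encoding="utf-8"):
    """Clean one file in place; back it up first, but only if it changes."""
    path = str(path)
    with open(path, encoding=encoding, newline="") as f:
        original = f.read()
    cleaned = clean_text(original)
    if cleaned == original:
        return False
    shutil.copy2(path, path + ".bak")
    with open(path, "w", encoding=encoding, newline="") as f:
        f.write(cleaned)
    return True
```

To find/replace other characters, add another `replace` or `re.sub` line to `clean_text`.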
Adds a BOM (byte order mark) to a whole bunch of files, or removes it. Useful for working with Linux files on Windows or vice versa, especially UTF8. Requires the hex editor XVI32. See also "Bommer Reloaded" further down, a much better script.
Verbgetter works with one of Kevin's word lists, but I can't remember which. It is a tagged word list but all the words are all there... so to extract only verbs, this script should do it. I can't remember if the script was ever a success. I think I only ever got verbs with it.
For onehanded typists who can't reach the ENTER or BACKSPACE key, an AutoHotKey script that turns the spacebar into a hotkey.
To invite more than one person at a time using a Mailman mailing list like the ones used for Sourceforge.net.
To insert a number of standard responses at the cursor position.
For users of Novellglossaries.com, to insert terms and translations from a tab-delimited source file.
A script that tries to perform an action (such as making backups) no more than once a day, when you reboot.
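One common way to get "no more than once a day" behaviour is a small stamp file that records the date of the last run. The Python sketch below shows that technique; whether the AutoIt script works the same way is not stated, and the function name and stamp file are assumptions made for this example.

```python
import datetime
from pathlib import Path


def run_once_per_day(stamp_path, action, today=None):
    """Run action() unless the stamp file says it already ran today.

    Returns True if the action ran, False if it was skipped.
    The `today` parameter exists only to make the function easy to test.
    """
    today = today or datetime.date.today()
    stamp = Path(stamp_path)
    if stamp.exists() and stamp.read_text().strip() == today.isoformat():
        return False
    action()
    # Written only after the action succeeds, so a failed run is retried.
    stamp.write_text(today.isoformat())
    return True
```

Started at every boot, this runs the action on the first boot of the day and does nothing on later reboots that day.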
If your PO files have strings marked as "fuzzy" but you're certain they're okay, you can use this script to remove the "#, fuzzy" indicators from it. The script requires BOM'ed files (see "Bommer Reloaded").
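The core of such a cleanup can be sketched in Python (not the AutoIt script, and without the BOM requirement). It removes "fuzzy" from every flag line but keeps other flags such as "c-format"; the function name is invented for this example.

```python
def remove_fuzzy_flags(po_text):
    """Strip the fuzzy flag from every '#,' line of a PO file's text."""
    out = []
    for line in po_text.splitlines(keepends=True):
        if line.startswith("#,"):
            body = line.rstrip("\r\n")
            ending = line[len(body):]          # keep the original line ending
            flags = [f.strip() for f in body[2:].split(",")]
            flags = [f for f in flags if f and f != "fuzzy"]
            if not flags:
                continue                       # the line held only "fuzzy"
            line = "#, " + ", ".join(flags) + ending
        out.append(line)
    return "".join(out)
```

So a line reading "#, fuzzy" disappears completely, while "#, fuzzy, c-format" becomes "#, c-format".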
If you need your UTF8 files to have byte order marks (BOMs), which is a typical need on MS Windows, then this script will check and convert your files. Files with BOMs are skipped, but files without BOMs are backed up and then get their BOMs in place. The script uses Robert Bachmann's DumpHex and Michael Paul Thornbury's HexAlter (packages included, read the licences).
If you have one of those stupid multi-media keyboards and you keep hitting the extra buttons accidentally, this script is for you. Put it in your Startup folder, and it disables all those fancy fancykeys (except for the sleep and shut-down keys, which will haunt you still).
This is a first generation machine translation system. It is ultra slow and it makes many mistakes.
This script uses the GNU utility recode.exe to convert a directory of files from Latin-1 to UTF8. The script does not add byte order marks. If a file is already UTF8, the script is likely to keep the file as-is. The script can also create backups. Filenames have to be "English".
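The "keep the file as-is if it is already UTF8" behaviour can be illustrated in Python. This sketch uses Python's own codecs instead of recode.exe, so it shows the idea rather than what the AutoIt script does; the function name is invented for this example.

```python
def latin1_to_utf8(data):
    """Convert Latin-1 bytes to UTF-8 bytes, leaving valid UTF-8 untouched.

    Like the script described above, this adds no byte order mark.
    """
    try:
        data.decode("utf-8")
        return data            # already valid UTF-8 (this includes plain ASCII)
    except UnicodeDecodeError:
        return data.decode("latin-1").encode("utf-8")
```

The check is only "likely" to be right: a Latin-1 file whose accented bytes happen to form valid UTF8 sequences would be left unconverted, though that is rare in real text.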
TXTligner does not actually align text, but it helps you copy text from two or more sources that can later be used in an alignment procedure.
Create bilingual word lists using Wikipedia's interwiki links, but unlike Wikterma above (that retrieves the HTML files from the web), Wikterma_XML works on the dump files, and extracts all terms (not just the ones you specify).
To automate pofilter and pomerge in such a way that you can drag and drop the files. Warning... I've since learned that this script really only works if your PO project is 1 file big. It doesn't work for bigger projects.
To automate pogrep and pomerge in such a way that you can drag and drop the files. Warning... I've since learned that this script really only works if your PO project is 1 file big. It doesn't work for bigger projects.
To automate pofilter to create PO files with fuzzies only and untranslateds only (unwrapped). The script does not do POT2PO -- you have to do that yourself. Warning... I've since learned that this script really only works if your PO project is 1 file big. It doesn't work for bigger projects.
This script does pogrep on a PO file using a list of words from another file.
This script does msgunfmt (mo2po) on a number of files from a list.
This script creates a TMX file from a number of PO files, then translates a number of files using the TMX file.
This script removes strings from the OpenProj menu.properties file that shouldn't be translated, so you can run prop2po on it. Well, you can edit it and change it for any other file also. A very, very simple script.
These three scripts are for Pootle administrators who don't have access to the Pootle users list except via the web site. The users list is at yourserver/admin/users.html. The script pootleusers2longlist.au3 converts the HTML file into a tab-delimited file called longlist.txt. If you have a short list of users called shortlist.txt and you want their details from longlist.txt, use the script longlist2finallist.au3 to do that. The script longlist2maillist.au3 does something similar to longlist2finallist.au3, except that it creates a list you can paste in your e-mail client's BCC field.
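The last two steps can be sketched in Python (not the AutoIt scripts). The column layout of longlist.txt is not documented here, so the sketch simply assumes the user name is in the first column and the e-mail address in the third; both positions, and the function names, are assumptions you would need to adjust.

```python
def split_row(line):
    """Split one tab-delimited line into its columns."""
    return line.rstrip("\r\n").split("\t")


def select_users(longlist_lines, shortlist_names, name_column=0):
    """Return the longlist rows whose user name appears in the shortlist."""
    wanted = {name.strip() for name in shortlist_names if name.strip()}
    selected = []
    for line in longlist_lines:
        row = split_row(line)
        if len(row) > name_column and row[name_column] in wanted:
            selected.append(row)
    return selected


def bcc_field(rows, email_column=2):
    """Join the e-mail addresses of the rows into one pasteable string."""
    return ", ".join(
        row[email_column]
        for row in rows
        if len(row) > email_column and row[email_column]
    )
```

The rows come back in longlist order, not shortlist order, and short or malformed lines are skipped silently.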
This is a collection of four scripts that goes to work on a Pootle HTTP log file. It creates (a) a list of all users (unsorted, multiples included), (b) a list of all users (sorted, multiples included), (c) a list of all users (sorted, uniques only) and (d) a stats file that counts how many lines for each user occurred in the log file. I recommend you also get grep.exe from UnxUtils if you do this a lot.
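The four outputs are easy to picture with a short Python sketch (not the AutoIt scripts). Pulling the user names out of the log depends on the log format, which isn't described here, so the sketch assumes you already have one user name per log line; the function name is invented for this example.

```python
from collections import Counter


def summarise_users(users):
    """Return the four lists (a) to (d) for a sequence of user names."""
    all_unsorted = list(users)                  # (a) unsorted, multiples included
    all_sorted = sorted(all_unsorted)           # (b) sorted, multiples included
    unique_sorted = sorted(set(all_unsorted))   # (c) sorted, uniques only
    counts = Counter(all_unsorted)
    stats = [(user, counts[user]) for user in unique_sorted]  # (d) lines per user
    return all_unsorted, all_sorted, unique_sorted, stats
```

Each returned list can then be written to its own text file, one item per line.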
This script spell-checks each segment in Kastrul directly after the user moves to the next segment. Kastrul basically outputs a list of misspelt words, but it is very easy to create your own spelling dictionaries for it.
Requires: Kastrul. Usage: Run OmegaT, have "Tab to advance" enabled, run the script, run Kastrul. Use Ctrl+ENTER to spell-check the current segment, or to get back to OmegaT to edit the segment.
This script spell-checks each segment using three UnxUtils tools, but it mimics the way Kastrulspell works.
Requires: three UnxUtils tools. Usage: see enclosed file.
This script spell-checks each segment using Spellcatcher. I can't remember how it works, but I know it did work.
Requires: Spellcatcher (not freeware). Usage: see enclosed file.
This script can be used for spell-checking or for machine translation or for Google searches etc. I haven't tested it in a long time.
Requires: a file called myfile.bat, in which whatever is defined.
This isn't an AutoIt script, but a method for spell-checking a TMX file in MS Word. Included is a macro, an MS Word document with tw4winExternal style in it, and a readme file.
This method with script was originally written in AutoIt v2, but I've rewritten it in Au3 and compiled it. It was also originally written for an earlier version of OmegaT, so you may have to tinker with it (or with the instructions). The misspelt words are displayed in an MS DOS window.
Requires: Winkey, Aspell. Usage: Run OmegaT, run Winkey, open an MS DOS window and go to the right directory and change the prompt to nothing, and then use Winkey-ENTER to move to next segments.
This is a longwinded description of how to use OmegaT to translate Trados uncleaned documents. It is fairly comprehensive, and includes a few macros. Theoretically, all things being perfect, you should be able to take an MS Word file and produce a translated Trados uncleaned file, using only free tools. Read it in conjunction with omt2trados3 (which contains the segmentation rules necessary to translate uncleaned files in OmegaT).
This is an older description of how to use OmegaT to translate Trados uncleaned documents. It includes a script, but I really can't vouch for it.
This is possibly the newest description of how to use OmegaT to translate Trados uncleaned documents. It includes the OmegaT segmentation rules to be used for Trados files.
This is a very complicated script that works only on my computer. It does the same as Omtextract, except that it attempts to get some of the information automatically (Omtextract prompts the user for the information).
This is a very complicated script that works only on my computer. It is meant for launching OmegaT from the commandline, launching projects into OmegaT from the commandline, and launching projects into OmegaT using drag-and-drop.
This isn't a complicated script, but it rather depends on your installation of OmegaT. It attempts to replace your OmegaT prefs file with one located in the script directory. It also creates a backup of your existing prefs file. Usage: Close OmegaT, put your preferred prefs file in the same directory as the script, and run the script.
Samuel Murray
2006