Samuel Murray's page of AutoIt scripts
Introduction
This page contains a
selection of my scripts (programs), most of which are written in the AutoIt
scripting language. AutoIt is freeware
and it runs on Windows only. To run most
of the scripts, you must have AutoIt's interpreter installed. A few of the scripts are also available in
executable format (.exe files), which anyone can use even if they don't have
AutoIt on their computer. AutoIt is a
very simple language, and you're welcome to modify my scripts to suit your
needs.
Some of the scripts
here represent my early days of programming, and are not very
sophisticated. Even my fancy scripts can
be quite primitive. Most of my scripts
were written for use on my own computer, for my own purposes, and are therefore
not guaranteed to work on any other computer.
Some of the scripts really do need proper help files, but I know how to
use them myself.
A few years ago,
AutoIt did not support Unicode and did not work with 64-bit programs, so
some of my earlier scripts do not support these things either.
Not all scripts and
macros on this page are in AutoIt. Some
are in AutoHotKey, which is more suited for creating shortcuts. Some are MS Word macros or Excel macros, and in
one or two cases it is simply a description of a process. Some of the scripts are no longer relevant
but are given here for historical reasons (e.g. because the function that a script tried to
emulate later became available in the software that the script was written for).
http://www.autoitscript.com/site/autoit/downloads/
Answers to frequently asked questions
1. There are no
viruses in the EXE versions of my scripts.
The 32-bit version of AutoIt scripts is compressed using a method that
is also used by some virus writers, but that does not mean that AutoIt scripts
are viruses. If your anti-virus program
blocks the script, try using the 64-bit version, or try using the AU3 version
of the script after you've installed AutoIt itself.
http://en.wikipedia.org/wiki/UPX
2. The EXE version of
a script is not simply a launcher for the AU3 file. It is a fully compiled version of the
script. So, if you edit the AU3 file,
then you have to run the AU3 file (not the EXE file), which means that if you
edit my scripts, you have to have AutoIt installed on your computer to make
them work.
3. Many of the
scripts are automation scripts, which means that they perform clicking and
keypressing on your behalf. However, the
script can't actually see your screen; it clicks and presses the keys
blindly. This is why it is important
that when you run such a script, nothing else becomes the active window while
that script is running. Some scripts
have safeguards built in, but they only work 99% of the time. If you're nervous before you run any of the
scripts, simply deactivate any programs that might produce popups, and look at
the screen while the script is running.
4. While an AutoIt
script is running, you'll see a round blue icon in the system tray. Click or right-click it to pause the script
or to exit the script (unless the script comes with a built-in exit
shortcut). Some scripts do their stuff
and then exit by themselves, but others stay resident until you kill them.
Various non-translation related scripts
If you need a certain
action to be performed only once a day, no matter how often you reboot,
this script can run it for you and check that it runs only once a day. Pop it into your Start menu's startup folder,
and it will check every time you reboot whether the action has been done on
that day.
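The once-a-day check can be sketched as follows. This is a Python illustration, not the script's actual AutoIt code; the stamp file and the function names are assumptions made for the example. The idea is simply to remember the date of the last run and compare it with today's date.

```python
import datetime
from pathlib import Path


def should_run_today(stamp_file, today=None):
    """Return True if the stamp file does not already record today's date."""
    today = today or datetime.date.today()
    stamp = Path(stamp_file)
    if stamp.exists():
        recorded = stamp.read_text(encoding="utf-8").strip()
        if recorded == today.isoformat():
            return False
    return True


def run_once_a_day(action, stamp_file, today=None):
    """Run action() unless it was already run today; return True if it ran."""
    today = today or datetime.date.today()
    if not should_run_today(stamp_file, today):
        return False
    action()
    # Record the date only after the action has finished without raising.
    Path(stamp_file).write_text(today.isoformat(), encoding="utf-8")
    return True
```

Because the date is written only after the action completes, a failed action is retried at the next reboot on the same day.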
A proZian named Nessie complained that she had to type retorts over and over
and over again in a file that she was reviewing, so I wrote this script which
would have helped her to paste a set number of retorts with a keypress.
If you have many
files and you want to prepend them with numbers, so that they will sort
alphabetically in that sequence, this script is for you. Simply select a file and press the shortcut,
and the script will add the next number to the file name. An interesting feature of this script is that
the source code is inside the EXE file (you have to run it to get it).
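How the "next number" might be chosen can be sketched in Python (not the script's own AutoIt code). The separator characters, the zero-padding width and the function names are assumptions for the example.

```python
import re

# A numeric prefix followed by a space, underscore or hyphen (assumed convention).
PREFIX = re.compile(r"^(\d+)[ _-]")


def next_prefix(existing_names, width=3):
    """Return the next unused number, zero-padded so that names sort correctly."""
    highest = 0
    for name in existing_names:
        match = PREFIX.match(name)
        if match:
            highest = max(highest, int(match.group(1)))
    return str(highest + 1).zfill(width)


def numbered_name(name, existing_names, width=3):
    """Prepend the next number to a file name."""
    return f"{next_prefix(existing_names, width)}_{name}"
```

The zero-padding matters: without it, "10_x.txt" would sort before "2_x.txt" in an alphabetical listing.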
This AutoHotKey script
attempts to scroll down in two separate programs. This is useful if you want to proofread
something in two windows and you want both windows to scroll down at the same
time. I think if I rewrote this in
AutoIt it could be much more user-friendly.
This script pops up a
message after 8 minutes, so that the user can calculate how much work he did in
those 8 minutes. The user has 2 minutes
in which to do the calculation. After 2
minutes, the next 8 minutes' time begins.
If you want to see how fast you do a particular task, this script is for
you.
Suppose you must do
something in a slow program (e.g. a remote desktop session over a slow internet
connection) and you want to save time by doing stuff in another window while
you wait for the slow program to wake up.
You might forget to check the slow program if you get engrossed in the
fast window's work. This script
periodically restores a specific slow window while the user works on something
else in a faster window. The
"withmedia" indicates that this version also detects your media
player (if you're watching a movie while you wait) and pauses the film (if you
set up the shortcuts right).
A pair of scripts for
moving windows around the screen.
Middlesizer can
maximise the active window's width, resize the active window's width to 90%, centre
the active window vertically, resize it to roughly 90%, 75% or 60% of the screen
height, or move the active window to one of four positions around the screen
(depending on whether you have a portrait or landscape screen).
Twinwindowsizer is
useful for comparing two files side by side or above and below, because it
resizes the window to 50% of the screen and puts it left, right, above or
below.
This script changes
the selected text to lowercase, or if nothing is selected, it selects all text
and then changes it to lowercase. It attempts
to preserve the clipboard.
ZNG Numpad (for non-US laptop users)
Many Afrikaans
translators type special characters like ê and ë using the Alt key plus the
numbers on the number pad. However, many
laptops don't have a number pad. Some
laptops have a fake number pad that works if you press a special key, but those
number pads typically don't work with the Alt key. With this script, by pressing Alt + Windows
Key + a letter on the keyboard, you can type those Alt codes again.
ZaKey (for black South Africans)
Some South African
languages (Venda, Tswana, Northern Sotho and Zulu) use some pretty odd
characters that can't be typed using a normal keyboard. This script introduces dead key combinations
for those languages. With it, you can
type L, T, D and N with a circumflex, N with a dot, and S with a caron.
TwiKey (for speakers of the Twi language)
TwiKey does for Twi
what ZaKey does for Venda, Tswana, Northern Sotho and Zulu. Twi is a language from West Africa.
LilyEnter (for one-handed typists)
This script is for
one-handed typists who want to use a standard sized keyboard and who don't want
to leave the home row. It adds the
ability to type Tab and Enter by using the spacebar as a modifier (in other
words, use the space bar as one would use the Shift or Ctrl key). I wrote it after I became aware of one-handed
typists and their difficulties.
This script attempts
to prevent focus stealing of windows.
If you have a
multimedia keyboard with lots of extra fancy keys that you keep accidentally
pressing, this script will disable them for you (hopefully).
Idiom Hotkeys (not just for Idiom but for any
editor)
This script was
originally written to help translators who use Idiom to write certain special
characters. Idiom does not have a
character map of its own, and many of the shortcuts that produce these
characters do not work in Idiom.
Examples include an en dash, three dots, a non-breaking space, and curly
quotes. The script also works in e.g. MS
Word, except if you have smart cut/paste enabled. When I want to type smart quotes, I run this
script.
Merge and Split Files (for plaintext
and XML files)
Imagine you have
several TTX or TXML or HTML files (sometimes in multiple sub directories) that
you would like to process as a single file (e.g. in Wordfast Classic). These scripts will merge all those files
together in a single file in such a way that it can be split later in a folder
of your choice, with its original folder structure intact, and with the file
names intact.
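One way to make such a merge reversible is to write a marker line carrying each file's relative path before its content. The Python sketch below illustrates the idea; the marker text, the "*.txt" pattern and the function names are assumptions for the example and are not taken from the scripts themselves.

```python
from pathlib import Path

MARKER = "#### FILE: "  # hypothetical marker; must never start a real line of text


def merge_files(root, pattern="*.txt"):
    """Join all matching files under root into one string, with path markers."""
    root = Path(root)
    chunks = []
    for path in sorted(p for p in root.rglob(pattern) if p.is_file()):
        text = path.read_text(encoding="utf-8")
        if text and not text.endswith("\n"):
            text += "\n"  # make sure the next marker starts on its own line
        chunks.append(f"{MARKER}{path.relative_to(root).as_posix()}\n{text}")
    return "".join(chunks)


def split_files(merged, out_root):
    """Recreate the files and sub-folders under out_root; return the paths."""
    out_root = Path(out_root)
    buffers = {}
    current = None
    for line in merged.splitlines(keepends=True):
        if line.startswith(MARKER):
            current = line[len(MARKER):].rstrip("\n")
            buffers[current] = []
        elif current is not None:
            buffers[current].append(line)
    for rel, lines in buffers.items():
        target = out_root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("".join(lines), encoding="utf-8")
    return sorted(buffers)
```

In this sketch a file that lacks a final newline gains one on the round trip, and the marker lines must survive the translation tool untouched.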
This script was
written for TuxPaint but was used by me for other jobs as well. It helps you to record short phrases that you
speak into the computer, and save them using file names that you have chosen
beforehand. If you have a thousand
phrases to record, this script will save your life.
A perhaps
unsuccessful attempt at creating a video screen capturer. It takes a screenshot every few seconds,
while recording the sound. If you have a
slideshow creator program, you can then create a slideshow. Actually it needs work.
KeeKee takes a full
screen screenshot every minute, or optionally every time the focus
changes. This is useful for keeping
track of what you do all day, or to keep tabs on what your children do on the
computer.
This script tries to
switch off the computer if the mouse hasn't moved for 5 minutes. It warns the user for 30 seconds. Useful for children who forget to turn off
the computer.
If a Windows user
can't tell you what his file's extension is, let him download this little
utility, which attempts to tell him that.
Various unsorted translation related scripts
Add Term Flat/Note (creates a tab-delimited
file)
Two versions of my AddTerm
script, which is useful for translators who create glossaries. The version called "note" will
reopen your glossary in Notepad every time you add a new term.
These scripts attempt
to fix broken TMX files by fixing broken entities or removing characters that
should not be present in XML. This is
useful if your CAT tool tells you that it can't read the entire TMX file
because of one teeny tiny little error somewhere in one of the segments.
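The two kinds of repair mentioned here can be sketched in Python. This is an illustration under stated assumptions, not the scripts' own code: it removes control characters that XML 1.0 does not allow and escapes ampersands that do not start a well-formed entity. The function and pattern names are made up for the example.

```python
import re

# Control codes other than tab, LF and CR, plus U+FFFE and U+FFFF,
# are not allowed in XML 1.0 documents.
ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")

# An ampersand that is not followed by a complete named or numeric entity.
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def repair_xml_text(text):
    """Drop illegal characters, then escape ampersands that break the parser."""
    text = ILLEGAL_XML_CHARS.sub("", text)
    return BARE_AMPERSAND.sub("&amp;", text)
```

Note that an entity missing its semicolon (such as "&amp" on its own) is treated as literal text and escaped, which makes the file readable again but may not be what the original author of the TMX intended.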
Glossary Look-up 4 Idiom (for Idiom, but can
be adapted for other tools)
One problem with
Idiom is that users can't add their own glossaries. So if there is a project-wide glossary
that must be followed, but the client refrains
from adding it to Idiom, translators are out of luck, unless they have some way
to look up words in the glossary manually.
This script looks up words from Idiom in the glossary and outputs the results to
the web browser. It can be easily
adapted for any other tool.
GlossLookup v2 (the original GlossLookup
script)
This is the original
GlossLookup script that works anywhere, but the results don't look very
pretty. This version comes with EXE
files.
Various Wordfast scripts
Glossary Rechecker (for Wordfast
Classic)
Wordfast Classic
can't do glossary QC on partial word matches. This means that the QC report
contains many false positives. This
script takes that QC report and rechecks it and flags all matches that really,
really do not match. The script is still a bit buggy but it is a
tremendous help as it is.
Wf TM Anon[imizer] (for Wordfast Classic and Pro)
This is a Wordfast TM
anonymiser. If you get a TM from a
client in which many different user IDs occur, this script will change all
those user IDs to one ID (and store the original IDs as attributes). The script can also undelete deleted
segments, and you can update the language codes to something more specific, if
you want.
Bilingual Review Fixer (for
Wordfast Pro)
One problem with
Wordfast Pro's "bilingual review" is that segments with different match
percentages are marked by background colour only, and it is therefore not
possible to sort the table by match percentage.
For example, if you want all 100% matches together, all 0% matches
together, and all fuzzy matches together, you can't do that in MS Word
alone. This script changes it so that
you can sort the table in MS Word by match type. It is a teensy little complex, but it works.
TXML word counter (for Wordfast Pro)
Wordfast Pro can't
count words of translated TXML files (at the time of writing). This script extracts all target text from a
TXML file so that you can count it in MS Word.
Extract Uniquer (for Wordfast Classic)
Wordfast Classic can
extract all translatable segments from a file, and it can show either all
segments or all repeating segments, but it can't show all segments with
repetitions reduced to one line per repetition.
This script processes the extraction text file of a Wordfast Classic
extraction so that no segment in it is repeated, and so that the first instance
of a repeating segment occurs in its actual position in the file.
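The core of this processing is an order-preserving de-duplication, which can be sketched in Python as follows (an illustration only; the function name is an assumption, and the real script also has to deal with the extraction file's own format).

```python
def unique_segments(lines):
    """Keep the first occurrence of every segment, in its original position."""
    seen = set()
    result = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            result.append(line)
    return result
```

For example, the segments "a", "b", "a", "c", "b" are reduced to "a", "b", "c".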
Various OmegaT scripts
In Which Folder (for OmegaT and for Virtaal)
There are two scripts
in this package -- one for OmegaT and one for Virtaal. The user directories for these programs
differ, depending on your operating system version, and these scripts find the
right directory for you, and open it. So
if you need to find the scripts folder of OmegaT, this script does it. Or if you need to find the error log of
Virtaal, this script does it for you. It
is also useful that it opens the folder, because sometimes it is a hidden
folder that you would otherwise not have access to.
Uncleanify TMX (for OmegaT)
Fewer and fewer
clients use the "uncleaned RTF" format of translation text, but there
was a time when it was quite common, and users of OmegaT were at a disadvantage
for not being able to deliver in that format.
This script is a simple solution that works in 99% of cases. It magically turns any OmegaT translation
into an uncleaned file by editing the project's TMX file. Known bugs include hyperlinked files.
Text/Tag Rem[over] (for OmegaT)
This script takes the
input text (from your target segment in OmegaT) and removes either all the tags
from it (leaving only the text) or removes all the text from it (leaving only
the tags). This version works by copying
the current target field, but in the meantime OmegaT has gained a scripting feature
that would make it possible for me to write a version of this script that works
in the background.
Tag Grabber (for OmegaT)
This script works
with OmegaT's scripting feature, which means that it is capable of working
"in the background" without visibly copying the source text into the
target field just to work. As of
writing, OmegaT still has no shortcut for manipulating tags, and this script
helps paste those pesky tags in the right positions. The script grabs not only tags but also words
that start with an uppercase letter.
Seg[ment] Adder (for OmegaT)
As of writing, OmegaT's
segmentation rule feature is still extremely complex and complicated to
use. And merging and splitting segments
in OmegaT can only be done by adding segmentation rules. So, this script attempts to add segmentation
rules directly to the rules file, bypassing OmegaT entirely. It is kludgy because it creates 100
non-working rules and then edits them one by one as the user adds new rules,
and the user has to reload the project every time as well. But it is 100 times simpler to use than to
add the rules from within OmegaT itself.
AutoSkipper (for OmegaT, can be adapted for
elsewhere)
This script will help
you to skip a segment if that segment contains certain characters, in
OmegaT. It uses the target field, but I
suppose I could rewrite it for use with OmegaT's scripting file feature.
Omt Add Term (for OmegaT)
I wrote this script
before I knew how to encode a tab character
:-). In the current version of
OmegaT, you can add terms to the glossary using a shortcut, but this script was
written in the days when that was not possible.
You select a word, press a key, select another word, press another key,
and then the script adds your term to the glossary.
omtFolderize (for OmegaT)
This script will
change an OmegaT project folder's icon to something else, so that you can
easily identify them. There is also an
optional sample registry that shows how to add an "Open with OmegaT"
right-click menu in Windows (very primitive).
Omt OOO Format (somewhat outdated, for OmegaT)
This script is for
quick typing of OpenOffice.org formatting tags in OmegaT. I wrote this at a time when OmegaT's tagging
was a little more primitive and when ODT was about the only formatted text that
OmegaT supported.
Omt Extract (outdated, for OmegaT)
This is a pair of
scripts that I used long ago to extract the source text from a running OmegaT
project. These days, you can use a
commandline in OmegaT itself to accomplish this, or you can use the Find dialog
and search for *. But in the early days,
this was the only way to extract text from OmegaT.
Wordfast2TMX (outdated, converts to TMX)
I think I wrote this
script because I used to work a lot with translation tools that didn't like
UTF8 files to have a byte order mark, and most/all existing converters for
Wordfast translation memories to TMX added it, and it was an effort to remove
it each time I converted a TM. This
converter is half-baked, however, because it doesn't convert any entities or
tags.
Three "uncleaned RTF" methods
for OmegaT (unfinished)
These are three early
attempts to create methods to use OpenOffice.org and OmegaT to create or to
translate uncleaned RTF files.
GlossPaste (no longer works, for OmegaT)
Here is version 1.0
and version 1.1 of my fantastic glosspaste script. This script made it possible to insert
glossary matches in OmegaT using only keyboard shortcuts. This feature is still not available in OmegaT
(although it is now possible to insert glossary matches using a mouse
click). Both these scripts died an
instant death when the OmegaT developers changed the format of the glossary
match window to something that could not be parsed in plain text.
CycleAddNum (no longer works, for OmegaT)
I don't understand
how this script could have ever worked.
The theory is that it adds numbers to all segments, thereby making them
unique, but the script used OmegaT, which by definition could not perform this
action.
BulkFindReplace (outdated, for OmegaT)
I wrote this script
before I knew how to do regex in AutoIt.
It is for doing find/replace in a TMX file. As of writing, OmegaT still doesn't have this
feature.
OmtLauncher (no longer works, for OmegaT)
These days, you can
launch OmegaT from the commandline and you can specify various things, such as
which project to open, but in the days when I wrote this script, there was no
such function. This was a launcher that
tried to launch OmegaT and open specific files or projects in it.
PrefsChange (no longer works, for OmegaT)
These days, you can
save different program and user settings in OmegaT, but when I wrote this
script, it was not possible. This script
attempts to change the settings of the program, so that project managers could
specify certain settings for translators.
TMextract (no longer works, for OmegaT)
My other scripts that
extract segments from an OmegaT project all ask the user how many segments
there are in the project, but I tried to make a script that would figure out
that number by itself. It turned out to
be a nightmare.
Autotranslate (outdated, for OmegaT)
This script helped pre-translate
a file in OmegaT. This was written
before OmegaT had the /tm/auto/ feature, which accomplishes much the same
thing. I suppose one could write a
similar script that will also insert non-100% matches.
Various Pootle, Translate Toolkit and Gettext
PO scripts
Remover Fuzzy (for PO files)
This script attempts
to take all segments in a list of PO files that are marked as "fuzzy"
and turn them into non-fuzzy segments. I
needed this because I used some CAT tools that would edit PO files without
updating the fuzzy status, even though the segments were no longer fuzzy.
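In a PO file the fuzzy status is stored on a flag line that starts with "#,", sometimes together with other flags such as "c-format". A minimal Python sketch of clearing it (not the script's actual code; the function name is an assumption) could look like this:

```python
def clear_fuzzy(po_text):
    """Remove the 'fuzzy' flag from every '#,' line, keeping other flags."""
    out = []
    for line in po_text.splitlines():
        if line.startswith("#,"):
            flags = [flag.strip() for flag in line[2:].split(",")]
            flags = [flag for flag in flags if flag and flag != "fuzzy"]
            if not flags:
                continue  # 'fuzzy' was the only flag, so drop the whole line
            line = "#, " + ", ".join(flags)
        out.append(line)
    # Preserve the trailing newline if the input had one.
    return "\n".join(out) + ("\n" if po_text.endswith("\n") else "")
```

This sketch leaves any "#|" previous-string comments in place; whether those should also be removed depends on the tools that read the file afterwards.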
PO Zapper (for Translate Toolkit)
This is a pair of
scripts that work with the Translate Toolkit to unpack PO files into separate
files with untranslated and fuzzy strings, and to merge them back into the
mother files when they are translated or checked. This is useful when you simply must do the
translation in PO format and you work with many, many different files in
different subdirectories. These are
drag-and-drop scripts, so you need to compile EXE versions of them.
Pootle Users Get (for Pootle administrators)
For one year, I
was an administrator of a Pootle server, and at that time (and still?) there
was no simple way to get a list of all users from the server. This script extracts such a list from one of
the public administration web pages.
Thanks to this script the Pootle server administrator can then send
mails to all users.
Pootle Log Get (for Pootle administrators)
In Pootle (when I
used it), there was no easy-to-access record of which users did which
translations, or an easy way to get statistics about how many strings were
edited by how many translators from how many languages. The only way to get this information was to
extract it from the web server's POST weblogs.
These four scripts are to be used one after the other to process such a
web log.
PO Grepper (for Translate Toolkit)
This is a series of
scripts that performs the pogrep command on a number of PO files. Pogrep is a utility from the Translate
Toolkit that enables you to extract small PO files from large PO files with
only the segments that contain the word or phrase that you had grepped
for. These are drag-and-drop scripts, so
you need to compile EXE versions of them.
PO Greepie (for PoEdit)
This script performs
pogrep on a PO file or PO files to produce smaller files that can be opened in
PoEdit. I think I must have created this
script to enable translators to create glossaries using information from larger
PO files. This is a drag-and-drop
script, so you need to compile an EXE version of it. The script's comments say that the procedure
was meant to be one-way, so I really
can't remember why on earth I wrote it.
PO Filter Merger (for Translate Toolkit)
This script performs
pofilter and then pomerge on a directory of PO files, using selected pofilter
checks. Pofilter is a utility from the
Translate Toolkit to perform quality checking of translations on PO files. These are drag-and-drop scripts, so you need
to compile EXE versions of them.
Pootle AddTerm (for Pootle home-users)
This is a glossary
script that allows you to add terms to a glossary, by selecting the terms and
then pressing a shortcut. This one works
in Pootle, if you run Pootle on your own computer. It requires admin rights on Pootle, so it
really only works if you use Pootle as a CAT tool on your very own computer.
Moo2Poo (for PO files)
The readme file says
it all: "Moo2poo does msgunfmt on a list of files." Basically, it turns a bunch of MO files into
PO files. MO files are compiled PO
files. PO files are human-readable
translation files. Some programs that
use Gnu Gettext are shipped with only MO files and no PO files. The ZIP file contains some extra program
files, but they're all GPL.
MakePOTty (for PO files, via OmegaT)
This script ran
through an OmegaT project and saved its segments as an untranslated PO
file. I suspect I was trying to find a
way to translate ODT files in Pootle, via OmegaT.
DoDosDoo (for PO files)
At one time I had to
extract a translation memory from large numbers of PO files and then use that
translation memory to pretranslate a second set of PO files... for many
different languages. This script did it.
Wordfast TM2PO (convert to PO file)
This is a rush job to
help convert Wordfast translation memories into the Gettext PO format. One reason for that is that it enabled me to
use Zuza Software's Translate Toolkit, which does amazing things with PO
files. The script contains a few safety
features for problems that existed at the time, but I don't know if they are
still relevant.
TradosTXT2PO (convert to PO file)
The CAT tool Trados
2007 and earlier used a translation memory format called "TXT", for
which I could not find any converters.
At the time that I wrote this script, many Trados-using clients would
send translators this file instead of a TMX file. The Trados TXT translation memory is simple
enough, and this script attempted to convert it to PO format. The script worked for a while and then no
longer worked -- I can't remember what I changed and why it broke. There is also a version that produces a
tab-delimited file instead of a PO file, called tradostxt2tab.zip.
Alignment related scripts
I wanted to create a translation
memory by aligning files that I downloaded from the web. Many of these files were PDF files. Now, if you convert a PDF file to text using
an OCR program (even if it is an editable PDF), then there are none of those
pesky line breaks. So I wrote this
script that read from a list of PDF files and interacted with a version of
Abbyy FineReader to produce text files that I could use in the alignment. The reason I could not simply drop all the
PDF files into the OCR program was that the OCR program would then treat all
the PDF files as a single project, and output a single text file, which is
useless.
This script was based
on the ABBZZ script and is for the same purpose, but doesn't make use of Abbyy
FineReader at all. I wanted to convert
HTML files to plaintext files, to align them into a translation memory. Now, the odd thing is that all freeware
programs that convert HTML to text insert line breaks into the text. The only way I could figure out to convert a
web page to text without line breaks, is to select the text in a web browser
and then copy/paste it to a text file manually.
This script tried to automate that.
There is also an older, simpler version of it called htm_abby.zip.
This script compares
two lists of files to ensure that the files themselves have the same number of
lines in each pair of files. This is
necessary for some types of alignment. It
also helps identify files that didn't extract successfully, based on
differences in the number of lines.
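The comparison itself can be sketched in Python as follows. This is an illustration, not the script's own code; it assumes the two lists are already in matching order, and the function names are made up for the example.

```python
def count_lines(path):
    """Count the lines in a text file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return sum(1 for _ in handle)


def mismatched_pairs(source_files, target_files):
    """Return (source, target, source_lines, target_lines) for unequal pairs."""
    mismatches = []
    for src, tgt in zip(source_files, target_files):
        n_src, n_tgt = count_lines(src), count_lines(tgt)
        if n_src != n_tgt:
            mismatches.append((src, tgt, n_src, n_tgt))
    return mismatches
```

A pair with a large difference in line counts is a good candidate for a file that did not extract successfully.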
If you want to create
a translation memory by aligning texts that you got from the web, this script
will help with copying the text and saving it with useful names. Sometimes web sites are available in more
than one language, and so you could use the multiple language pages to create a
TM, but it is not always simple to get
those pages and to extract the text from them.
With this script, you simply browse to the two (or three) versions of a
page, and click in it and press a shortcut, and the script will save the pages
in plaintext format.
This script helps
extract text from two PDF files (one in each language) so that each page in the
PDF is a separate file (well, two files, because there are two languages). This makes it easier to align the files for
creating a TM, because if any page is mangled, it mangles only that pair of
files.
This is the beginning
of my own HTM2TXT program. It is much,
much, much more useful than any other HTM2TXT program that I have found so far.
At the time when I
wrote this script, there weren't many freeware aligners in the translation
industry, and when the program called TextAlign became available (in Russian
only), I created this script to help people use the menus. It basically clicks the various menus in the
Russian program.
Bug report related scripts
If you need to
proofread a program or web site and you want to take screenshots and write
comments about it easily, this script is for you. Simply press the shortcut and it takes a
screenshot, and allows you to write a small comment, which is stored in a
separate file along with the screenshot's name.
This script reads one
URL at a time from a text file, and then uses the program kHTM2BMP to create
bitmap screenshots of each web page.
This is useful if your client needs
you to proofread a web site that is not yet on the internet and must be
viewed at their premises on their computer.
The client simply creates the BMPs and e-mails them to you.
Peekhasa -- This is a play on the word "Picasa". This pair of scripts is for logging bugs on a
web site, by capturing text and images, and uploading it to Picasa, and then
having the image URL so that you can mention the image URL in the bug report.
PDF Bulk Commenter -- Imagine you have to write the
same comment to the same piece of text hundreds upon hundreds of times
throughout the entire PDF file. Well, I
had to do that. It would have taken me
about 10 hours if I had to do it manually and I didn't lose my mind. With the script, it took a little over 1
hour.
OOO Autocaptures -- Semi-automated screenshot
taking of various toolbars and dialogs in OpenOffice Writer, to be used by
volunteer proofreaders who are sent the screenshots to proofread.
Spellchecking scripts
Most of these scripts
were designed to help provide spell-checking in OmegaT when it didn't have it
yet. Most of the scripts rely on the
user copying the text to the clipboard, and then the spelling errors are listed
in a separate window.
* aspellspell.zip -- This is the script that started
me on AutoIt, way back in 2004. It
performs a spell-check using Aspell, by outputting a list of spelling errors in
a commandline window. There is even a
screenshot in the package.
* didierspell.zip -- This is not really a
spell-checker but a script that passes the selected text to another program
that is specified in a batch file. Dead
simple, but these were the early days of OmegaT when such dead simple scripts
meant being able to extend the capabilities of an otherwise fairly fine CAT
tool. I can't recall why I named it
after Didier (nor which Didier it was).
* kastrulspell.zip -- This spell-checker works
similarly to Aspellspell, except that you don't have to use the commandline
window, and you don't have to install Aspell (you can use any word list as your
spellchecking dictionary). It uses Kastrul. One reason why I wrote this was because I
could not compile Aspell dictionaries for more languages (or for my own
language, whose Aspell dictionary was very old and quite small).
* spellcatcherspell.zip -- Essentially the same
as Kastrulspell, but it works with the 30-day trial version of Spellcatcher.
* tmxspeller.zip -- This is not an AutoIt script, but a
description and a macro in MS Word, that can be used to mark a TMX file in such
a way that a spell-checker will spell-check only the target text. It may be interesting for people who don't
know much about styles and about MS Word macros. The idea with this speller was that you would
do the translation in OmegaT, and then spell-check the resulting project TMX file
afterwards.
* unxutilspell.zip -- This is the same method as with
Kastrulspell, except that some people wanted an opensource solution, and that
meant using something like UnxUtils. The spelling mistakes pop up in a tooltip.
Insane CAT hopping scripts (and other users)
CAT hopping is the
habit of doing a translation in one CAT tool while making use of another CAT tool. Or, it is the habit of using your own
preferred CAT tool even though you're supposed to use the client's retarded
choice of a CAT tool. There are many
ways to hop between CAT tools -- sometimes all one needs to do is convert an
export file temporarily, but sometimes it requires extracting the text and
inserting it in fancy ways.
I originally made
scripts that jumped between two CAT tools on a segment by segment basis. However, this is very slow, and it means that
whole-file quality checking can't be done in the tool you're translating in.
For me, the ideal
method is to somehow convert the other tool's format to something that I can
translate directly. For example, I run a
macro on TTX files in MS Word to protect non-translatable text, and then I simply
translate the translatable text. Or, I
add brackets to all translatable segments in Idiom so that I can open the
associated XLF file in Wordfast and then simply ignore anything that is not in
brackets.
If the other CAT
tool's file format is unreachable, another method is to extract the text from
that tool, translate it, and then paste the text back into that other tool, one
segment at a time. Of course, a paster
script is used for the pasting, otherwise it would take hours to paste it.
Star Transit PE:
* transithack.zip -- This script allows hopping
between Star Transit PE and Wordfast Classic.
It is one of my first hopper scripts.
It actually attempts to hop between the programs in real time, by copying
one segment at a time from Transit and pasting it into Wordfast, to be
translated, and then to be copied and pasted back into Transit again before the
next segment is done. It is a cumbersome
method of CAT hopping.
* tragic2pe v2.zip -- This is my second hopper
script for Star Transit PE. It is a
paster script, which means it doesn't hop in real time between two programs but
instead reads lines of text from an input file and pastes them one line at a
time into Transit, using a keyboard shortcut.
My later scripts using this method often include features that allow the
user to walk away and let the script do the entire paste operation
automatically, without errors, or do work in tools that randomise the sequence
of segments between sessions.
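The bookkeeping part of a paster script (which line comes next) is simple enough to sketch. The following is a rough Python sketch, not the AutoIt code from the zip files: the class and method names are invented, skipping empty lines is my assumption here, and the actual pasting into the CAT tool (clipboard plus keyboard shortcut) is left out entirely.

```python
class LinePaster:
    """Keeps track of which line of the translation gets pasted next."""

    def __init__(self, lines):
        # Assumption: one segment per line; empty lines are skipped.
        self.lines = [line.rstrip("\r\n") for line in lines if line.strip()]
        self.position = 0

    def peek(self):
        """Return the line that would be pasted next, or None when done."""
        if self.position < len(self.lines):
            return self.lines[self.position]
        return None

    def next_segment(self):
        """Return the next line and move on (call this from the hotkey)."""
        line = self.peek()
        if line is not None:
            self.position += 1
        return line

    def step_back(self):
        """Go back one line, e.g. after pasting into the wrong segment."""
        self.position = max(0, self.position - 1)


if __name__ == "__main__":
    paster = LinePaster(["Een\n", "\n", "Twee\n"])
    while paster.peek() is not None:
        print("Would paste:", paster.next_segment())
```

In a real paster, `next_segment()` would be called each time the keyboard shortcut is pressed, and `peek()` would feed whatever preview the script shows.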
Idiom WorldServer and Desktop Workbench:
* idiom scripts.zip -- A collection of Idiom
scripts, from version 2 to version 6, just in case you're interested in seeing
the progression.
* idiothack.zip -- This script is for translating Idiom
files in Wordfast using the same method as with TransitHack. In other words, it switches between the two
programs on a per-segment basis.
* idiot2yootype.zip -- One of my first paster
scripts, very primitive.
* idiot2big.zip -- A slightly more sophisticated script
for Idiom, with a script for the server edition too.
Trados TagEditor and Studio:
* tageditor toolsets.zip -- Here is a small
"toolset" of actions that one can perform in Trados TagEditor. Some of the actions are half baked. Basically the script automates clicking the
various menu items in TagEditor.
* sdlxliff_colourbracketer.zip -- This
script processes an SDLXLIFF file and puts brackets around text that has a
certain colour. It breaks the SDLXLIFF
file's format, but what I would do is open that file in MS Word after that, and
remove the brackets later anyway. This
is nice because Trados 2009 is a cool tool for dealing with e.g. Excel files
that have text in specific colours.
* Set all SDLXLIFF to
translated.zip -- Sets the status of segments in multiple SDLXLIFF files to
"Translated". Tested only
once... can't guarantee nothing.
Other CAT tools:
* xtm paster.zip -- These are the beginnings of
scripts for XTM. I have used them on a
few jobs, but only on a few.
* xtm paster v2.zip -- Version 2 solves the
problem of the latest version of XTM in which placeables are *images*.
* chross scripts.zip -- Here are some scripts
that I used in a ChrossTranslate job, but I can't remember much about it,
sorry.
Nexting scripts:
The following scripts
all simply press "next" in various CAT tools. Sometimes this is useful if you want to
create a dummy translated file in that tool and the only way to do it is to
manually press "next" all the time.
* tageditornext.zip (Trados TagEditor)
* swordnext.zip (SwordFish)
* txml_ALT_down.zip (Wordfast Pro)
Other scripts and stuff:
* anypaste.zip -- If you did your translation in your
favourite program, but you need to paste every segment into another program
(e.g. the client's retarded in-house tool), then this script is for you. Simply save your translation as a plaintext
file with one segment per line, and this script will paste the next line at the
cursor position. You can see which line
will be pasted because it shows in a traytip, and if you make a mistake, you
can select the previous or next segment in the text file easily.
* tranzvoo.zip -- This is actually an advanced version of
my paster scripts. It is almost a mini
CAT tool, that offers only exact matching.
Use this paster if you can't guarantee that the segments will be in the
same order in the target tool as they were when you extracted the
segments. One caveat: it can't deal with
non-unique segments.
* tw4winstyles.doc -- This is an MS Word file with
all of Wordfast Classic's tw4win styles in it.
The same styles are used in Trados uncleaned RTF files. This is a useful file to have if you want to
create an uncleaned RTF manually, e.g. if you're CAT hopping.
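The exact-matching idea behind tranzvoo, including its caveat about non-unique segments, can be sketched as follows. This is a minimal Python sketch and not the tranzvoo code itself: the function name and sample segments are invented, and setting repeated source segments aside for manual handling is my own assumption about one way to cope with them.

```python
from collections import Counter


def build_exact_lookup(pairs):
    """Map each unique source segment to its translation.

    pairs is a list of (source, target) tuples. Source segments that
    occur more than once are left out of the lookup and reported
    separately, because an exact match cannot tell them apart.
    """
    counts = Counter(source for source, _ in pairs)
    lookup = {source: target for source, target in pairs if counts[source] == 1}
    ambiguous = sorted(source for source, n in counts.items() if n > 1)
    return lookup, ambiguous


if __name__ == "__main__":
    # Hypothetical extracted segments with their translations.
    pairs = [("Open", "Maak oop"), ("Close", "Maak toe"), ("Open", "Oop")]
    lookup, ambiguous = build_exact_lookup(pairs)
    # The target tool may present segments in any order.
    for source in ["Close", "Open"]:
        print(source, "->", lookup.get(source, "(paste by hand)"))
```

Because the lookup is keyed on the source text, the order in which the target tool presents the segments no longer matters.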
Google Translate Toolkit:
I use a number of
scripts to process text from Google Translate Toolkit (GTT) so that I can do
the translation in Wordfast and also perform various quality checking
procedures that can't be done in GTT.
Some of these scripts are difficult and complex to use, but I took four
of them and simplified them for translators who want maximum benefit for
minimum effort.
The "easy gtt
scripts" consist of four scripts that produce two reports that can be
opened in a browser or in MS Word, namely a glossary look-up on the source
text, and a blacklist lookup on the target text. The target text report can also be useful for
spell-checking because it marks the text in the right language in MS Word
already. In theory the blacklist
supports regular expressions, but I haven't tested it extensively. There is a long MS Word file that explains
everything in much detail.
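A blacklist lookup with regular expressions boils down to something like the following. This is a small Python sketch and not the code of the easy GTT scripts: the function name, the sample segments and the patterns are invented, and ignoring case is an assumption on my part.

```python
import re


def blacklist_hits(segments, patterns):
    """Return (segment number, pattern, matched text) for every hit."""
    # Assumption: blacklist entries are matched case-insensitively.
    compiled = [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    hits = []
    for number, segment in enumerate(segments, start=1):
        for regex in compiled:
            for match in regex.finditer(segment):
                hits.append((number, regex.pattern, match.group(0)))
    return hits


if __name__ == "__main__":
    # Hypothetical target text and blacklist.
    segments = ["Klik op die knoppie", "Druk die button"]
    patterns = [r"\bbutton\b", r"\bklik\s+op\b"]
    for number, pattern, text in blacklist_hits(segments, patterns):
        print(f"Segment {number}: '{text}' matches {pattern}")
```

Each hit records the segment number, so the report can point the translator at the exact place to fix.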
The
"non-easy" scripts are the GTT scripts that I usually use
myself. The documentation is patchy
because the scripts are really written for myself, and because I continuously
improved them as I went along. This
represents the latest versions of the scripts.
* Easy GTT scripts for
translators
* Non-Easy GTT scripts
for translators
ProZ.com related scripts
If you're like me and
you want to multiquote in the ProZ.com forums, this script helps create a text
file with everyone's comments on it in a single place. You can then easily edit the text file and
not have to copy/paste everyone's comments individually, and then paste your
reply in one go.
This pair of scripts
helps users download certain types of ProZ.com glossaries from the ProZ.com web
site. It only works for a certain type
of glossary, and I've forgotten which (but it worked because I had some
positive comments).
In the ProZ.com
forums a discussion sometimes stretches over several pages, and it can be
useful to download the entire thread in a single HTML file. That is what this pair of scripts is
for. I wrote it during a very heated
discussion that ran to more than 100 pages. On the internet, you absolutely have to tell
people if they're wrong, right? :-)
If you want to use
the Blue Board to promote your translation services, it can be useful to have a
single search in a single file. This
would allow you to sort the entries on more than the sort options available on
ProZ.com. This pair of scripts creates a
single HTML file that you can open in MS Word to sort by the various columns. Note that it may be much simpler to use
ProZ.com's company directory.
Older one-purpose scripts
WikTerma
-- The sole purpose of this script (in retrospect) is to illustrate how a great
idea may not be so great when you actually do it. I tried to implement in AutoIt what someone
else had done in another programming language, namely to create a glossary based
on the interlanguage links of Wikipedia.
The theory is that you can use the article titles of Wikipedia pages in
various languages to create a glossary, since these articles often have the
same name in both languages, and there are links between these articles in the
different languages.
Verb Getter -- There is a parts-of-speech word
list of 300 000 words among Kevin's
word lists. I wanted to extract only
the words that were labelled with a certain part-of-speech. This script extracts not only verbs but any
of the parts-of-speech in that file. It
would appear that the script was made for an older version of Kevin's POS file,
because I had to update the script to make it work again. The script examines about 900 lines per
second on my computer. I'm not sure if
it actually works for all parts of speech.
Hey, this reminds me, I still have to write something that will convert
a Hunspell dictionary into a human readable one.
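The filtering that Verb Getter does can be sketched in a few lines. This is a Python sketch, not the AutoIt script: I am assuming a file layout of one entry per line, with the word, a tab, and then one or more single-letter part-of-speech codes, and the codes in the sample are invented.

```python
def words_with_pos(lines, pos_code):
    """Yield every word whose part-of-speech codes include pos_code.

    Assumed line format: word<TAB>codes, e.g. "run\tVN".
    Lines without a tab are skipped.
    """
    for line in lines:
        word, tab, codes = line.rstrip("\r\n").partition("\t")
        if tab and pos_code in codes:
            yield word


if __name__ == "__main__":
    # Hypothetical sample entries; real codes depend on the word list.
    sample = ["run\tVN\n", "blue\tA\n", "malformed line\n", "walk\tV\n"]
    print(list(words_with_pos(sample, "V")))
```

Because the function works line by line, it can be handed an open file object directly and never needs the whole list in memory.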
TuxPaintgetlist -- TuxPaint is a children's
program that can come with voice files for different languages, recorded
by volunteers. This script takes the
TuxPaint source code and produces a file with both file names and actual text
to be spoken, so that volunteers can record and save the words quicker.
TextBrucie -- This is a word-for-word machine translation
system. You feed it a bilingual glossary
and it will "translate" a text using that glossary. I just wanted to see how bad such a system
would be, and now I know. I suspect the
Brucie in the name refers to a character that frequented alt.html.critique many
years ago, but for the life of me I can't remember why I decided to use his
name in this script -- it makes no sense.
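For anyone who wants to see how bad word-for-word translation is without downloading anything, here is a minimal Python sketch of the idea. It is not the TextBrucie code: the function name and the tiny glossary are invented, and lowercasing each word before the lookup is my assumption.

```python
import re


def word_for_word(text, glossary):
    """Replace every word found in the glossary; leave the rest alone."""

    def swap(match):
        word = match.group(0)
        # Assumption: glossary keys are lowercase, so capitals are lost.
        return glossary.get(word.lower(), word)

    return re.sub(r"\w+", swap, text)


if __name__ == "__main__":
    # Hypothetical English-Afrikaans glossary.
    glossary = {"the": "die", "cat": "kat", "sleeps": "slaap"}
    print(word_for_word("The cat sleeps here.", glossary))
```

Word order, inflection and capitalisation are all ignored, which is exactly why the output is so poor.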
SpeechLEN -- If you have a list of phrases and
you wonder how long it takes to say each phrase, this script will tell
you. It pastes lines of text into a
simple text-to-speech program and takes note of how long it takes to say each
phrase. I must have needed to know this,
at some time or another.
Sifilist
-- I had to invite a hundred or a thousand people to a mailing list that used
MailMan software. The MailMan software
we used did not allow us to invite more than ONE member at a time. To use this script, have a text file with
everyone's e-mail addresses in it, and let the script run and invite everyone,
one by one by one.
RemoveStuff -- This script reads a list of files
from a text file and attempts to remove all double spaces, tabs and double line
breaks from those files. I think I used
it when I wanted to align some files and I needed the files to be cleared of
those things.
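The clean-up itself is only a handful of replacements. The following is a Python sketch of the same idea, not the RemoveStuff script: the function name is invented, and turning each tab into a space (rather than deleting it) is my assumption.

```python
import re


def remove_stuff(text):
    """Collapse tabs, repeated spaces and repeated line breaks."""
    text = text.replace("\r\n", "\n")      # normalise Windows line endings
    text = text.replace("\t", " ")         # assumption: a tab becomes a space
    text = re.sub(r" {2,}", " ", text)     # double (or more) spaces -> one
    text = re.sub(r"\n{2,}", "\n", text)   # double (or more) breaks -> one
    return text


if __name__ == "__main__":
    print(repr(remove_stuff("a\t\tb  c\r\n\r\n\r\nd")))
```

The tabs are handled before the spaces so that a run of tabs also collapses into a single space.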
OpenProjjer -- When I administrated the translators
of OpenProj, we had to convert the OpenProj .properties file to PO, but before
we did that, we had to sanitise it from strings that should not have been
translated. This script did that.
MycroseftGlass -- At one time, I had to perform glossary
lookups in a Microsoft Glossary web site, and it could only be done slowly and
one word at a time, so I wrote this script to look up the terms for me and save
the results in text files on my own computer.
This was at a time when the internet was slow, and looking up words in an
online glossary was frustrating.
iTermPaste -- I had a job once that required me to
paste terminology (terms plus definitions plus parts of speech etc) into an
online translation system called iTerm.
I think it was a Novell or Microsoft thing. Doing all of this manually would have
resulted in wrist cramp from hell, so I wrote a script that would click in
exactly the right locations on the web site and paste the precise things from
lists of text files. I saved hours and
hours of work.
Bommer Reloaded (for Windows users of Linux
files) -- This script is called "reloaded" because it is an improved
version of an older script. I tried all
sorts of ways to work with UTF8 files that did or did not have byte order
marks, and I could (at the time I wrote this script) not figure out how to do
it exclusively inside AutoIt. Also,
AutoIt's support for BOM-less UTF8 only came later. I needed this script because the Translate
Toolkit can't read BOM'ed UTF8 files, and most of my tools think BOM-less UTF8
files are ANSI, ASCII or ISO-whatsisname.
Adding BOM with
AutoIt (for Windows users of Linux files) -- Three ways to add a BOM to
files using only AutoIt, with examples.
This is an old file, and I suspect that I would do it differently
today if I had to do it.
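For comparison, adding or removing a UTF-8 byte order mark is a matter of three bytes at the start of the file. The following is a minimal Python sketch, not one of the AutoIt methods from the file above: the function names are invented, and the functions work on raw bytes, so reading and writing the files is left to the caller.

```python
import codecs


def add_bom(data: bytes) -> bytes:
    """Prefix the UTF-8 BOM (EF BB BF) unless it is already there."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


def strip_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 BOM if there is one."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


if __name__ == "__main__":
    raw = "wêreld".encode("utf-8")
    print(add_bom(raw)[:3])            # the three BOM bytes
    print(strip_bom(add_bom(raw)) == raw)
```

Both functions can be applied repeatedly without harm, which matters when you cannot remember whether a file has already been converted.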
ASCII2UTF8 (doesn't really work) -- This script is
for converting a folder with ASCII files to BOM-less UTF8 files.
Verbinators -- scripts for recording short phrases
via online text-to-speech services, for practising stuff like verb
conjugations. The scripts help create
WAV files that you can listen to on an MP3 player (e.g. on the bus or while
walking to town), because let's face it: the parrot method works best!